# HOW CAN SECRETOMICS HELP UNRAVEL THE SECRETS OF PLANT-MICROBE INTERACTIONS?

EDITED BY: Delphine Vincent, Kim Marilyn Plummer, Peter Solomon, Marc-Henri Lebrun, Dominique Job and Maryam Rafiqi PUBLISHED IN: Frontiers in Plant Science


ISSN 1664-8714 ISBN 978-2-88945-087-9 DOI 10.3389/978-2-88945-087-9


## **HOW CAN SECRETOMICS HELP UNRAVEL THE SECRETS OF PLANT-MICROBE INTERACTIONS?**

Topic Editors:

**Delphine Vincent,** Department of Economic Development, Jobs, Transport and Resources, Australia

**Kim Marilyn Plummer,** La Trobe University, Australia

**Peter Solomon,** The Australian National University, Australia

**Marc-Henri Lebrun,** UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, France

**Dominique Job,** Centre National de la Recherche Scientifique, UMR5240 CNRS/University Claude Bernard Lyon 1/INSA/Bayer CropScience Joint Laboratory, Bayer CropScience, France

**Maryam Rafiqi,** Jodrell Laboratory, Royal Botanic Gardens Kew, UK

Schematic representation of putative crosstalk via EVs at the plant-fungal interface (Samuel et al., 2016, Front. Plant Sci. 6:766, doi: 10.3389/fpls.2015.00766).

Cover page:

eYFP-transformed isolate Vi1 of the fungal pathogen Venturia inaequalis growing on apple hypocotyl, viewed with bright-field and confocal fluorescence microscopy 14 days post inoculation (adapted from Shiller et al., 2016, Front. Plant Sci. 6:980, doi: 10.3389/fpls.2015.00980).

Secretomics describes the global study of proteins that are secreted by a cell, a tissue or an organism, and has recently emerged as a rapidly growing field. The term secretome was first coined at the turn of the millennium and was defined to comprise not only the native secreted proteins released into the extracellular space but also the components of the protein secretion machinery. Two secretory pathways have been described in fungi: i) the canonical pathway, through which proteins bearing an N-terminal signal peptide traverse the endoplasmic reticulum and Golgi apparatus, and ii) the unconventional pathway, for proteins lacking a signal peptide. Protein secretion systems are more diverse in bacteria, in which type I to VII secretion pathways as well as the Sec and twin-arginine translocation (Tat) pathways have been described. In oomycete species, effectors are mostly small proteins containing an N-terminal signal peptide for secretion and additional motifs downstream of it, such as RXLR and CRN motifs, for host targeting. It has recently been shown that oomycetes also exploit non-conventional secretion mechanisms to transfer certain proteins to the extracellular environment. Other non-classical secretion systems involved in plant-fungal interactions include extracellular vesicles (EVs; Figure 1 from Samuel et al., 2016, Front. Plant Sci. 6:766).

The versatility of oomycetes, fungi and bacteria allows them to associate with plants in many ways, depending on whether they are biotrophs, hemibiotrophs, necrotrophs, or saprotrophs. When interacting with a live organism, a microbe will invade its plant host and manipulate its metabolism, either detrimentally if it is a pathogen or beneficially if it is a symbiont. Deciphering secretomes became a crucial biological question when an increasing body of evidence indicated that secreted proteins were the main effectors initiating interactions, whether of pathogenic or symbiotic nature, between microbes and their plant hosts. Secretomics may contribute to global food security and ecosystem sustainability by addressing issues in i) plant biosecurity, with the design of crops resistant to pathogens, ii) crop yield enhancement, for example driven by arbuscular mycorrhizal fungi that help plant hosts utilise phosphate from the soil and hence increase biomass, and iii) renewable energy, through the identification of microbial enzymes able to augment the bio-conversion of plant lignocellulosic materials for the production of second-generation biofuels that do not compete with food production.

To this day, more than a hundred secretomics studies have been published on all taxa and the number of publications is increasing steadily. Secretory pathways have been described in various species of microbes and/or their plant hosts, yet the functions of proteins secreted outside the cell remain to be fully understood. This Research Topic aims to discuss how secretomics can assist scientists in gaining knowledge about the mechanisms underpinning plant-microbe interactions.

**Citation:** Vincent, D., Plummer, K. M., Solomon, P., Lebrun, M-H., Job, D., Rafiqi, M., eds. (2017). How Can Secretomics Help Unravel the Secrets of Plant-Microbe Interactions? Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-087-9

# Table of Contents

*Editorial: How Can Secretomics Help Unravel the Secrets of Plant-Microbe Interactions?*

Delphine Vincent, Kim M. Plummer, Peter S. Solomon, Marc-Henri Lebrun, Dominique Job and Maryam Rafiqi

### **Chapter 1: IN VITRO SECRETOMICS**

*Unraveling the* **in Vitro** *Secretome of the Phytopathogen* **Botrytis cinerea** *to Understand the Interaction with Its Hosts*

Raquel González-Fernández, José Valero-Galván, Francisco J. Gómez-Gálvez and Jesús V. Jorrín-Novo

*Beyond Plant Defense: Insights on the Potential of Salicylic and Methylsalicylic Acid to Contain Growth of the Phytopathogen* **Botrytis cinerea**

Cindy Dieryckx, Vanessa Gaudin, Jean-William Dupuy, Marc Bonneu, Vincent Girard and Dominique Job

*Temperature Modulates the Secretome of the Phytopathogenic Fungus* **Lasiodiplodia theobromae**

Carina Félix, Ana S. Duarte, Rui Vitorino, Ana C. L. Guerreiro, Pedro Domingues, António C. M. Correia, Artur Alves and Ana C. Esteves

### **Chapter 2: PROTEASES**

*The Battle in the Apoplast: Further Insights into the Roles of Proteases and Their Inhibitors in Plant–Pathogen Interactions*

Mansoor Karimi Jashni, Rahim Mehrabi, Jérôme Collemare, Carl H. Mesarich and Pierre J. G. M. de Wit

*Extracellular Peptidases of the Cereal Pathogen* **Fusarium graminearum**

Rohan G. T. Lowe, Owen McCorkelle, Mark Bleackley, Christine Collins, Pierre Faou, Suresh Mathivanan and Marilyn Anderson

### **Chapter 3: BIOINFORMATICS AND PREDICTION TOOLS**

*Evaluation of Secretion Prediction Highlights Differing Approaches Needed for Oomycete and Fungal Effectors*

Jana Sperschneider, Angela H. Williams, James K. Hane, Karam B. Singh and Jennifer M. Taylor

*Kingdom-Wide Analysis of Fungal Small Secreted Proteins (SSPs) Reveals Their Potential Role in Host Association*

Ki-Tae Kim, Jongbum Jeon, Jaeyoung Choi, Kyeongchae Cheong, Hyeunjeong Song, Gobong Choi, Seogchan Kang and Yong-Hwan Lee

*Common Protein Sequence Signatures Associate with* **Sclerotinia borealis** *Lifestyle and Secretion in Fungal Pathogens of the* **Sclerotiniaceae**

Thomas Badet, Rémi Peyraud and Sylvain Raffaele

### **Chapter 4: EXOSOMES**

*Extracellular Vesicles Including Exosomes in Cross Kingdom Regulation: A Viewpoint from Plant-Fungal Interactions*

Monisha Samuel, Mark Bleackley, Marilyn Anderson and Suresh Mathivanan

### **Chapter 5: EFFECTORS**

*Candidate Effector Proteins of the Necrotrophic Apple Canker Pathogen* **Valsa mali** *Can Suppress BAX-Induced PCD*

Zhengpeng Li, Zhiyuan Yin, Yanyun Fan, Ming Xu, Zhensheng Kang and Lili Huang


Cécile Lorrain, Arnaud Hecker and Sébastien Duplessis

*Functional Redundancy of Necrotrophic Effectors – Consequences for Exploitation for Breeding*

Kar-Chun Tan, Huyen T. T. Phan, Kasia Rybak, Evan John, Yit H. Chooi, Peter S. Solomon and Richard P. Oliver

### **Chapter 6: TOWARD BREEDING RESISTANCE TO PATHOGENS IN CROPS**

*Surveying the Potential of Secreted Antimicrobial Peptides to Enhance Plant Disease Resistance*

Susan Breen, Peter S. Solomon, Frank Bedon and Delphine Vincent

*A Large Family of* **AvrLm6***-like Genes in the Apple and Pear Scab Pathogens,* **Venturia inaequalis** *and* **Venturia pirina**

Jason Shiller, Angela P. Van de Wouw, Adam P. Taranto, Joanna K. Bowen, David Dubois, Andrew Robinson, Cecilia H. Deng and Kim M. Plummer

# Editorial: How Can Secretomics Help Unravel the Secrets of Plant-Microbe Interactions?

Delphine Vincent<sup>1</sup>\*, Kim M. Plummer<sup>2</sup>, Peter S. Solomon<sup>3</sup>, Marc-Henri Lebrun<sup>4</sup>, Dominique Job<sup>5</sup> and Maryam Rafiqi<sup>6</sup>

<sup>1</sup> Department of Economic Development, Jobs, Transport and Resources, AgriBio, La Trobe University, Bundoora, VIC, Australia, <sup>2</sup> Animal, Plant and Soil Sciences Department, AgriBio, La Trobe University, Bundoora, VIC, Australia, <sup>3</sup> Plant Sciences Division, Research School of Biology, The Australian National University, Canberra, ACT, Australia, <sup>4</sup> Institut National de la Recherche Agronomique-AgroParisTech, UMR INRA1290, Biologie et Gestion des Risques en Agriculture - Champignons Pathogènes des Plantes, Thiverval-Grignon, France, <sup>5</sup> Centre National de la Recherche Scientifique, UMR5240 Centre National de la Recherche Scientifique/University Claude Bernard Lyon 1/INSA/Bayer CropScience Joint Laboratory, Bayer CropScience, Lyon, France, <sup>6</sup> Jodrell Laboratory, Royal Botanic Gardens, Kew, London, UK

Keywords: secretome, secretomics, pathogenic fungi, extracellular proteins, virulence factors, protein effectors, host-fungi interactions, diseases

#### Edited by:

Choong-Min Ryu, Korea Research Institute of Bioscience and Biotechnology, South Korea

#### Reviewed by:

Steven Whitham, Iowa State University, USA Remco Stam, Technische Universität München, Germany

> \*Correspondence: Delphine Vincent comtasr@yahoo.fr

#### Specialty section:

This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science

Received: 14 October 2016 Accepted: 11 November 2016 Published: 30 November 2016

#### Citation:

Vincent D, Plummer KM, Solomon PS, Lebrun M-H, Job D and Rafiqi M (2016) Editorial: How Can Secretomics Help Unravel the Secrets of Plant-Microbe Interactions? Front. Plant Sci. 7:1777. doi: 10.3389/fpls.2016.01777

**Editorial on Research Topic**

### **How Can Secretomics Help Unravel the Secrets of Plant-Microbe Interactions?**

Secretomics describes the global study of proteins that are secreted by a cell, a tissue or an organism, and has recently emerged as a field for which interest is rapidly growing. The versatility of oomycetes, fungi, and bacteria allows them to associate with plants in many ways depending on whether they grow as a biotroph, hemibiotroph, necrotroph, or saprotroph. When interacting with a live organism, a microbe will invade its plant host and manipulate its metabolism either detrimentally if it is a pathogen or beneficially if it is a symbiont. Deciphering secretomes became a crucial biological question when an increasing body of evidence indicated that secreted proteins were the main effectors initiating interactions, whether of pathogenic or symbiotic nature, between microbes and their plant hosts.

This Research Topic aims at discussing how secretomics can assist scientists in gaining knowledge about the mechanisms underpinning plant-microbe interactions. The following aspects are discussed.

### IN VITRO SECRETOMICS

González-Fernández et al. gave an overview of the contribution of proteomics to the study of the extracellular secreted proteins of the fungal phytopathogen Botrytis cinerea. They hypothesized on the putative functions of these secreted proteins and their connection to the biology of the B. cinerea interaction with its hosts.

Dieryckx et al. analyzed the inhibitory effect of salicylic acid (SA), a major plant hormone, on in vitro growth of B. cinerea. Comparative proteomics of intracellular and secreted proteomes revealed several mechanisms that could potentially account for the observed growth inhibition, notably pH regulation, metal homeostasis, mitochondrial respiration, ROS accumulation and cell wall remodeling.

Félix et al. explored the interesting case of a plant fungal pathogen, Lasiodiplodia theobromae, which evolves into a human pathogen strain when exposed to human body temperature. They show that both strains are cytotoxic to mammalian cells but while the environmental strain CAA019 is cytotoxic mainly at 25°C, the clinical strain CBS339.90 is cytotoxic mainly at 30 and 37°C. They demonstrate that temperature modulates the secretome of L. theobromae, which may be associated with host-specificity requirements.

### PROTEASES

Karimi Jashni et al. reviewed the recent advances on proteases and protease inhibitors (PIs) involved in fungal virulence and plant defense. They show that proteases and PIs from plants and their fungal pathogens play an important role in the arms race between plants and pathogens, which has resulted in coevolutionary diversification and adaptation shaping pathogen lifestyles.

Lowe et al. used a systems biology approach comprising genome analysis, transcriptomics, and label-free quantitative proteomics to characterize peptidases deployed by the cereal pathogen Fusarium graminearum (Fgr) during growth. A high resolution mass spectrometry-based proteomics analysis defined the extracellular proteases secreted by Fgr. A meta-classification based on sequence features and transcriptional/translational activity in planta and in vitro provides a platform to develop control strategies that target Fgr peptidases.

### BIOINFORMATICS AND PREDICTION TOOLS

Sperschneider et al. assessed several secretion prediction tools on experimentally validated fungal and oomycete effectors. For a set of fungal SwissProt protein sequences, SignalP 4 and the neural network predictors of SignalP 3 (D-score) and SignalP 2 perform best. Yet, assessment of subcellular localization predictors indicates that effectors targeted to the host cytoplasm are often predicted as being not extracellular. This limits the reliability of secretion predictions that depend on these tools.

Kim et al. identified small secreted proteins (SSPs) in 136 fungal species from data archived in the Fungal Secretome Database via a refined secretome workflow. They observed that species that are intimately associated with host cells, such as biotrophs and symbionts, usually have a higher proportion of species-specific SSPs (SSSPs) than hemibiotrophs and necrotrophs, whereas the latter groups displayed higher proportions of secreted enzymes.

Badet et al. analyzed amino acid usage and intrinsic protein disorder in alignments of groups of orthologous proteins from three Sclerotiniaceae species. Enrichment in Thr, depletion in Glu and Lys, and low disorder frequency in hot loops are significantly associated with proteins from S. borealis. The results also highlight a novel putative antifreeze protein and a novel putative lytic polysaccharide monooxygenase.

### EXOSOMES

Samuel et al. discussed current knowledge on extracellular vesicles (EVs) in the context of human-fungal interactions and their potential roles in plant-fungal interactions. They propose that the molecular cargo present in EVs is specific to the type of insult or infection.

### EFFECTORS

Canker caused by the Ascomycete Valsa mali is the most destructive disease of apple in Eastern Asia. Li et al. identified and characterized the V. mali repertoire of candidate effector proteins (CEPs). Based on transient over-expression in Nicotiana benthamiana performed for 70 randomly selected CEPs, seven of them were shown to significantly suppress BAX-induced programmed cell death. Furthermore, targeted deletion of the genes encoding these proteins resulted in a significant reduction of virulence.

Mesarich et al. reviewed the diverse roles of several effectors of plant-associated organisms corresponding to repeat-containing proteins (RCPs) that carry tandem or non-tandem arrays of an amino acid sequence or structural motif. This analysis draws attention to the potential role of these repeat domains in adaptive evolution with regard to RCP effector function and the evasion of effector-triggered immunity.

Lorrain et al. reviewed the current status of the poplar rust fungus secretome and prediction of candidate effectors from this species. They stress that effector mining in the poplar rust fungus relies both on the quality of input data (i.e., gene annotation and gene family analyses) and on several qualitative and subjective criteria.

Tan et al. characterized three protein effectors, namely SnToxA, SnTox1, and SnTox3, which are involved in Septoria nodorum blotch (Parastagonospora nodorum). From deletion analyses they conclude that the secreted necrotrophic effectors explain a very large part of the disease response of wheat germplasm and that this method of resistance breeding promises to further reduce the impact of this devastating disease.

### TOWARD BREEDING RESISTANCE TO PATHOGENS IN CROPS

Breen et al. reviewed the present knowledge on antimicrobial peptides (AMPs) that are involved in the innate immune system against pathogen attacks. They highlight that such AMPs could offer a solution to combat microbial disease in crops by exploring not only the plant-derived AMPs, but also non-plant AMPs produced by bacteria, fungi, oomycetes, or animals. The greatest challenge remains the functional validation of candidate AMPs in planta through transgenic experiments, particularly introducing AMPs into crops.

AvrLm6-like genes are present as large families (>15 members) in all sequenced strains of Venturia inaequalis (apple scab pathogen) and V. pirina (European pear scab pathogen) (Shiller et al.). These genes are located in gene-poor regions of the genomes, and mostly in close proximity to transposable elements, which may explain the expansion of these gene families. An AvrLm6 homolog from V. inaequalis that is up-regulated during infection was shown to be localized to the sub-cuticular stroma during biotrophic infection of apple hypocotyls.

### CONCLUSION

Publications in this Research Topic highlight the range of activities currently being undertaken by microbial secretomics researchers and reflect the current state of the field. This topic is still in its infancy and much remains to be accomplished at the experimental level (e.g., preparation of the secretomes, methodologies of secretomics, mechanisms of protein secretion, functional validation of the secreted proteins). The findings reported here are however very encouraging as they emphasize the major roles of microbial secretomes enabling interactions with plant hosts. It is anticipated that this knowledge will be useful for characterizing genes encoding secreted proteins as novel targets for crop breeding.

### AUTHOR CONTRIBUTIONS

DJ wrote the draft. DV, KP, PS, ML, and MR edited and contributed to the initial draft. DV compiled, formatted, and submitted the final version.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Vincent, Plummer, Solomon, Lebrun, Job and Rafiqi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Unraveling the *in vitro* secretome of the phytopathogen *Botrytis cinerea* to understand the interaction with its hosts

*Raquel González-Fernández<sup>1</sup>\*, José Valero-Galván<sup>1</sup>, Francisco J. Gómez-Gálvez<sup>2</sup> and Jesús V. Jorrín-Novo<sup>2</sup>*

*<sup>1</sup> Department of Chemical and Biological Science, Biomedicine Science Institute, Autonomous University of Ciudad Juárez, Ciudad Juárez, México, <sup>2</sup> Agroforestry and Plant Biochemistry and Proteomics Research Group, Department of Biochemistry and Molecular Biology, University of Córdoba, Agrifood Campus of International Excellence (ceiA3), Córdoba, Spain*

*Botrytis cinerea* is a necrotrophic fungus with high adaptability to different environments and hosts. It secretes a large number of extracellular proteins, which favor plant tissue penetration and colonization, thus contributing to virulence. Secretomics is a proteomics sub-discipline that studies the secreted proteins, the so-called secretome, and their secretion mechanisms. Using proteomics as the experimental approach, many proteins secreted by *B. cinerea* have been identified in *in vitro* experiments. They belong to different functional categories: (i) cell wall-degrading enzymes such as pectinesterases and endo-polygalacturonases; (ii) proteases involved in host protein degradation such as an aspartic protease; (iii) proteins related to the oxidative burst such as glyoxal oxidase; (iv) proteins which may induce the plant hypersensitive response such as a cerato-platanin domain-containing protein; and (v) proteins related to production and secretion of toxins such as malate dehydrogenase. In this mini-review, we give an overview of the contribution of proteomics to the study of the *B. cinerea* extracellular secreted proteins, based on our current work from *in vitro* experiments and on recently published *in vitro* and *in planta* studies on this fungus. We hypothesize on the putative functions of these secreted proteins, and their connection to the biology of the *B. cinerea* interaction with its hosts.

Keywords: *Botrytis cinerea*, secretomics, plant pathogenic fungi, fungal secretome, fungi–plant interactions

#### *Edited by:*

*Delphine Vincent, Department of Environment and Primary Industries, Australia*

#### *Reviewed by:*

*Dario Cantu, University of California, Davis, USA Dominique Job, Centre National de la Recherche Scientifique, France*

> *\*Correspondence: Raquel González-Fernández raquel.gonzalez@uacj.mx*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 20 May 2015 Accepted: 24 September 2015 Published: 09 October 2015*

#### *Citation:*

*González-Fernández R, Valero-Galván J, Gómez-Gálvez FJ and Jorrín-Novo JV (2015) Unraveling the in vitro secretome of the phytopathogen Botrytis cinerea to understand the interaction with its hosts. Front. Plant Sci. 6:839. doi: 10.3389/fpls.2015.00839*

### INTRODUCTION

Phytopathogenic fungi can invade and colonize their plant hosts to obtain nutrients because they are able to secrete a set of extracellular proteins and other metabolites (Gonzalez-Fernandez et al., 2010). The secretome has recently been defined as "the global group of secreted proteins into the extracellular space by a cell, tissue, organ, or organism at any given time and conditions through known and unknown secretory mechanisms involving constitutive and regulated secretory organelles" (Agrawal et al., 2013). In the case of phytopathogenic fungi, the secretome consists of pathogenicity and virulence factors, which favor host tissue penetration and colonization in the susceptible plant (Girard et al., 2013). Because of the need to study these secretomes, the term secretomics emerged and was defined as "the global study of proteins that are secreted by a cell, a tissue or an organism" (Vincent and Bedon, 2013). Many studies have used both classical approaches and modern -omics techniques in *in planta* experiments to unravel the mechanisms of plant–fungus interaction (Allwood et al., 2008; Bhadauria et al., 2009; Tan et al., 2009; Bhadauria et al., 2010; Quirino et al., 2010; Afroz et al., 2011; Dean et al., 2012). As a proteomics sub-discipline, secretomics has contributed significantly to the study of the secretomes of phytopathogenic fungi through *in vitro* experiments (González-Fernández and Jorrin-Novo, 2010, 2012; Girard et al., 2013; Vincent and Bedon, 2013).

*Botrytis cinerea* Pers. Fr. (teleomorph *Botryotinia fuckeliana* (de Bary) Whetzel) is a necrotrophic pathogen with a wide host range that attacks plants both before and after harvest, and it causes important economic losses in agriculture (Elad et al., 2007). The *B. cinerea* infection process includes host surface penetration, host tissue killing and primary lesion formation, lesion expansion, tissue maceration, and conidiation (van Kan, 2006). All these stages are achieved mainly through secreted proteins and other compounds: the secretion of cell wall-degrading enzymes (CWDEs), the production of non-specific phytotoxic metabolites (botrydial and botcinolides), the induction of an oxidative burst through the accumulation of reactive oxygen species (ROS), and the release of molecules that induce the plant hypersensitive response (HR; Williamson et al., 2007). In recent years, the use of complementary gel-based and gel-free proteomic approaches has provided important insights into *B. cinerea* pathogenicity and virulence from *in vitro* (Gonzalez-Fernandez and Jorrin-Novo, 2012; Gonzalez-Fernandez et al., 2013; González-Fernández et al., 2014) and *in planta* (Shah et al., 2012) experiments.

### THE *B. cinerea* SECRETOME UNRAVELED FROM *IN VITRO* PROTEOMIC STUDIES

In fungi, extracellular proteins may be secreted both by the classical pathway, via the endoplasmic reticulum (ER) and the Golgi complex, and by unconventional export routes that are not mediated by the ER (Girard et al., 2013; Vincent and Bedon, 2013). Analysis of the *B. cinerea* secretome using the Fungal Secretome Database (FSD; http://fsd.snu.ac.kr/website) showed that 16% of the gene products are predicted to be secreted by the canonical pathway, in which proteins carry an N-terminal signal peptide (Choi et al., 2010). The true proportion is probably higher, because various kinds of non-classical export pathways have been suggested to exist in *B. cinerea* (Jain et al., 2008).
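The idea of flagging proteins for the canonical pathway from their N-terminus, and then reporting the flagged share of a proteome, can be illustrated with a deliberately crude sketch. The Python code below is not the FSD workflow and not SignalP. It assumes a hypothetical rule (a positively charged residue shortly after the initiator methionine, followed by a mostly hydrophobic stretch), and every sequence, name, residue set, and threshold in it is invented for illustration.

```python
# Toy illustration only: NOT SignalP and NOT the FSD workflow.
# Residue sets, window sizes and thresholds are assumptions chosen for this sketch.
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def looks_like_signal_peptide(seq, n_region=5, window=8, scan=30, min_hydrophobic=7):
    """Return True if the N-terminus crudely resembles a signal peptide."""
    seq = seq.upper()
    if len(seq) < scan or not seq.startswith("M"):
        return False
    # A positively charged residue shortly after the initiator Met (n-region).
    has_positive = any(aa in POSITIVE for aa in seq[1:1 + n_region])
    # A mostly hydrophobic stretch within the first `scan` residues (h-region).
    head = seq[:scan]
    has_h_region = any(
        sum(aa in HYDROPHOBIC for aa in head[i:i + window]) >= min_hydrophobic
        for i in range(1, scan - window + 1)
    )
    return has_positive and has_h_region


def predicted_secreted_fraction(proteome):
    """Return (fraction flagged, sorted names flagged) for a {name: sequence} dict."""
    flagged = sorted(n for n, s in proteome.items() if looks_like_signal_peptide(s))
    return len(flagged) / len(proteome), flagged


# Hypothetical sequences, invented for this example (not real B. cinerea proteins).
toy_proteome = {
    "sp_like": "MKRLLLLALLAVLAAAPASAQDTSTPEETKKGHWYE",
    "no_h_region": "MSDEEKQGSTPEDNKGSEETQDGHNSTPEEDKQG",
    "no_positive": "MSDLLLLALLAVLAAAPASAQDTSTPEETQQGHWYE",
    "too_short": "MKRLLLLALL",
}

fraction, flagged = predicted_secreted_fraction(toy_proteome)
print(f"{fraction:.0%} flagged: {flagged}")
```

A rule of this kind only looks for the canonical N-terminal signal, so proteins exported by non-classical routes would be missed by construction, which is one reason a percentage derived this way may underestimate the secretome.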

Most studies of the *B. cinerea* secretome have been carried out through *in vitro* experiments, mainly because of two problems: (i) analysis of the fungal secretome *in planta* is complicated by the ratio of fungal to host cells, and (ii) it depends on the quality of the genome annotation for both partners (Girard et al., 2013; Vincent and Bedon, 2013). To avoid the first difficulty, *in vitro* experimental protocols try to simulate *in vivo* conditions by culturing the fungus in the presence of more or less purified fractions of its plant host (Shah et al., 2009a,b; Espino et al., 2010; Fernandez-Acero et al., 2010; González-Fernández et al., 2014). With respect to the second difficulty, it is essential that the fungal and plant genomes be sequenced in order to distinguish between fungal and plant proteins (Girard et al., 2013; Vincent and Bedon, 2013).

In recent years, great efforts have been made to explain the complexity and versatility of the *B. cinerea* secretome using secretomics in *in vitro* experiments. One of the first studies showed that changes during fruit ripening seem to play an important role in the activation of latent infection, which probably does not depend only on changes in the degree of pectin esterification of the plant cell wall (Shah et al., 2009b). On the other hand, the fungus showed significant changes in the composition and relative abundance of secreted proteins that were specific to a particular growth condition (Shah et al., 2009a,b; Espino et al., 2010; Fernandez-Acero et al., 2010; González-Fernández et al., 2014). In the presence of favorable nutrient sources, the fungus increased its protein secretion (Fernandez-Acero et al., 2010). Another key factor in establishing a successful infection is the set of proteins secreted during spore germination on the plant surface. The composition of the early secretome was not as variable as might be expected: *B. cinerea* seems to secrete a common set of proteins during germination regardless of growth condition, together with a smaller number of proteins that are specific to each growth condition (Espino et al., 2010). Among six strains compared, the aspartic protease BcAP8, an α-amylase 1, and a cerato-platanin domain-containing protein (BcPls1) were detected in all of them, whereas two different pectinesterases, one endo-polygalacturonase (PG), a glucan-1,3-β-glucosidase, a glucoamylase, a carboxypeptidase S1, and a choline dehydrogenase were detected in only some (González-Fernández et al., 2014). Moreover, some hypothetical proteins that differed among strains were identified in the secretome (González-Fernández et al., 2014). One example is the predicted protein BC1G\_08642.1, which contains a carbohydrate-recognition domain similar to those found in plant and bacterial AB-toxins, glycosidases, or proteases (González-Fernández et al., 2014).

The pH was also shown to affect the *B. cinerea* secretome. Proteins related to proteolysis, such as BcAP8, a family S53 protease, a serine-type carboxypeptidase, and a metalloprotease (Merops M35), were induced at pH 4 (similar to the mature fruit environment), whereas at pH 6 (similar to the leaf environment) most of the up-accumulated proteins were CWDEs (Li et al., 2012). These results are consistent with the host environment: ripe fruits generally have lower tissue pH and weakened cell walls, and they accumulate many pathogenesis-related proteins for defense, so inducing protease secretion is more important than CWDE secretion (Manteau et al., 2003). In contrast, in leaves and stems, which have higher tissue pH and harder cell walls, CWDEs are secreted in greater quantity than proteases (Manteau et al., 2003). Therefore, the pH of each tissue could regulate the expression of secreted proteins in *B. cinerea* to activate the machinery required for invasion. Finally, metals present in the environment, such as copper, zinc, nickel, or cadmium, also modified oxidoreductase production and CWDE secretion (Cherrad et al., 2012).

The comparison of these data suggests that *B. cinerea* secretes a common set of proteins as well as a pool of different ones, depending on the *in vitro* growth conditions and the strain used, which makes the secretome highly adaptive. Therefore, *B. cinerea* may greatly change the composition of the secreted protein set to satisfy the requirements of these different growth conditions. Next, we discuss the importance of some *B. cinerea* secreted proteins that have been identified by a proteomic approach from *in vitro* experiments, and we hypothesize about their putative function in the interaction of *B. cinerea* with its host (**Table 1**). All proteins described in **Table 1**, except the pectinesterase BC1G\_06840.1, were predicted to be secreted by the classical secretion pathway (Espino et al., 2010; González-Fernández et al., 2014).

### Proteins Related to Carbohydrate Metabolism

Several enzymes could take part in the metabolism of β-1,3-glucans, which are part of the fungal cell wall. This polysaccharide is secreted by *B. cinerea* in high amounts into the medium, and several functions have been proposed for it, including extracellular energy storage and the adhesion of conidia to the plant surface during germination (Stahmann et al., 1992). A glucan β-1,3-glucosidase, a β-1,3-endoglucanase, and an exo-β-1,3-glucanase were secreted early in *B. cinerea* development (Espino et al., 2010). The same β-1,3-endoglucanase and other glucanases were detected in the *B. cinerea* interaction with tomato (Shah et al., 2012). The degradation of β-1,3-glucans may contribute to inducing programmed cell death in plant cells by generating elicitors in the form of β-(1,3)(1,6)-oligomers (Espino et al., 2010).

TABLE 1 | Secreted proteins discussed in this mini-review, which were identified by a proteomic approach from both *in vitro* and *in planta* studies, and which may be involved in the *B. cinerea*–host interaction.

<sup>a</sup>*Gene name of the different protein isoforms for the B05.10 strain (Botrytis cinerea sequencing project, Broad Institute), which were identified through in vitro experiments by a proteomic approach.*

<sup>b</sup>*Signal peptide prediction calculated by using the SignalP server (http://www.cbs.dtu.dk/services/SignalP/), indicating the identification (Yes) or not (No) of signal peptide.*

### Cell Wall-degrading Enzymes

*Botrytis cinerea* is equipped with a set of CWDEs able to degrade the cell wall, which allows it to colonize plant tissue and obtain nutrients (Prins et al., 2000; Shah et al., 2009b; Gonzalez-Fernandez and Jorrin-Novo, 2012). Despite having a broad spectrum of host plant species, this fungus has a predilection for dicotyledonous plants with cell walls rich in pectin, which is usually highly methyl-esterified in order to protect it from fungal PGs and pectate lyases (Prins et al., 2000). Therefore, pectin-degrading enzymes, such as pectinesterases, PGs, and pectate lyases, have a very important role in cell wall degradation and successful fungal invasion (Kars and van Kan, 2007; Zhang and van Kan, 2013). Recently, the *B. cinerea* BcDW1 genome was sequenced, and the complete genes of a large set of candidate secreted CWDEs were found, including 19 PGs, 15 xyloglucanases, 10 cutinases, and 9 pectin/pectate lyases (Blanco-Ulate et al., 2013). These enzymes are categorized as carbohydrate-active enzymes (CAZymes), which are proteins that degrade, modify, or create glycosidic bonds, and that are also included in diverse glycoside hydrolase (GH) families (Kubicek et al., 2014). A genome-wide transcriptional profiling analysis showed that 229 potentially secreted CAZymes were expressed in three different hosts infected with *B. cinerea* (Blanco-Ulate et al., 2014). These results suggest that *B. cinerea* targets analogous wall polysaccharide matrices on leaves and fruit, and may selectively attack host wall polysaccharide substrates depending on the host tissue.

Two pectinesterases, the endo-PG BcPG1 and the pectin lyase A, were evidenced in the *B. cinerea* secretome by using a combination of 1-DE-MALDI-TOF/TOF MS/MS and label-free shotgun nUPLC–MS<sup>E</sup> techniques (González-Fernández et al., 2014), as well as in the *B. cinerea* interaction with tomato (Shah et al., 2012). It is known that endo-PGs show not only elevated genetic variation and specialization among them, but also a potential diversification, interacting directly with host defenses (Rowe and Kliebenstein, 2007). Two endo-PGs (BcPG1 and BcPG2) out of the six previously characterized have been reported to be required for full virulence (Choquer et al., 2007; Zhang et al., 2014). For example, BcPG1 was detected in only five of six wild-type strains (González-Fernández et al., 2014), and in the early secretome (Espino et al., 2010). Moreover, BcPG1 was secreted by the B05.10 strain grown in glucose, tomato, or kiwifruit, although it was almost absent in strawberry (Espino et al., 2010). This enzyme exemplifies the adaptable nature of the secretome, which depends on the strain and its host, and the diversity of the *B. cinerea* armament, which results in an over-kill strategy. Thus, the proteins required for full virulence on one host may differ from those needed to invade a different host.

### Proteases

A large number of proteases have been identified in the *B. cinerea* secretome (Shah et al., 2009a; Espino et al., 2010; Fernandez-Acero et al., 2010; González-Fernández et al., 2014). This could be explained by their roles (Espino et al., 2010): (i) generating amino acids to sustain fungal growth; (ii) contributing to cell wall softening, and, therefore, to fungal hyphal penetration; and (iii) degrading the defense proteins that plants secrete against pathogens. The high diversity of the proteases found may also reflect the diverse nature of their substrates. Plant proteins have a great variety of structures, and a diverse pool of proteases may be needed for their degradation (Espino et al., 2010). One of the most abundant proteins secreted by *B. cinerea* is BcAP8 (Shah et al., 2009a; Espino et al., 2010; Fernandez-Acero et al., 2010; González-Fernández et al., 2014). This protease was detected in the secretome of the six wild-type strains analyzed, varying in abundance depending on the strain (González-Fernández et al., 2014). Moreover, its absence in the secretome of the B05.10 strain caused a reduction of up to 90% in secreted protease activity, a rise in the intensity of high molecular weight protein bands in the 1-DE profile, and a decrease, or even the disappearance, of the smallest bands (Espino et al., 2010). However, in another study, there were no changes in virulence between the *Bcap8* knockout mutant and the wild-type strain B05.10 in tomato leaves and fruit (ten Have et al., 2010), and BcAP8 was not detected in the proteomic analysis of the *B. cinerea*–tomato interaction (Shah et al., 2012). These results indicate that BcAP8 might not be required for virulence in *B. cinerea*. The loss of BcAP8 activity in the *Bcap8* mutant may be partially compensated by the expression of other genes from the *Bcap* family, or BcAP8 activity may play other important roles in the life cycle of the pathogen (Li et al., 2012), which is consistent with the great adaptability of *B. cinerea*.

### Proteins Involved in Oxidative Burst

Another key mechanism in the *B. cinerea* invasion strategy is the active generation of an oxidative burst by the fungus during the first infection stages (van Kan, 2006; Choquer et al., 2007; Amselem et al., 2011). Several enzymes have been studied as potential ROS generators, such as a Cu–Zn superoxide dismutase (BcSOD1), whose mutation results in reduced virulence in several different hosts, and a glucose oxidase (BcGOD1; Rolke et al., 2004); however, neither of them was found by proteomics (Espino et al., 2010; González-Fernández et al., 2014). Numerous extracellular enzymes have been previously characterized as potential ROS generators in fungi (Espino et al., 2010). An example is the glyoxal oxidase, which has proved to be an important determinant of cell morphology and virulence in *B. cinerea*, *Ustilago maydis*, or *Phanerochaete chrysosporium* (Soanes et al., 2008). This enzyme may also produce H<sub>2</sub>O<sub>2</sub>, together with other enzymes such as superoxide dismutase (Soanes et al., 2008). Additionally, a quinoprotein glucose dehydrogenase and a cellobiose dehydrogenase, which may also generate ROS, were detected in the early secretome (Espino et al., 2010). The glyoxal oxidase and the cellobiose dehydrogenase were also detected in the *B. cinerea* interaction with tomato, but only in a ripening-inhibited mutant tomato (Shah et al., 2012).

### Proteins Involved in the Activation of Plant Defense Response

Pathogens produce molecules that can activate the plant defense response known as the hypersensitive response (HR) (Jones and Dangl, 2006). Unlike biotrophs, necrotrophic fungi such as *B. cinerea* take advantage of the HR, generating dead tissue around the infected area for rapid colonization of their hosts (Govrin and Levine, 2000; Williamson et al., 2007; Hématy et al., 2009). One protein that may help to induce HR in plants is BcPls1. This protein is a member of the hydrophobin-like cerato-platanin family, whose members have been reported to be secreted proteins acting as elicitors, and, in some cases, as pathogenicity factors (Gaderer et al., 2014; Baccelli, 2015). BcPls1 was found to have high levels of secretion in the B05.10 strain when it was grown *in vitro* in different media (Shah et al., 2009a,b; Espino et al., 2010), and in the *in planta B. cinerea*–tomato interaction (Shah et al., 2012), as well as when different wild-type strains were studied (González-Fernández et al., 2014), suggesting that this protein may play an important role in host–pathogen interactions. It was required for full virulence in *B. cinerea*, and induced necrosis in several hosts (Frías et al., 2011). Some HR symptoms were the induction of autofluorescence and ROS, electrolyte leakage, and cytoplasm shrinkage (Frías et al., 2011). All these observations may imply that the cerato-platanin proteins are recognized by the immune system of the plant, and that this recognition induces the programmed death of the affected cells (Frías et al., 2011).

### Proteins Associated to the Toxin Secretion

Oxalic acid has been reported as a pathogenicity factor in *B. cinerea* (Lyon et al., 2007), and in the related necrotrophic fungus *Sclerotinia sclerotiorum* (Kim et al., 2008). Its physiological roles in pathogenesis include: (i) enhancement of PG activity to promote cell wall degradation, (ii) suppression of the plant oxidative burst, (iii) inhibition of plant-protection enzymes, (iv) involvement in pH signaling, (v) deregulation of stomatal guard cell closure, (vi) induction of apoptosis-like cell death, and (vii) alteration of the cellular redox status in the plant (Amselem et al., 2011). Oxalic acid secretion creates an optimal acidic environment for the expression and secretion of pathogenicity factors in *B. cinerea*, such as CWDEs (Manteau et al., 2003) and peptidases (ten Have et al., 2010), and for phytotoxin biosynthesis (Durán-Patrón et al., 2004). Oxaloacetate is the precursor of oxalic acid. Malate dehydrogenase (MDH) catalyzes the reversible interconversion of oxaloacetate and malate. Lower gene expression levels of fungal MDH, in combination with the absence of botrydial and dihydrobotrydial secretion, were found in a less virulent *B. cinerea* strain than in a more virulent one (Fernandez-Acero et al., 2007). Fungal MDH was also found in the mycelium proteins of six *B. cinerea* wild-type strains with significant quantitative differences, being higher in the strains isolated from green material (whose pH > 6; González-Fernández et al., 2014). This suggests that fungal MDH plays a key role in the biosynthesis of oxalic acid to produce a more appropriate ecological niche for the fungal pathogenic activities.

### CONCLUSION AND FUTURE PERSPECTIVES

Understanding how the *B. cinerea* secretome affects the interactions between this fungus and its host has become a hard task because of its adaptability to different growth conditions and plant species. *B. cinerea* populations manifest a significant phenotypic variability with respect to their level of aggressiveness, the oxidative burst occurring during infection, and toxin production (Prins et al., 2000; Elad et al., 2007). Considering the results from the different studies, this fungus seems to be efficiently adapted to its different host plants in terms of host preference rather than real host specialization (Choquer et al., 2007). Hosts and their parasites are engaged in an evolutionary struggle marked by adaptation and counter-adaptation of host defense and pathogen attack mechanisms. Thus, the selective pressure exerted by plants may affect the evolution of virulence factors at the population level (Choquer et al., 2007). Certain virulence factors can be important for one strain on one particular host species, but they might be dispensable on other host species, or they might be dispensable for a different strain.

Secretomics has brought important advances in the identification of extracellular proteins that are secreted by *B. cinerea* and may be involved in the interaction with its host that results in a successful infection. So far, only 10% of the *B. cinerea* proteins predicted to be secreted through the classical export pathway (Choi et al., 2010) have been identified by proteomic approaches (Gonzalez-Fernandez and Jorrin-Novo, 2012; González-Fernández et al., 2014). Because of the high adaptability of the fungus to different hosts, more specific studies, both *in vitro* and *in planta*, need to be made for each host to discover the factors involved in the infection mechanisms. Finally, a very important aspect of secretomics is the knowledge of the secretion pathways (Girard et al., 2013; Vincent and Bedon, 2013). Therefore, another point for further study may be the secretion pathways themselves, and the cataloging of proteins according to their secretion mechanisms.

### ACKNOWLEDGMENTS

This publication has been supported by the Ministry of Education (Secretaría de Educación Pública, SEP) of the Federal Government of México, through the Teacher Professional Development Program (Programa para el Desarrollo Profesional Docente, para el Tipo Superior, PRODEP), and the Autonomous University of Ciudad Juárez (UACJ). We wish to thank the Spanish Ministry of Science and Innovation (BotBank Project, EUI2008-03686), the Regional Government of Andalusia (Junta de Andalucía), and the University of Córdoba (AGR-0164: Agricultural and Plant Biochemistry and Proteomics Research Group) for their previous support.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 González-Fernández, Valero-Galván, Gómez-Gálvez and Jorrín-Novo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Beyond plant defense: insights on the potential of salicylic and methylsalicylic acid to contain growth of the phytopathogen Botrytis cinerea

Cindy Dieryckx<sup>1†</sup>, Vanessa Gaudin<sup>1†</sup>, Jean-William Dupuy<sup>2</sup>, Marc Bonneu<sup>2</sup>, Vincent Girard<sup>1</sup>\* and Dominique Job<sup>1</sup>\*

<sup>1</sup> Laboratoire Mixte UMR 5240, Plateforme de Protéomique, Centre National de la Recherche Scientifique, Lyon, France, <sup>2</sup> Plateforme Protéome, Centre de Génomique Fonctionnelle, Université de Bordeaux, Bordeaux, France

#### Edited by:

Vincenzo Lionetti, Sapienza Università di Roma, Italy

#### Reviewed by:

Robin Katrina Cameron, McMaster University, Canada; Benedetta Mattei, Sapienza Università di Roma, Italy

#### \*Correspondence:

Vincent Girard, vincent.girard@bayer.com; Dominique Job, job.dominique@gmail.com

† These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science

Received: 28 July 2015; Accepted: 29 September 2015; Published: 16 October 2015

#### Citation:

Dieryckx C, Gaudin V, Dupuy J-W, Bonneu M, Girard V and Job D (2015) Beyond plant defense: insights on the potential of salicylic and methylsalicylic acid to contain growth of the phytopathogen Botrytis cinerea. Front. Plant Sci. 6:859. doi: 10.3389/fpls.2015.00859

Using Botrytis cinerea we confirmed in the present work several previous studies showing that salicylic acid, a main plant hormone, inhibits fungal growth in vitro. Such an inhibitory effect was also observed for the two salicylic acid derivatives, methylsalicylic and acetylsalicylic acid. In marked contrast, 5-sulfosalicylic acid was totally inactive. Comparative proteomics from treated vs. control mycelia showed that both the intracellular and extracellular proteomes were affected in the presence of salicylic acid or methylsalicylic acid. These data suggest several mechanisms that could potentially account for the observed fungal growth inhibition, notably pH regulation, metal homeostasis, mitochondrial respiration, ROS accumulation and cell wall remodeling. The present observations support a role played by the phytohormone SA and derivatives in directly containing the pathogen. Data are available via ProteomeXchange with identifier PXD002873.

Keywords: salicylic acid, Botrytis cinerea, fungal growth, proteomics, secretomics

### INTRODUCTION

Filamentous fungi are the major plant pathogens, causing pre- and post-harvest crop losses worth many millions of US dollars worldwide (Bolton et al., 2006). In particular Botrytis cinerea (Botrytis), a necrotrophic and polyphagous fungus, is able to infect over 200 plants corresponding mostly to flowering plants of temperate and subtropical regions (Mansfield, 1980; Elad, 1997; Williamson et al., 2007). The availability of molecular tools has considerably advanced our understanding of the infection strategies of this fungus (Hahn et al., 2014). Furthermore, its genome has been sequenced, revealing over 16,000 protein-coding genes (Amselem et al., 2011; Staats and van Kan, 2012; Hahn et al., 2014). Hence, Botrytis is now a widely used fungal model, being among the top 10 fungal pathogens in molecular plant pathology (Dean et al., 2012), which has made it possible to unravel genes accounting for pathogenicity (Amselem et al., 2011; Aguileta et al., 2012; Dean et al., 2012; Heard et al., 2015) and to develop fungicides with novel modes of action (Tietjen et al., 2005).

**Abbreviations:** 2D, two dimensional; ACN, acetonitrile; ASA, acetylsalicylic acid; CHAPS, 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate; CID, collision induced dissociation; CPP, cerato-platanin-related protein; GPI, glycosylphosphatidylinositol; HPLC, high-performance liquid chromatography; HR, hypersensitive response; ID, internal diameter; IPG, immobilized pH gradient; LC-MS/MS, liquid chromatography coupled to tandem mass spectrometry; MeSA, methylsalicylic acid; PCD, programmed cell death; ROS, reactive oxygen species; SA, salicylic acid; SABP, SA binding protein; SAR, systemic acquired resistance; SDS-PAGE, sodium dodecylsulfate polyacrylamide gel electrophoresis; SSA, 5-sulfosalicylic acid; TCA, trichloroacetic acid.

Infection by a phytopathogenic fungus can only occur if the pathogen possesses all the necessary molecules to override plant defenses (van Baarlen et al., 2007; Hahn et al., 2014). Indeed, during the infection process the plant has the potential to mount a very effective defense for killing/confining its aggressor. In this process, the plant hormone salicylic acid (SA) is a key signal in the induction of the plant immune response to pathogens, and is therefore of great interest in plant pathology and crop protection. This hormone is responsible for controlling critical aspects of both basal and resistance gene based immunity, and for promotion of the long lasting, broadly effective immunity termed systemic acquired resistance (SAR) (Gaffney et al., 1993; Vlot et al., 2009; An and Mou, 2011). Such SAR enables plants to prepare for another attack and defend themselves more effectively against the pathogen (Dangl and Jones, 2001; Durrant and Dong, 2004). A late response is then implemented through the production of defense proteins and phytoalexins and the strengthening of the plant cell wall (Williamson et al., 2007; Mengiste, 2012; Hahn et al., 2014). Besides this function during biotic stress, it has also been found that SA plays a role in the plant response to abiotic stresses such as drought, chilling, heavy metal toxicity, heat, and osmotic stress as well as during plant growth and development (reviewed by Rivas-San Vicente and Plasencia, 2011).

For more than 200 years, SA (2-hydroxy benzoic acid) and derivatives have been studied for their medicinal use in humans (Vane and Botting, 2003; Jones, 2011). However, the extensive signaling role of SA in plants, particularly in defense against pathogens, has only become evident during the past 20 years (Ferrari et al., 2003; Rajjou et al., 2006; van Loon et al., 2006; Vlot et al., 2009; Zipfel, 2009; Hayat et al., 2010; El Oirdi et al., 2011; Caarls et al., 2015). SA derivatives are also widely distributed in plants. Methylsalicylate (MeSA; methyl 2-hydroxybenzoate) deserves special attention, as it is a volatile long distance signaling molecule that moves from infected to the non-infected tissues through phloem (Shulaev et al., 1997; Chen et al., 2003; Hayat et al., 2010). In plants, two enzymes control the balance between SA and MeSA: the SA binding protein 2 (SABP2) that converts biologically inactive MeSA into active SA (Forouhar et al., 2005), and the SA methyltransferase 1 (SAMT1) that catalyzes the formation of MeSA from SA (Ross et al., 1999; Park et al., 2007).

Several studies provided evidence for the ability of Botrytis to suppress host defense by different mechanisms. These include the manipulation of plant hormone pathways, in particular those that are involved in defense responses (reviewed by Mengiste, 2012). Besides Botrytis, a number of plant fungi, including pathogens (e.g., Magnaporthe oryzae, Ustilago maydis), endophytes (e.g., Piriformospora indica), and mutualists (e.g., Laccaria bicolor) also have the ability to suppress host defense (reviewed by Rovenich et al., 2014). For example, the degradation of SA by Aspergillus niger was reported (Krupka et al., 1967). More recently, the biotrophic fungus Ustilago maydis was shown to contain a cytosolic SA hydroxylase (also called acetylsalicylate deacetylase, EC 3.1.1.55), which is able to convert SA into catechol during the infection (Rabe et al., 2013). Similarly, the fungal plant pathogen Sclerotinia sclerotiorum proved able to degrade SA into catechol, most probably through the action of an endogenous SA hydroxylase (Penn and Daniel, 2013). SA hydroxylase is also predicted to be a secreted protein in the plant pathogenic fungus Fusarium graminearum (Brown et al., 2012). Furthermore, unconventionally secreted isochorismatase effectors of two filamentous pathogens, Phytophthora sojae and Verticillium dahliae, were shown to disrupt the plant salicylate metabolism pathway by suppressing the production of its precursor (Liu et al., 2014). Thus, an increased degradation of this molecule or an inhibition of its biosynthesis could be effective strategies for biotrophic pathogens to suppress SA-mediated defense responses.

In addition, it has been repeatedly reported that SA (Prithiviraj et al., 1997; Amborabé et al., 2002; Cory and Cory, 2005; Meyer et al., 2006; Wu et al., 2008; Qi et al., 2012; Zhou et al., 2012; Panahirad et al., 2014), acetylsalicylic acid (ASA; 2-acetoxybenzoic acid) (Alem and Douglas, 2004; Stepanović et al., 2004; Leeuw et al., 2007, 2009; Moret et al., 2007; Sebolai et al., 2008; Trofa et al., 2009; Swart et al., 2011; Zhou et al., 2012) or MeSA (Schadler and George, 2006) can directly impede fungal growth, but the mechanisms of this direct attack process are unknown.

Therefore, the function of SA and/or derivatives in the infection process appears to be complex, encompassing at least three strategies: plant defense (e.g., signalization), degradation of SA by the fungal pathogen (e.g., via a fungal SA hydroxylase or biosynthesis inhibitors) and direct fungistatic effects (e.g., growth inhibition of the pathogen). In particular, several reports pointed out an intercellular antimicrobial role for SA during Pseudomonas infections in Arabidopsis (Cameron and Zaton, 2004; Carviel et al., 2009, 2014).

It is the aim of the present work to further document the possibility that SA can repress the growth of Botrytis. Toward this goal, we have used a physiological approach to confirm that SA and its derivatives MeSA and acetylsalicylic acid (ASA) could inhibit Botrytis growth. Then a proteomics approach was used to reveal potential proteins involved in Botrytis growth inhibition. Proteomics is a useful complement to transcriptomics since the latter does not capture the full complexity of cellular functions (Aebersold and Mann, 2003). Indeed, a focused study on proteins can determine their level and mode of expression, posttranslational modifications and the interactions they establish (Schwanhäusser et al., 2011). This approach already proved successful to characterize the proteome of mycelium tissue and the extracellular secretome from Botrytis (Fernández-Acero et al., 2006, 2010; Shah et al., 2009; Espino et al., 2010; Li et al., 2012; Delaunois et al., 2014; González et al., 2014; González-Fernández et al., 2014; Heard et al., 2015; for reviews on proteomics of phytopathogenic fungi, see González-Fernández and Jorrín-Novo, 2012; Bianco and Perrotta, 2015). In the present work, proteomic profiling by two-dimensional electrophoresis (2DE) in combination with mass spectrometry (MS) allowed detection and identification of statistically significant changes in the Botrytis proteome in the presence of different concentrations of SA or MeSA. After statistical analysis of the 2DE gels, several spots showed varying accumulation patterns in the presence of each compound, from which a number of proteins were identified by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). As a large number of the differentially accumulated proteins in the intracellular mycelium proteome potentially corresponded to secreted proteins, we also carried out comparative analyses of the Botrytis extracellular secretome in the absence or presence of SA or MeSA. The present results are discussed under the possibility that the signal molecules SA and MeSA may turn antifungal and vice versa in plant systems.

### MATERIALS AND METHODS

### Biological Material and Culture Conditions

Botrytis strain B05.10 was maintained on solid sporulation medium, as described by Rolland et al. (2009) and Cherrad et al. (2012). To study the mycelial radial growth, a plug of Botrytis mycelium was deposited at the center of a Petri dish (9 cm in diameter) containing a malt/agar medium composed of malt extract (20 g/L; Becton, Dickinson and Company), 2.0% glucose (w/v; Sigma), NH<sub>4</sub>Cl (0.1 M), and agar (15 g/L; Becton, Dickinson and Company) buffered at pH 5.0 or pH 7.0 (Tris-maleate 0.1 M), in the absence or presence of varying concentrations of SA, 5-sulfosalicylic acid (SSA; 2-hydroxy-5-sulfobenzoic acid), ASA (0.1 mM, 0.5 mM, 1 mM, 2.5 mM, or 5 mM) or MeSA (0.38 mM, 0.77 mM, 1.15 mM, 2.3 mM, or 5 mM), all compounds being obtained from Sigma. Mycelial radial growth was measured every day (four replicates including biological repeats). Cultures were carried out in a growth chamber thermostated at 21 ± 1 °C in the dark.

For proteomic analyses, the fungus was inoculated on cellophane sheets (Biorad) by streaking 1 × 10<sup>4</sup> spores gently over the surface of the membranes overlaid on the malt/agar medium described above (Shah et al., 2009; Mei et al., 2014) and transferred after 3 d onto Gamborg medium (Gamborg et al., 1968) buffered at pH 5.0 (Tris-maleate 0.1 M) and containing 0.1% glucose (w/v), supplemented or not with MeSA (0.38 mM) or SA (2.5 mM), for 24 h at 21 °C as described by Rolland et al. (2009) and Cherrad et al. (2012). Four biological replicates were carried out per assay. To collect intracellular proteins, the mycelium on the cellophane was lyophilized for 24 h and ground twice for 30 s with the disrupter/homogenizer TissueLyser II (Qiagen). Proteins were solubilized in an aqueous solution containing 4% (w/v) CHAPS (Sigma) and 1% (v/v) Protease Inhibitor Cocktail for yeast (Sigma), for 1 h at 4 °C and then centrifuged at 5000 g for 10 min at 4 °C. To collect the secreted proteins, the liquid medium below the cellophane sheets was recovered and submitted to a clarifying centrifugation at 4 °C for 15 min at 5000 g. The corresponding supernatants were used for proteome and secretome analyses, respectively.
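The treatment concentrations above are given in mM. As a rough aid for preparing such media, the following sketch converts them to mass per volume. The molar masses are standard textbook values and the helper names are hypothetical; neither comes from the article. SSA is left out because its molar mass depends on the hydration state of the supplied reagent, which the text does not specify.

```python
# Standard molar masses in g/mol (assumed here, not stated in the article).
MOLAR_MASS = {
    "SA": 138.12,    # salicylic acid, C7H6O3
    "ASA": 180.16,   # acetylsalicylic acid, C9H8O4
    "MeSA": 152.15,  # methyl salicylate, C8H8O3
}


def mg_per_litre(compound, millimolar):
    """Convert mM to mg/L: (mmol/L) x (g/mol) = mg/L."""
    return millimolar * MOLAR_MASS[compound]


def mg_for_volume(compound, millimolar, volume_ml):
    """Mass in mg needed to reach the given mM in volume_ml of medium."""
    return mg_per_litre(compound, millimolar) * volume_ml / 1000.0


if __name__ == "__main__":
    # The two conditions used for the proteomic analyses.
    for compound, conc in [("SA", 2.5), ("MeSA", 0.38)]:
        print(f"{compound} {conc} mM = {mg_per_litre(compound, conc):.1f} mg/L")
```

With these assumed molar masses, 2.5 mM SA corresponds to about 345 mg/L and 0.38 mM MeSA to about 58 mg/L.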

### Protein Extractions, 2D-PAGE and Densitometric Gel Analyses

Proteins were precipitated using trichloroacetic acid (TCA). TCA (10% w/v; Sigma) was added to the soluble proteins (intracellular mycelium proteome) or the centrifuged fungal media (extracellular secretome) and kept at 4 °C overnight. Proteins were pelleted by centrifugation at 14,000 g for 15 min at 4 °C and washed three times with ice-cold acetone (VWR Chemicals). Isoelectric focusing (IEF) was performed using the Protean IEF System (Biorad, France) according to the manufacturer's instructions. The rehydration buffer contained 8 M urea (Sigma-Aldrich) and 4% (w/v) CHAPS (Sigma). IEF was performed with 11 cm linear strips, pH 3–10 or pH 3–6 (Biorad), using the Voltage Ramp protocol recommended by the manufacturer: 100 V/30 min/rapid, 250 V/30 min/linear, 1000 V/30 min/linear, 7000 V/3 h/linear, and finally 32,000 V/h (pH 3–10 IPG) or 16,000 V/h (pH 3–6 IPG) (Cherrad et al., 2012). The second dimension was carried out using the Criterion Dodeca system (Biorad). A minimum of four gels loaded with biological replicates was used for each condition. Criterion any kD TGX gels (Biorad) were run at 10 °C in the Laemmli buffer system (Laemmli, 1970) at 100 V for 2 h (Cherrad et al., 2012). 2D gels were stained with silver nitrate as described (Catusse et al., 2008), then scanned and analyzed with the software SameSpots v.5 (Non-linear Dynamics Progenesis). A t-test of the spot volumes was calculated to compare the different treatments. Variations in spot volumes with p < 0.02 and fold-change > 4 were considered significant.
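The significance rule stated above (p < 0.02 and fold-change > 4) can be illustrated with a minimal sketch. It is not the SameSpots implementation: the text does not say which t-test variant was used or how fold-change was defined, so a two-sided pooled-variance Student's t-test and the ratio of the larger to the smaller mean spot volume are assumed here. The function names and the example spot volumes are hypothetical.

```python
import math
from statistics import mean, variance


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value for Student's t, by Simpson integration of the pdf."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += pdf(i * h) * (4 if i % 2 else 2)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def spot_is_significant(control, treated, p_max=0.02, min_fold=4.0):
    """Apply the p-value and fold-change thresholds to one spot.

    Assumes positive, non-constant spot volumes in both groups.
    """
    n1, n2 = len(control), len(treated)
    m1, m2 = mean(control), mean(treated)
    pooled = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    t = (m2 - m1) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    p = two_sided_p(t, n1 + n2 - 2)
    fold = max(m1, m2) / min(m1, m2)
    return (p < p_max and fold > min_fold), p, fold


if __name__ == "__main__":
    # Hypothetical normalized volumes for one spot, four replicate gels each.
    control = [100, 110, 90, 105]
    treated = [500, 520, 480, 510]
    print(spot_is_significant(control, treated))
```

Requiring both thresholds at once means that a spot with a highly reproducible but modest change (for example a two-fold increase) is not retained, which keeps the list of candidate spots short at the cost of missing subtler effects.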

### In-gel Digestion of Proteins and Sample Preparation for MS Analysis: Data Acquisition and Database Searching

Spots were destained in 25 mM ammonium bicarbonate (NH4HCO3), 50% (v/v) acetonitrile (ACN; VWR Chemicals) and shrunk in ACN for 10 min. After ACN removal, gel pieces were dried at room temperature. Proteins were digested by incubating each gel spot with 10 ng/µL of trypsin (T6567, Sigma-Aldrich) in 40 mM NH4HCO3, 10% (v/v) ACN, rehydrated at 4°C for 10 min, and finally incubated overnight at 37°C. The resulting peptides were extracted from the gel in three steps: a first incubation in 40 mM NH4HCO3, 10% (v/v) ACN for 15 min at room temperature and two incubations in 47.5% (v/v) ACN, 5% (v/v) formic acid (Sigma) for 15 min at room temperature. The three collected extractions were pooled with the initial digestion supernatant, dried in a vacuum centrifuge (SpeedVac; Eppendorf), and resuspended with 25 µL of 0.1% (v/v) formic acid before performing the nanoLC-MS/MS analysis (Cherrad et al., 2012).

Peptide mixtures were analyzed by on-line capillary nano HPLC (LC Packings, Amsterdam, The Netherlands) coupled to a nanospray LCQ Deca XP ion trap mass spectrometer (ThermoFinnigan, San Jose, CA, USA). Ten microliters of each peptide extract were loaded on a 300µm ID × 5 mm PepMap C18 precolumn (LC Packings, Dionex, USA) at a flow rate of 20µL/min. After 5 min desalting, peptides were online separated on a 75µm internal diameter × 15 cm C18 PepMapTM column (LC Packings, Amsterdam, The Netherlands) with a linear gradient of solvent B (5–40%) and solvent A (95%–60%) in 48 min (solvent A was 0.1% (v/v) formic acid in 5% (v/v) ACN, and solvent B was 0.1% (v/v) formic acid in 80% (v/v) ACN). The separation flow rate was set at 200 nL/min. The mass spectrometer operated in positive ion mode at a 1.9-kV needle voltage and a 4-V capillary voltage. Data acquisition was performed in a data-dependent mode alternating in a single run, a MS scan survey over the range m/z 300–1700 and three MS/MS scans with Collision Induced Dissociation (CID) as activation mode. MS/MS spectra were acquired using a 2-m/z unit ion isolation window, a 35% relative collision energy, and a 0.5 min dynamic exclusion duration.
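As a worked illustration of the gradient described above, the following sketch computes the percentage of solvent B and the resulting acetonitrile content during the 48 min ramp. The function names are hypothetical, and the calculation assumes a strictly linear ramp and ideal volume additivity of the two solvents.

```python
def percent_b(t_min: float, start_b: float = 5.0, end_b: float = 40.0,
              duration_min: float = 48.0) -> float:
    """Percentage of solvent B at time t for a linear gradient (clamped to the ramp)."""
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


def percent_acn(t_min: float, acn_in_a: float = 5.0, acn_in_b: float = 80.0) -> float:
    """Effective acetonitrile percentage (v/v) of the A/B mixture at time t.

    Solvent A contains 5% ACN and solvent B contains 80% ACN, as stated in the text.
    """
    b = percent_b(t_min) / 100.0
    return (1.0 - b) * acn_in_a + b * acn_in_b


for t in (0, 24, 48):
    print(f"{t:>2} min: {percent_b(t):.1f}% B, {percent_acn(t):.2f}% ACN")
```

Under these assumptions the acetonitrile content rises from 8.75% to 35% (v/v) over the ramp.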

Mascot and Sequest algorithms through Proteome Discoverer 1.4 Software (Thermo Fisher Scientific Inc., USA) were used for protein identification against the Broad Institute Botrytis cinerea database (http://www.broadinstitute.org/annotation/genome/botrytis\_cinerea/MultiHome.html; 16,448 entries; Amselem et al., 2011). Two missed enzyme cleavages were allowed. Mass tolerances in MS and MS/MS were set to 2 Da and 1 Da, respectively. Oxidation of methionine and carbamidomethylation on cysteine were searched as dynamic and static modifications, respectively. Peptide validation was performed using Target Decoy PSM Validator and only "high confidence" peptides were retained corresponding to a 1% false positive rate at peptide level. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (Vizcaíno et al., 2014) via the PRIDE partner repository (http://www.ebi.ac.uk/pride/help/archive/about) with the dataset identifier PXD002873.
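The target-decoy principle behind the 1% peptide-level threshold can be illustrated as follows. This is a generic sketch and not the algorithm implemented in Proteome Discoverer's Target Decoy PSM Validator: the `Psm` tuple, the example scores, and the simple decoys/targets estimate are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Psm(NamedTuple):
    """Hypothetical peptide-spectrum match: a search score and a decoy flag."""
    score: float
    is_decoy: bool


def score_threshold(psms: List[Psm], max_fdr: float = 0.01) -> Optional[float]:
    """Return the lowest score cutoff at which the estimated FDR is within max_fdr.

    The FDR is estimated as (#decoys) / (#targets) among the matches scoring
    at or above the cutoff. Returns None if no cutoff satisfies the limit.
    """
    best = None
    targets = decoys = 0
    for psm in sorted(psms, key=lambda p: p.score, reverse=True):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = psm.score
    return best


# Made-up example: 100 target matches (scores 1..100) and two decoy matches.
example = [Psm(float(s), False) for s in range(1, 101)]
example += [Psm(20.5, True), Psm(10.5, True)]
cutoff = score_threshold(example, max_fdr=0.01)
```

In this toy data set, the first decoy already pushes the estimate above 1%, so only the matches scoring above it are retained.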

### Bioinformatics

The Fungal Secretome Database 3.0 (Choi et al., 2010) was used to collect annotations and the predictions of signal peptide prediction programs (Bendtsen et al., 2004b; Emanuelsson et al., 2007; Caccia et al., 2013). SecretomeP 2.0 (http://www.cbs.dtu.dk/services/SecretomeP/) was also used to provide information related to non-classical secretory proteins (Bendtsen et al., 2004a). Secreted proteins were classified into functional categories as described (Espino et al., 2010; Cherrad et al., 2012).

### RESULTS

### Growth Curves

The impact of salicylic acid (SA) and derivatives on mycelium growth of Botrytis is presented in **Table 1** and **Supplemental Figure S1**. It appears that methylsalicylic acid (MeSA) was the most active compound in impeding fungal growth, followed by acetylsalicylic acid (ASA) and SA. In contrast, 5-sulfosalicylic acid (SSA) did not entail any growth reduction (**Table 1**; **Supplemental Figure S1**). At all culture times and at all concentrations of SA and derivatives used, we checked that the pH of the culture media was not affected by MeSA, SA, ASA, or SSA addition compared to control conditions (data not shown). Based on these results, to provide clues as to the molecular mechanisms underlying the Botrytis response to two of the investigated compounds, MeSA and SA, we used a proteomics approach toward the characterization of the intracellular proteome and the extracellular secretome of the treated fungal cells. For these comparative proteomics experiments, we used the minimum concentration of MeSA (0.38 mM) or SA (2.5 mM) at which the smallest effect was observed on mycelial growth (**Supplemental Figure S1**). This protocol allows observation of the early events of the inhibition process and minimizes cell death and possible cell lysis that would complicate the analysis of the extracellular secretome.

Measurements of mycelium growth (four replicates) in control conditions or in the presence of 5 mM MeSA, SA, ASA, or SSA were taken at 6 d of the cultures as detailed in Materials and Methods. Standard deviations are shown.

### Intracellular Proteome Effect of MeSA

A typical 2D gel obtained for the MeSA-treated mycelium is shown in **Figure 1B**. By visual inspection, there was a major impact of MeSA on the mycelium proteome. Thus, a number of basic spots disappeared from the control accompanied by an increase in the number of acidic spots in the MeSA-treated proteome (compare **Figures 1A,B**). This was confirmed by global densitometric analyses of the 2D gels (**Supplemental Figure S2**).

Densitometric analyses of the 2D-gels from MeSA-treated vs. control mycelium (**Figures 1A,B**; four replicates) revealed that the volumes of 48 spots varied (p < 0.02; 4-fold change) (**Supplemental Table S1**; **Supplemental Figure S3**), of which 37 contained a single protein, 10 contained two proteins, and one contained three proteins for a total of 60 proteins (**Supplemental Table S1**). The largest functional category comprised 19 proteins (31.7%) and was associated with disease/defense, immunity/defense, and stress response mechanisms (collectively referred to as disease/defense/stress in **Figure 2**). The second and third largest functional categories were each composed of 18 proteins (30%). They were associated with the protein metabolism and modification category and with enzymes involved in various metabolic processes collectively referred to as metabolism in **Figure 2** of which 14 corresponded to various proteases (**Supplemental Table S1**; **Figure 2**; **Table 2**).

Out of the 60 proteins found in the 48 differentially accumulated spots, 18 (30%) exhibited a transit peptide. Yet SecretomeP predicted that 24 transit-peptide-devoid proteins could be secreted through non-canonical secretion pathways (Nickel, 2003). Thereby in total a large proportion (70%) of the proteins found in the differentially accumulated spots in the presence of MeSA corresponded to putatively secreted proteins (**Supplemental Table S1**; **Figure 3**).

The same trends were observed for the 37 proteins obtained from differentially accumulated spots containing a single protein (**Supplemental Table S1**). The largest functional categories corresponded to protein metabolism and modification (16 proteins; 43.2%); proteins involved in various metabolic processes (11 proteins; 29.7%); and disease/defense, immunity/defense, and stress response mechanisms (nine proteins; 24.3%; **Table 2**; **Supplemental Table S1**). Sixteen proteins present in differentially accumulated spots containing a single protein were predicted to contain a transit peptide (TargetP, SignalP). Furthermore, SecretomeP predicted that 13 transit-peptide-devoid proteins could be secreted through non-canonical secretion pathways. Again, a large proportion (78.4%) of the MeSA-responsive proteins present in unique spots corresponded to putatively secreted proteins (**Supplemental Table S1**).
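The counts reported for the MeSA-treated intracellular proteome can be cross-checked with a short calculation. The sketch below is only a bookkeeping aid: the helper names are hypothetical, and every number is taken from the text above.

```python
def total_proteins(spots_by_protein_count: dict) -> int:
    """Sum proteins over spots, given {proteins per spot: number of spots}."""
    return sum(n_proteins * n_spots
               for n_proteins, n_spots in spots_by_protein_count.items())


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal, as reported in the text."""
    return round(100.0 * part / whole, 1)


# MeSA-treated intracellular proteome: 37 spots with one protein,
# 10 spots with two proteins, and 1 spot with three proteins.
mesa_spots = {1: 37, 2: 10, 3: 1}
n_spots = sum(mesa_spots.values())        # 48 differentially accumulated spots
n_proteins = total_proteins(mesa_spots)   # 60 proteins

# Putatively secreted = predicted transit peptide + SecretomeP (non-classical).
secreted_all = percent(18 + 24, n_proteins)   # over all 60 proteins
secreted_single = percent(16 + 13, 37)        # single-protein spots only

print(n_spots, n_proteins, secreted_all, secreted_single)
```

The two proportions correspond to the 70% and 78.4% quoted for all proteins and for single-protein spots, respectively.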

### Effect of SA

As for MeSA, the volume of a large number of acidic spots increased in the SA-treated intracellular proteome, which was accompanied by a decreased number of basic spots relative to the control (compare **Figures 1A,C**; **Supplemental Figure S2**). Densitometric analyses of the 2D gels (**Figures 1A,C**; four replicates) revealed that the volumes of 60 spots varied (p < 0.02; 4-fold change; **Table 2**; **Supplemental Figure S3**; **Supplemental Table S1**), of which 33 contained a single protein, 19 contained two proteins, six contained three proteins, one contained four proteins, and one contained five proteins, for a total of 98 proteins.

As for MeSA, the three largest functional categories corresponded to protein metabolism and modification (51 proteins; 52%) of which 45 corresponded to various proteases; various proteins involved in metabolism (25 proteins; 25.5%); and disease/defense, immunity/defense, and stress response mechanisms (17 proteins; 17.3%) (**Supplemental Table S1**; **Figure 2**).

Out of the proteins found in the differentially accumulated spots, 59 (60.2%) exhibited a transit peptide. Furthermore, SecretomeP predicted that 19 proteins not predicted to contain a transit peptide could be secreted through non-canonical secretion pathways. Thereby, as in the case of MeSA a large proportion (80.6%) of the proteins found in differentially accumulated spots in the presence of SA corresponded to putatively secreted proteins (**Supplemental Figure S1**; **Figure 3**).

The same analysis was conducted for the 33 proteins present in spots containing a single protein. The three largest functional categories corresponded to protein metabolism and modification (12 proteins; 36.4%); metabolism (11 proteins; 33.3%); and disease/defense, and stress response mechanisms (nine proteins; 27.3%) (**Table 2**; **Supplemental Table S1**). Out of these 33 proteins, 14 were predicted to contain a transit peptide. Moreover, SecretomeP predicted that 10 proteins not predicted to contain a transit peptide could be secreted through non-canonical secretion pathways (**Supplemental Table S1**). Again, a large proportion (72.7%) of the SA-responsive proteins present in unique spots corresponded to putatively secreted proteins (**Table 2**; **Supplemental Table S1**).

### Extracellular Secretome

The above findings suggested that a large proportion of proteins found in differentially accumulated spots of the intracellular proteomes corresponded to potentially secreted proteins (**Supplemental Table S1**; **Figure 3**). To assess whether such modifications in intracellular protein abundance could be reflected at the level of the corresponding fungal extracellular secretomes, we prepared protein extracts from the extracellular growth media as described in Materials and Methods. Here the extracellular secretome corresponds to the proteins found in the liquid medium below the cellophane membrane on which fungal cells are grown. This prevents the contamination of the extracellular secretome (as presently defined) by intact fungal cells. Furthermore, growing fungal cells on a solid surface rather than in a liquid medium better reflects the conditions under which Botrytis infections of plants naturally occur (Shah et al., 2009). Typical gels of the extracellular secretomes obtained in such conditions are shown in **Figure 4**. These gels appeared to have a somewhat lower resolution than those obtained for the intracellular proteomes (**Figure 4**). In fact, such behavior was repeatedly observed in previous studies of the extracellular secretomes of plant fungal pathogens or free-living fungi (Medina et al., 2005; Oda et al., 2006; Cobos et al., 2010; Espino et al., 2010; Fernández-Acero et al., 2010; Lu et al., 2010; Jung et al., 2012; Yang et al., 2012; González et al., 2013, 2014; Fernandes et al., 2014; Gómez-Mendoza et al., 2014). One reason could be the high amount of polysaccharides and the presence of low-molecular-weight metabolites in fungal secretomes (Chevallet et al., 2007; Erjavec et al., 2012; Fernandes et al., 2014). These molecules are known to interfere with protein extraction and separation methods (Lemos et al., 2010). Despite this difficulty, differentially accumulated spots could be revealed by image analysis of 2D gels upon MeSA or SA treatments of the Botrytis mycelia (**Figure 4**; **Supplemental Table S2**).

In the presence of MeSA six spots were differentially accumulated (**Supplemental Table S2**; **Supplemental Figure S4**). Two contained a single protein, one contained two proteins and two contained three proteins, for a total of 10 proteins. They all possessed a transit peptide (**Supplemental Table S2**). The observed functional categories were cellular control, carbohydrate metabolism, amino acid metabolism, and immunity and defense (**Supplemental Table S2**).

In the presence of SA, 22 spots exhibited significant volume variations with respect to control (**Supplemental Figure S4**). Among them, 16 contained a single protein (**Table 3**), four contained two proteins, and two contained three proteins, for a total of 30 proteins (**Supplemental Table S2**), of which 25 (83.3%) possessed a transit peptide (**Supplemental Table S2**). Furthermore, SecretomeP predicted that two of the proteins that did not exhibit a transit peptide could be secreted through non-canonical secretion pathways (**Supplemental Table S2**). Thus, in total as many as 90% of the proteins found in differentially accumulated spots in the presence of SA were predicted as being secreted. These were distributed in functional categories corresponding to carbohydrate metabolism, cell redox homeostasis, cellular control, disease/defense, protein metabolism, and modifications (**Supplemental Table S2**).

TABLE 2 | Differentially accumulated intracellular proteins in the presence of MeSA or SA in the Botrytis culture medium compared to untreated control fungal cells.

Only the proteins present in differentially accumulated spots containing a single protein are listed. For other details, see Supplemental Table S1.

### DISCUSSION

In good agreement with previous studies showing that SA (Prithiviraj et al., 1997; Amborabé et al., 2002; Cory and Cory, 2005; Meyer et al., 2006; Wu et al., 2008; Qi et al., 2012; Zhou et al., 2012; Panahirad et al., 2014), ASA (Alem and Douglas, 2004; Stepanović et al., 2004; Leeuw et al., 2007, 2009; Moret et al., 2007; Sebolai et al., 2008; Trofa et al., 2009; Swart et al., 2011; Zhou et al., 2012) or MeSA (Schadler and George, 2006) can directly impede growth in several fungal species, the present study documents that three of the four compounds analyzed (SA, ASA, and MeSA, but not SSA) showed fungistatic activity toward Botrytis. That SSA was not active in blocking Botrytis growth also agrees with the absence of reports of a fungistatic activity for this molecule. Very interestingly, several studies also reported that, in addition to its role as a signaling molecule, SA can alter in vivo the growth of various microorganisms in interaction with plants. Thus, previous work in Arabidopsis indicated that accumulation of SA in the intercellular space is an important component of basal/PAMP-triggered immunity as well as effector-triggered immunity to pathogens that colonize the intercellular space (Cameron and Zaton, 2004; Carviel et al., 2009, 2014). The present data are also in good agreement with previous studies on the effect of SA on symbiotic root microbiomes both in bacteria and fungi (Medina et al., 2003; Stacey et al., 2006) and with recent results showing that plant SA, as well as exogenously applied SA, can help sculpt the root microbiome by modulating colonization of the root by specific bacterial families (Lebeis et al., 2015).

Phenolic compounds, sodium salicylate, and related compounds have been reported to inhibit tumor cell growth in mouse leukemia L1210 cells (Cory and Cory, 2005). Interestingly, this study revealed that the IC<sub>50</sub> values (half maximal inhibitory concentrations) determined for these compounds correlated extremely well with the apparent ability of the drugs to enter the cells as estimated by the ratio of octanol-aqueous distribution (Leo et al., 1971; Unger et al., 1978); in particular this octanol-aqueous distribution accounted for the very low activity observed with SSA compared to that measured for SA (Cory and Cory, 2005). In addition, it is known that SA methylation increases its membrane permeability, as well as its volatility, thus allowing more effective long distance transport of this defense signal (Dempsey et al., 2011). Hence a likely explanation for the observed differences in the ability of these compounds to inhibit Botrytis growth could be related to their relative efficiencies to penetrate into the fungal cells.
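For reference, the octanol-aqueous distribution invoked here is commonly expressed through a partition coefficient. The expressions below are standard textbook definitions added as a reading aid; they are not taken from Cory and Cory (2005), and the second one assumes a monoprotic acid HA whose neutral form alone partitions into octanol.

$$
P = \frac{[\mathrm{HA}]_{\text{octanol}}}{[\mathrm{HA}]_{\text{aqueous}}},
\qquad
\log D = \log P - \log\left(1 + 10^{\,\mathrm{pH} - \mathrm{p}K_a}\right)
$$

Under this assumption, a compound that is largely ionized at the pH of the medium has a distribution ratio D well below its P, which would be in line with the poor cell entry proposed for SSA, whose sulfonic acid group is a strong acid.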

Knowing the existence of enzymes catalyzing the conversion of ASA or MeSA into SA, one may wonder whether these SA derivatives could directly inhibit Botrytis growth or if their action resulted from their conversion to SA or to MeSA. ASA esterase (EC 3.1.1.55), which catalyzes the hydrolysis of ASA to yield SA and acetate, has been widely described in animals and humans (Spenney and Nowell, 1979; Ali and Kaur, 1983; White and Hope, 1984; Kim et al., 1990), but there are no reports on the existence of this enzyme in plants or fungi. In agreement, BLAST searches against the Botrytis genome did not confirm the existence of such an enzyme in this fungus (data not shown). In the case of MeSA, the tobacco methylsalicylate esterase SABP2 (EC 3.1.1.-), a 29 kDa protein, catalyzes the conversion of MeSA into SA to induce SAR (Kumar and Klessig, 2003; Forouhar et al., 2005; Tripathi et al., 2010). While BLAST searches confirmed the existence of SABP2 in various plant species, they did not support the existence of such an enzyme in fungal species, notably in Botrytis (data not shown). Also, BLAST searches did not support the existence in Botrytis of a SA methyltransferase (SAMT 1) analogous to that found in tobacco (Ross et al., 1999; Park et al., 2007) (data not shown). The results therefore suggest that ASA, MeSA and SA are active per se in Botrytis growth inhibition.

We further show that MeSA and SA treatments substantially modify the Botrytis intracellular and extracellular mycelial proteome. In the following we discuss some specific aspects of the observed modifications.

### pI SHIFT

For both SA- and MeSA-treated intracellular mycelium proteomes we observed a large pI shift in the localization of the revealed spots on 2D gels. Thus, the addition of either of these two molecules in the Botrytis culture medium was accompanied with an accumulation of spots located in the acidic pI range of the 2D gels, while there was a decreased number of protein spots located in a more basic region of the 2D gels (**Figure 2**). As a somewhat similar behavior was noted when changing the pH of the culture medium from 5.0 to 7.0 (**Supplemental Figure S2**), it is possible that at least part of the effects of SA and MeSA reflects a change in pH regulation in Botrytis. Many fungi grow over a wide pH range and their gene expression is tailored to the environmental pH. In Aspergillus nidulans, the transcription factor PacC, an activator of genes expressed in alkaline conditions and a repressor of those expressed in acidic conditions, undergoes two consecutive proteolytic events, the first being pH-signal dependent and the second proteasomal (Peñalva et al., 2008). In previous work we suggested a possible link between pH regulation and metal response in Botrytis (Cherrad et al., 2012). Consistent with this, it has been documented that Rim101 (the ortholog of PacC in yeasts) and PacC are involved in metal (iron or zinc) homeostasis in yeasts (Conde e Silva et al., 2009; Ariño, 2010; Linde et al., 2010) and filamentous fungi such as Aspergillus fumigatus (Amich et al., 2009; Cherrad et al., 2012). Furthermore, in Aspergillus nidulans, biosynthesis and uptake of siderophores are regulated not only by iron availability but also by ambient pH through the transcription factor PacC (Eisendle et al., 2004). In this context, it is therefore of interest to note that SA and its derivatives can form chelate compounds with metal ions (Perrin, 1958). 
Hence an alteration in metal homeostasis could provide an explanation for the observed changes in the intracellular proteomes in the presence of MeSA or SA.

Although in our experiments the extracellular pH was not affected upon addition of ASA, MeSA, or SSA to the culture medium, we cannot rule out the possibility that accumulation of these molecules within Botrytis cells entailed a modification of the intracellular pH that would have been perceived by the PacC regulatory system. Further work is needed to address this question.

Methods, differentially accumulated spots (p < 0.02; 4-fold change) were submitted to MS analysis and the proteins characterized were listed in Supplemental Table S2.

### Metabolism

Nulton-Persson et al. (2004) reported that treatment of isolated cardiac mitochondria with SA or ASA resulted in alterations of mitochondrial respiration, most presumably through inhibition of the Krebs cycle complex alpha-ketoglutarate dehydrogenase (KGDH; EC 1.2.4.2). It is therefore interesting to note that mitochondrial dihydrolipoyl dehydrogenase (EC 1.8.1.4), the E3 component of the KGDH complex, was strongly up-accumulated in the SA- and ASA-treated intracellular mycelial proteomes (**Supplemental Table S1**). We also note that the accumulation of aconitase (another enzyme in the Krebs cycle; EC 4.2.1.3) was modified in the SA- and ASA-treated intracellular mycelial proteomes (**Supplemental Table S1**). Since a number of fungicides behave as very potent inhibitors of mitochondrial respiration (Gisi et al., 2002; Avenot and Michailides, 2010; Belenky et al., 2013; Sierotzki and Scalliet, 2013), one possibility to account for the inhibition of Botrytis growth in the presence of MeSA, ASA or SA could rely on perturbation of mitochondrial respiration.

### Proteases

For the MeSA-treated mycelium 14 proteins corresponded to various proteases (**Supplemental Table S1**; **Figure 2**) corresponding to five different enzymes, BC1G\_02949 (tripeptidyl-peptidase 1), BC1G\_03070 (rhizopuspepsin-2), BC1G\_03711 (serine carboxypeptidase 3), BC1G\_06836 (subtilase-type proteinase psp3), and BC1G\_06849 (vacuolar protease A). For the SA-treated intracellular mycelial proteome, the main functional category of the proteins found to be present in the highly accumulating spots corresponded to proteases (**Supplemental Table S1**). Most of them (88.8%) were predicted as being secreted and corresponded to six types of proteases, namely BC1G\_01026 and BC1G\_02944 (tripeptidyl-peptidase 1, a serine protease), BC1G\_03070 (rhizopuspepsin-2, an aspartic protease), BC1G\_03711 (serine carboxypeptidase 3, a serine protease), BC1G\_06320 (dipeptidase 1, a zinc metallopeptidase), BC1G\_06836 (subtilase-type proteinase psp3, a serine protease), and BC1G\_06849 (vacuolar protease A, an aspartic protease) (**Supplemental Table S1**; **Figure 2**; http://merops.sanger.ac.uk/cgi-bin/speccards?sp=sp001886;type=peptidase). Five of the presently characterized proteases (BC1G\_01026, BC1G\_02944, BC1G\_03070, BC1G\_06836, and BC1G\_06849) have previously been found in the extracellular Botrytis secretome (Espino et al., 2010; Fernández-Acero et al., 2010; Li et al., 2012; González-Fernández et al., 2014). Of these, only BC1G\_03070 was presently found in differentially accumulated spots of the extracellular secretome in the presence of SA (**Supplemental Table S2**). One possibility to explain this behavior could rely on the different growing conditions used in the present work and in previous studies aiming at characterizing the Botrytis secretome. As stressed by Espino et al. (2010) the Botrytis secretome is highly adaptive, in the sense that very different sets of proteins are detected in the secretome when the growth conditions, or the age of the mycelium, differ. In particular these authors observed that only three proteins from a total number of 238 experimentally characterized extracellular proteins are present in 10 different experimental conditions, implying that Botrytis can greatly alter the composition of the secreted protein pool to meet the requirement of the different growing needs (Espino et al., 2010).

TABLE 3 | Differentially accumulated extracellular proteins in the presence of MeSA or SA in the Botrytis culture medium compared to untreated control fungal cells.

Only the proteins present in differentially accumulated spots containing a single protein are listed. For other details, see Supplemental Table S2.

Besides the well-established role of extracellular proteolytic activity in fungal pathogenicity (Naglik et al., 2003; ten Have et al., 2010; Jashni et al., 2015), intracellular proteinases were also shown to play important functional roles in fungi (Li and Kane, 2009). In particular, vacuolar proteases have been reported to be essential in morphogenesis and adaptation to ambient nutritional conditions (Yike, 2011). We note that BC1G\_06849 presently found in the intracellular proteome of Botrytis cells treated with either SA or MeSA (**Supplemental Table S1**) is predicted to encode a vacuolar protease (ten Have et al., 2010).

### ROS Detoxification

Four enzymes involved in ROS detoxification, namely catalase (BC1G\_12146; EC 1.11.1.6), ascorbate peroxidase (BC1G\_08301; EC 1.11.1.11), manganese superoxide dismutase (BC1G\_01910; EC 1.15.1.1), and peroxiredoxin (BC1G\_09932; EC 1.11.1.15), were strongly depressed in the intracellular mycelium proteome upon MeSA or SA addition (**Supplemental Table S1**). Therefore, it is possible that the reduction of fungal growth observed in the presence of these compounds arose from an increased ROS accumulation resulting in an oxidative stress. It is worth noting that in plants SA specifically inhibits the H<sub>2</sub>O<sub>2</sub>-degrading activity of catalase (Chen et al., 1993) and of ascorbate peroxidase (Durner and Klessig, 1995), while SA treatment induces an increase in H<sub>2</sub>O<sub>2</sub> concentrations in vivo, suggesting that SA may facilitate H<sub>2</sub>O<sub>2</sub> accumulation during the oxidative burst induced by infection with avirulent pathogens (reviewed by Vlot et al., 2009). It is therefore very interesting to note that SA could target the same enzymes in plants and phytopathogenic fungi.

### Cell Wall Remodeling and Integrity

An analysis of the intracellular mycelium proteome revealed that spots containing the cerato-platanin related protein (CPP) had reduced volumes upon SA addition to the culture medium (**Supplemental Figure S1**). The CPPs, originally discovered in Ceratocystis fimbriata f. sp. platani (Pazzagli et al., 1999), are elicitors or effector proteins that belong to a larger class of fungal proteins defined as SSCPs or SSPs: small, secreted (cysteine rich) proteins (Lamdan et al., 2015). Thus, Frías et al. (2011, 2013, 2014) showed that BcSpl1, a member of the CPP family, is required for full virulence in Botrytis and elicits the hypersensitive response (HR) in the plant host. Also, ectopic expression of the Magnaporthe oryzae CPP gene, MgSM1, upregulates the expression of plant defense genes such as PR-1, PR-5, and PDF1.2 and induces local hypersensitivity reactions (Yang et al., 2009). It is therefore very interesting to note that the same SA molecule can both elicit HR in plants and alter the accumulation of CPPs in phytopathogenic fungi. Presumably this would allow fine-tuning of the regulation of HR during plant infection by phytopathogenic fungi. CPPs are structurally related to expansins, which are proteins associated with carbohydrate binding and loosening of the cellulose scaffolds in plant cell walls (de Oliveira et al., 2011). Therefore, in addition to such a role in virulence, fungal CPPs might also be involved in growth and development, which could account for their presence in the fungal cell wall, where they could act by disrupting non-covalent interactions between fungal cell wall components: for example, between β-glucan or chitin chains (Gaderer et al., 2014; Baccelli, 2015). In this way CPPs could act in all those processes requiring remodeling and enlargement of the fungal cell wall (Baccelli, 2015). Our present observation might support such a role, thus accounting for Botrytis growth reduction in the presence of MeSA and SA. It is also interesting to note that the accumulation of the Botrytis Bcspl1 CPP (BC1G\_02163) dramatically increased in the extracellular secretome in the presence of plant extracts (Shah et al., 2009), suggesting that the accumulation of this protein in the extracellular secretome is modulated by the interaction between plant and fungal cells.

Another protein revealed in this work was the sporulation-specific protein 2 (BC1G\_10630), which corresponds to the GPI-anchored cell wall organization protein ECM33 of Erysiphe necator (Pardo et al., 2004). This protein was detected in spots showing increased volumes in the intracellular and extracellular proteomes of SA-treated Botrytis cells (**Supplemental Table S1**). The cell wall protein ECM33 was shown to be important for cell wall integrity in terms of correct assembly of the mannoprotein layer (Pardo et al., 2004; Chabane et al., 2006; Martinez-Lopez et al., 2006). The present results suggest that the observed alterations in the accumulation of this protein in response to SA or MeSA treatment may represent a compensatory response to reduced cell wall integrity in Botrytis.

### CONCLUSION

In the present work, we studied the impact of SA and SA derivatives on Botrytis growth. By using proteomics we revealed several potential mechanisms that could account for the observed fungal growth inhibition, notably pH regulation, metal homeostasis, mitochondrial respiration, ROS accumulation, and cell wall remodeling. During infection there is ample circumstantial evidence that the plant host synthesizes SA as a signaling molecule to induce the HR response. Apoplastic SA concentrations in tobacco can be quite high at the HR lesions (Huang et al., 2006). Provided that the actual SA (or MeSA) concentrations at the infection points are high enough the present observations support a role played by the phytohormone SA and derivatives in directly containing the pathogens. As stressed by Vlot et al. (2009) SA is indeed a multifaceted hormone to combat disease.

### ACKNOWLEDGMENTS

We thank Marie-Pascale Latorse (Bayer CropScience, Lyon), Nathalie Poussereau (UMR5240, Lyon), Cécile Ribot (UMR5240, Lyon) and Catherine Sirven (Bayer CropScience, Lyon) for helpful discussions. We are grateful to the Reviewers for their helpful comments that have led to important improvements of the original manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00859

Supplemental Figure S1 | Relative growth of Botrytis in the absence (control) or presence of the indicated compounds (ASA, SA, MeSA, and

### REFERENCES


SSA). Measurements of mycelium growth (four replicates) were effected at 3 d of the cultures as detailed in Materials and Methods.

Supplemental Figure S2 | Distribution of the total absorbance in acidic (1) and basic (2) regions of 2D gels for intracellular mycelium proteome. (A) Acidic and basic regions of 2D gels. Gels shown corresponded to untreated mycelium grown at pH 5.0 (Control pH 5.0) or at pH 7.0 (Control pH 7.0) or to MeSA-treated mycelium grown at pH 5.0 (MeSA pH 5.0) or to SA-treated mycelium grown at pH 5.0 (SA pH 5.0). The acidic region (1) corresponded to the 3.0–6.0 pI range and the basic region (2) corresponded to the 6.0–10.0 pI range of the 2D gels (separated by the blue lines). (B) Measurements of total absorbance in regions 1 and 2 of the 2D gels shown in (A). The total absorbance in each of the two regions was measured using the software Mesurim (http://acces.ens-lyon.fr/acces/logiciels/mesurim/guide-dutilisation/mesures-sur-limage#lumsurf) and three replicates for each condition. The ratios of total absorbance in region 2/total absorbance in region 1 are listed.

Supplemental Figure S3 | Spot numbering for spots showing variations in spot volumes in the Botrytis intracellular mycelium proteome treated with MeSA or SA. The differentially accumulated spots are depicted on a typical 2D-gel corresponding to the intracellular proteome of untreated Botrytis control: Red arrows, differentially accumulated spots from MeSA-treated mycelium; green arrows, differentially accumulated spots from SA-treated mycelium; blue arrows, differentially accumulated spots from both MeSA- and SA-treated mycelium. The proteins contained in the various differentially accumulated spots are listed in Supplemental Table S1.

Supplemental Figure S4 | Spot numbering for spots showing variations in spot volumes in the Botrytis extracellular mycelium secretome upon mycelium treatment with MeSA (0.38 mM) or SA (2.5 mM). The differentially accumulated spots are displayed on a typical 2D-gel corresponding to the extracellular secretome of control untreated Botrytis: Red arrows, differentially accumulated spots from MeSA-treated mycelium; green arrows, differentially accumulated spots from SA-treated mycelium; blue arrows, differentially accumulated spots from both MeSA- and SA-treated mycelium. The proteins contained in the various differentially accumulated spots are listed in Supplemental Table S2.

Supplemental Table S1 | List of intracellular proteins present in differentially accumulated spots upon comparing control untreated and MeSA- or SA-treated Botrytis mycelium. MeSA or SA concentrations were at 0.38 mM or 2.5 mM, respectively.

Supplemental Table S2 | List of proteins present in differentially accumulated spots of extracellular secretomes upon comparing control untreated and MeSA- or SA-treated Botrytis mycelium. MeSA or SA concentrations were at 0.38 mM or 2.5 mM, respectively.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Dieryckx, Gaudin, Dupuy, Bonneu, Girard and Job. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Temperature Modulates the Secretome of the Phytopathogenic Fungus Lasiodiplodia theobromae

Carina Félix<sup>1</sup>, Ana S. Duarte<sup>1</sup>, Rui Vitorino<sup>2,3</sup>, Ana C. L. Guerreiro<sup>4</sup>, Pedro Domingues<sup>4</sup>, António C. M. Correia<sup>1</sup>, Artur Alves<sup>1</sup> and Ana C. Esteves<sup>1</sup>\*

<sup>1</sup> Department of Biology, Centre for Environmental and Marine Studies, University of Aveiro, Aveiro, Portugal, <sup>2</sup> Department of Medical Sciences, Institute for Research in Biomedicine, University of Aveiro, Aveiro, Portugal, <sup>3</sup> Department of Physiology and Cardiothoracic Surgery, Faculty of Medicine, University of Porto, Porto, Portugal, <sup>4</sup> Department of Chemistry and QOPNA, University of Aveiro, Aveiro, Portugal

#### Edited by:

Delphine Vincent, Department of Environment and Primary Industries, Australia

#### Reviewed by:

Zonghua Wang, Fujian Agriculture and Forestry University, China

Robyn Peterson, Macquarie University, Australia

> \*Correspondence: Ana C. Esteves acesteves@ua.pt

#### Specialty section:

This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science

Received: 27 May 2016 Accepted: 11 July 2016 Published: 03 August 2016

#### Citation:

Félix C, Duarte AS, Vitorino R, Guerreiro ACL, Domingues P, Correia ACM, Alves A and Esteves AC (2016) Temperature Modulates the Secretome of the Phytopathogenic Fungus Lasiodiplodia theobromae. Front. Plant Sci. 7:1096. doi: 10.3389/fpls.2016.01096

Environmental alterations modulate host–microorganism interactions. Little is known about how climate changes can trigger pathogenic features in symbiont or mutualistic microorganisms. Current climate models predict increased environmental temperatures. The exposure of phytopathogens to these changing conditions can have particularly relevant consequences for economically important species and for humans. The impact on pathogen/host interactions and the shift in their biogeographical range can induce different levels of virulence in new hosts, causing massive losses in the agricultural and health fields. Lasiodiplodia theobromae is a phytopathogenic fungus responsible for a number of diseases in various plants. It has also been described as an opportunistic pathogen in humans, causing infections with different levels of severity. L. theobromae has a high capacity of adaptation to different environments, such as woody plants, moist argillaceous soils, or even humans, being able to grow and infect hosts in a wide range of temperatures (9–39°C). Nonetheless, the effect of an increase of temperature, as predicted in climate change models, on L. theobromae is unknown. Here we explore the effect of temperature on two strains of L. theobromae – an environmental strain, CAA019, and a clinical strain, CBS339.90. We show that both strains are cytotoxic to mammalian cells, but while the environmental strain is cytotoxic mainly at 25°C, the clinical strain is cytotoxic mainly at 30 and 37°C. Extracellular gelatinolytic, xylanolytic, amylolytic, and cellulolytic activities at 25 and 37°C were characterized by zymography, and the secretomes of both strains grown at 25, 30, and 37°C were characterized by electrophoresis and by Orbitrap LC-MS/MS. More than 75% of the proteins were identified, mostly enzymes (glycosyl hydrolases and proteases). The strains showed different protein profiles, which were affected by growth temperature. Also, strain-specific proteins were identified, such as a putative f5/8 type c domain protein – known for being involved in pathogenesis – in strain CAA019 and a putative tripeptidyl-peptidase 1 protein in strain CBS339.90. We showed that temperature modulates the secretome of L. theobromae. This modulation may be associated with host-specificity requirements. We show that the study of abiotic factors, such as temperature, is crucial to understand host/pathogen interactions and their impact on disease.

Keywords: phytopathogenic fungi, extracellular enzymes, secretome, cytotoxicity, global changes

### INTRODUCTION


It is widely accepted that the climate is changing at a global level. We will witness increased temperature and climatic extremes such as drought, floods, and storms (Piñeiro et al., 2010; Galant et al., 2012). Nonetheless, little effort has been directed to the identification of the impact that these forecasted conditions, specifically increased temperature, will have on microbial pathogen (MP)/host interactions (Eastburn et al., 2011; Gallana et al., 2013). Stress induced by increased temperature experienced by MPs will certainly impact the dynamics of host/pathogen interactions and ultimately result in changes in virulence (Lindner et al., 2010). Altered environmental conditions are causing many organisms to shift their biogeographic distribution ranges (MacDonald et al., 2008), and the same may be occurring with microorganisms (Azevedo et al., 2012; Bebber et al., 2013). The study of increased environmental temperature on the behavior of phytopathogens is therefore of extreme relevance.

Fungi can establish commensal and pathogenic relationships with their hosts that can be altered by discrete environmental changes, inducing a commensal relationship to evolve into a pathogenic one (Bliska and Casadevall, 2009). Furthermore, endophytes and plant pathogens have extraordinarily similar methods of invasion, suggesting a similarity of attributes related to the adaptation of a fungus to its host (van der Does and Rep, 2007; Hube, 2009; Blauth de Lima et al., 2016). A number of fungal molecules, like cell wall degrading enzymes (CWDEs), inhibitory proteins and enzymes involved in toxin synthesis, are known to contribute to fungal pathogenicity and virulence (King et al., 2011; Gonzalez-Fernandez and Jorrin-Novo, 2012). Proteomics is a powerful tool to identify unknown mechanisms underlying responses to environmental alterations (Lemos et al., 2010; Alves et al., 2015). In this context, the analysis of the extracellular proteome, the secretome, makes it possible to identify which proteins are involved in the interaction with the host and to relate them to fitness and/or pathogenicity mechanisms (Bregar et al., 2012; Gonzalez-Fernandez and Jorrin-Novo, 2012; Gonzalez-Fernandez et al., 2015). Secretome analysis has been successfully performed for phytopathogenic fungi, such as Botrytis cinerea (Zhang et al., 2014), Diplodia corticola (Fernandes et al., 2014) or Verticillium albo-atrum (Mandelc and Javornik, 2015).

Lasiodiplodia theobromae (Pat.) Griff. & Maubl. is a phytopathogenic fungus typical of the tropics and subtropics (Alves et al., 2008; Phillips et al., 2013). Despite being able to grow between 9 and 39°C, its optimal growth temperature is 27–33°C (D'souza and Ramesh, 2002). Although widely distributed, it is mostly confined to latitudes between 40° North and 40° South of the equator. Although L. theobromae has the ability to colonize healthy tissues without causing any harm (Jami et al., 2013), disease may appear if the plant is under stress. Therefore, it has been considered a latent pathogen, capable of inducing endophytic infections (Jami et al., 2013). It has been associated with approximately 500 hosts, mostly woody plants, such as Eucalyptus spp., and with different fruit crops, like grapevines (Phillips et al., 2013; Rodríguez-Gálvez et al., 2015). Lasiodiplodia theobromae has also been associated with a number of cases of human infection, behaving as an opportunist (Summerbell et al., 2004; Kindo et al., 2010; Saha et al., 2012a,b). The most common cases are ocular infections, but human death has been reported (Woo et al., 2008).

In this study, the effect of temperature on two strains of L. theobromae was investigated: an environmental strain (CAA019) and a clinical strain (CBS339.90). The L. theobromae metabolome has been widely studied, but the enzymes, and other proteins, expressed by this organism have never been investigated. Therefore, the effect of temperature on the production of extracellular enzymes, on the secretome and on the cytotoxicity of the secretome was evaluated.

### MATERIALS AND METHODS

### Microorganisms

The strains used in this study were: CAA019, isolated from Cocos nucifera L. in Brazil, and CBS339.90, isolated from a phaeohyphomycotic cyst of a patient from Jamaica (Alves et al., 2008). CBS339.90 was obtained from the Centraalbureau voor Schimmelcultures (CBS) Fungal Biodiversity Centre. Cultures were maintained on PDA (19.5 g.L<sup>−1</sup>; Potato Dextrose Agar; Difco).

### Radial Growth

Fungal growth was evaluated based on the development of the mycelium in solid media (PDA, Czapek, Oat Meal Agar and Corn Meal Agar). The plates were inoculated with a 7 mm-diameter agar plug from an actively growing fungal culture in PDA at 1 cm from the border of the plate and incubated at 25, 30, and 37°C. After 48 h, the colony radius was measured. Assays were carried out in triplicate and data are presented as average ± standard error.
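As an illustration of how triplicate measurements can be summarized as average ± standard error, the following minimal sketch uses only the Python standard library. The radius values are hypothetical and are not data from this study; the standard error is assumed to be the sample standard deviation divided by the square root of the number of replicates.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se


# Hypothetical colony radii (cm) for one strain/medium/temperature combination.
radii_cm = [3.0, 3.5, 4.0]
mean_radius, se_radius = mean_and_se(radii_cm)
print(f"{mean_radius:.2f} ± {se_radius:.2f} cm")
```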

### Biomass

Two plugs of 7 mm-diameter from an actively growing culture on PDA were inoculated on 50 mL of Potato Dextrose Broth (PDB) medium and incubated at 25, 30, or 37°C. After 24, 48, 72, 96, 120, 168, and 360 h (1–15 days), the mycelium was separated from the culture medium by filtration (filter paper). The mycelium was dried at 50°C for 48 h and the dry weight determined.

### Extracellular Enzymes

The different agar media plates were inoculated with a 7 mm-diameter agar plug from an actively growing culture and incubated at 25, 30, and 37°C for 48 h, unless otherwise stated. All assays were carried out in triplicate and data are presented as average ± standard error.

### Detection and Quantification of Enzymatic Activity

The presence of caseinases, cellulases, amylases, xylanases, pectinases, ureases, and laccases was detected as described earlier (Esteves et al., 2014). Briefly, the various substrates [1% (w/v) skimmed milk, 0.5% (w/v) carboxymethylcellulose, 0.2% (w/v) starch, 0.5% (w/v) xylan, 0.5% (w/v) pectin, 2% (w/v) urea, and 1% (w/v) tannic acid, respectively] were independently added to a solution of 0.5% (w/v) malt extract and 1.5% (w/v) agar. The activities were detected by the formation of a halo around the mycelium (caseinases, ureases, and laccases) or after the addition of Lugol solution (amylases), Congo Red (cellulases and xylanases) or cetyltrimethyl ammonium bromide (pectinases).

Gelatinases were detected using a gelatin medium [1% (w/v) gelatin, 0.5% (w/v) malt extract, 1.5% (w/v) agar]. The plates were inoculated and the degradation of gelatin was detected as a clear halo around colonies, against an opaque background.

The activity was determined as a percentage of the maximum halo (cm) measured for each activity assayed.
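The normalization described above can be sketched as follows. This is only an illustration: the halo values are hypothetical, and the maximum is assumed to be taken over all strain/temperature combinations tested for a given enzymatic activity, which the text does not state explicitly.

```python
def relative_activity(halos_cm):
    """Express each halo as a percentage of the largest halo for one activity.

    halos_cm maps a (strain, temperature) key to a halo size in cm.
    """
    max_halo = max(halos_cm.values())
    if max_halo == 0:
        # No activity detected in any condition.
        return {key: 0.0 for key in halos_cm}
    return {key: 100.0 * value / max_halo for key, value in halos_cm.items()}


# Hypothetical halo sizes (cm) for a single activity, not data from this study.
halos = {
    ("CAA019", 25): 1.0,
    ("CAA019", 37): 0.5,
    ("CBS339.90", 25): 2.0,
    ("CBS339.90", 37): 1.5,
}
for key, percent in relative_activity(halos).items():
    print(key, f"{percent:.0f}%")
```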

### Characterisation of Extracellular Enzymes by Zymography

Strains were grown as follows: two plugs of 7 mm-diameter from an actively growing culture on Potato Dextrose Agar (PDA) were inoculated on 50 mL of Potato Dextrose Broth (PDB) medium and incubated at the appropriate temperature for 28 days. Aliquots were taken every 48 or 72 h and stored at −80°C until analysis. The mycelium was separated from the culture medium by filtration (filter paper).

The characterisation of extracellular enzymes was assessed by zymography (Esteves et al., 2014). Extracellular media were diluted in sample buffer [2:1 (v/v); 62.5 mM Tris, pH 6.8, 10% SDS (w/v) and 20% glycerol (v/v)] and incubated at room temperature for 10 min. Proteins were then separated in lab-cast gels (10% polyacrylamide with the appropriate substrate) in a Mini-PROTEAN 3 (Bio-Rad) according to Laemmli (1970). Electrophoresis proceeded at 120 V for 120 min at 4°C. After electrophoresis, the gel was washed twice with 0.25% Triton® X-100 (v/v) for 60 min to remove SDS.

After staining, gels were scanned on a GS-800 Calibrated Densitometer (Bio-Rad) and analyzed. Quantity One v. 4.6.9 (Bio-Rad) was used to estimate the molecular mass of proteins and their optical densities. The apparent molecular weight (MW) of the proteins was determined using a MW calibration kit as marker, consisting of a mixture of proteins of 250, 150, 100, 75, 50, 37, 25, 20, 15, and 10 kDa (Precision Plus Protein Standard, Bio-Rad). Only gels where activity was detected are shown.
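A common way to estimate the apparent MW of a band from a marker lane is a linear fit of log10(MW) against relative migration (Rf). The sketch below illustrates that general approach with NumPy; it is not the algorithm used by Quantity One, and the Rf values are synthetic (constructed to be exactly log-linear), so only the marker masses come from the text.

```python
import numpy as np

# Marker masses (kDa) listed in the text for the protein standard.
MARKER_KDA = np.array([250, 150, 100, 75, 50, 37, 25, 20, 15, 10], dtype=float)


def fit_calibration(marker_rf, marker_kda):
    """Least-squares line through (Rf, log10 MW) for the marker bands."""
    slope, intercept = np.polyfit(marker_rf, np.log10(marker_kda), deg=1)
    return slope, intercept


def estimate_mw(rf, slope, intercept):
    """Apparent MW (kDa) of a band from its relative migration Rf."""
    return float(10 ** (slope * rf + intercept))


# Synthetic Rf values (hypothetical, exactly log-linear, for illustration only).
log_mw = np.log10(MARKER_KDA)
marker_rf = 0.05 + 0.90 * (log_mw.max() - log_mw) / (log_mw.max() - log_mw.min())

slope, intercept = fit_calibration(marker_rf, MARKER_KDA)
# A band that co-migrates with the 50 kDa marker.
print(round(estimate_mw(marker_rf[4], slope, intercept), 1))
```

On real gels the relationship is only approximately linear, so estimates near the top or bottom of the gel (such as the ≈200 kDa and ≈4.2 kDa bands discussed in the Results) are extrapolations and should be read with caution.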

### **Xylanases**

Xylanolytic activity was characterized by zymography, as described previously (Peterson et al., 2011) with slight modifications. One percent xylan was incorporated in the gel. After electrophoresis, the gel was incubated overnight at 25°C in 0.05 M Tris-HCl, pH 5.0, and stained with Congo Red solution (1%) for 10 min. The gel was then rinsed with a solution of 1 M NaCl. Enzymes with xylanolytic activity were detected as clear bands against a red background of non-degraded substrate.

### **Cellulases**

Cellulolytic activity was assessed by zymography, as described previously (Peterson et al., 2011) with slight modifications. Carboxymethylcellulose (1%) was incorporated in the gel. After electrophoresis, the gel was incubated overnight at 25°C in 0.05 M Tris-HCl, pH 5, and stained with Congo Red solution (1%) for 10 min. The gel was then rinsed with 1 M NaCl. Enzymes with cellulolytic activity were detected as clear bands against a red background of non-degraded substrate.

### **Amylases**

Amylolytic activity was assessed by zymography using 1% (w/v) starch, as described previously (Peterson et al., 2011) with slight modifications. After electrophoresis, the gel was incubated for 3 h at 40°C in 0.05 M Tris-HCl, pH 5, and stained with 1 mL Lugol's iodine stock solution [0.05 g I<sub>2</sub>, 0.1 g.mL<sup>−1</sup> KI (potassium iodide)] in 50 mL distilled water. The gel was then rinsed with distilled water. Enzymes with amylolytic activity were detected as clear bands against a dark background of non-degraded substrate.

### **Proteases**

Gelatinolytic and caseinolytic activities were assessed by zymography, using 1% (w/v) gelatine or 1% (w/v) casein, as described previously (Duarte et al., 2009; Esteves et al., 2014), with slight modifications. After electrophoresis, the gel was incubated overnight, at room temperature, in 1.5 mM Tris, pH 8.8, 1 M NaCl, 1 M CaCl<sub>2</sub>, 2 mM ZnCl<sub>2</sub>, pH 7.4, stained with Coomassie Brilliant Blue R-250 [in 50% ethanol (v/v), 10% acetic acid (v/v)] and destained with 25% ethanol (v/v), 5% acetic acid (v/v). Enzymes with gelatinolytic/caseinolytic activity were detected as clear bands against a blue background of non-degraded substrate.

### Protein Quantification

Protein quantification was performed using the BCA Protein Assay Kit (Pierce™, Rockford, IL, USA), according to the manufacturer's instructions. All the samples were quantified in triplicate.

### Secretome Analysis

Two 7 mm-diameter mycelial plugs were used to inoculate the fungi into 50 mL of PDB medium. Cultures were grown for 72 h at 25, 30, and 37°C, in 250 mL Erlenmeyer flasks.

Extracellular medium of each strain was diluted (1:1) in loading buffer [2% (v/v) 2-mercaptoethanol, 2% (w/v) SDS, 8 M Urea, 100 mM Tris, 100 mM Bicine and traces of Bromophenol blue] and analyzed by electrophoresis (Laemmli, 1970). Lab-cast SDS-PAGE gels were run at 120 V for 2 h on 15% (w/v) acrylamide running gels. The running buffer contained 100 mM Tris, 100 mM Bicine and 0.1% (w/v) SDS. The samples were denatured at 100°C for 5 min prior to electrophoresis. Gel staining, image acquisition and analysis were as described before (Santos et al., 2013; Costa et al., 2014; Alves et al., 2015). All visible bands were manually excised and proteins were identified by Orbitrap LC mass spectrometry.

A Permutational Multivariate Analysis of Variance (PERMANOVA) was performed using the R package 'vegan' and the RStudio v 0.98.1103 interface (Oksanen et al., 2015; R Core Team, 2015) to understand which factor (strain or temperature), if any, has the main effect on the protein profile of L. theobromae.
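To make the statistic concrete, the following minimal sketch computes a one-factor PERMANOVA (pseudo-F, R<sup>2</sup> and a permutation p-value) on a presence/absence matrix with NumPy. It is not the 'vegan' implementation used in the study, which can fit several factors at once. The Jaccard dissimilarity is an assumption (the distance measure is not stated in the text), and the presence/absence matrix is hypothetical.

```python
import numpy as np


def jaccard_distance_matrix(presence):
    """Pairwise Jaccard dissimilarity between rows of a 0/1 matrix."""
    x = np.asarray(presence, dtype=bool)
    n = x.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = np.logical_or(x[i], x[j]).sum()
            shared = np.logical_and(x[i], x[j]).sum()
            dist[i, j] = dist[j, i] = 0.0 if union == 0 else 1.0 - shared / union
    return dist


def permanova(dist, groups, n_perm=999, seed=0):
    """One-factor PERMANOVA: returns (pseudo-F, R2, permutation p-value)."""
    groups = np.asarray(groups)
    n = len(groups)
    d2 = dist ** 2
    # Total sum of squares from all pairwise squared distances.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n

    def stats(labels):
        levels = np.unique(labels)
        ss_within = 0.0
        for level in levels:
            idx = np.where(labels == level)[0]
            sub = d2[np.ix_(idx, idx)]
            ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
        ss_among = ss_total - ss_within
        a = len(levels)
        f = (ss_among / (a - 1)) / (ss_within / (n - a))
        return f, ss_among / ss_total

    f_obs, r2 = stats(groups)
    rng = np.random.default_rng(seed)
    # Count permutations whose pseudo-F is at least as large as the observed one.
    hits = sum(stats(rng.permutation(groups))[0] >= f_obs for _ in range(n_perm))
    return f_obs, r2, (hits + 1) / (n_perm + 1)


# Hypothetical presence/absence matrix: rows are samples, columns are proteins.
profiles = [
    [1, 1, 1, 1, 0, 0, 0, 0],  # CAA019, 25 °C
    [1, 1, 1, 0, 0, 0, 0, 1],  # CAA019, 30 °C
    [1, 1, 0, 1, 0, 0, 0, 0],  # CAA019, 37 °C
    [0, 0, 0, 1, 1, 1, 1, 1],  # CBS339.90, 25 °C
    [0, 0, 0, 0, 1, 1, 1, 1],  # CBS339.90, 30 °C
    [0, 0, 0, 0, 1, 1, 0, 1],  # CBS339.90, 37 °C
]
strain = ["CAA019"] * 3 + ["CBS339.90"] * 3

dist = jaccard_distance_matrix(profiles)
f_obs, r2, p_value = permanova(dist, strain)
print(f"pseudo-F = {f_obs:.2f}, R2 = {r2:.3f}, p = {p_value:.3f}")
```

Here R<sup>2</sup> is the fraction of the total sum of squares attributed to the grouping factor, which is how a value such as "44% of the variance explained by strain" is read. With only six samples the number of distinct permutations is small, so the attainable p-values are coarse.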

### Tryptic Digestion, Mass Spectrometry Analysis, and Protein Identification

Tryptic digestion was performed according to Carvalhais et al. (2015), with a few modifications. Protein bands were manually excised from the gel and transferred to Eppendorf tubes. Replicate bands were excised and also identified. The gel pieces were washed three times with 25 mM ammonium bicarbonate/50% acetonitrile (ACN, VWR Chemicals) and once with ACN. Cysteine residues were reduced with 6.5 mM DTT and alkylated with 54 mM iodoacetamide. Gel pieces were dried in a SpeedVac (Thermo Savant) and rehydrated in digestion buffer containing 12.5 µg.mL<sup>−1</sup> sequence grade modified porcine trypsin (Promega) in 25 mM ammonium bicarbonate. After 90 min, the supernatant was removed and discarded, 100 µL of 25 mM ammonium bicarbonate were added and the samples were incubated overnight at 37°C. Extraction of tryptic peptides was performed by the addition of 10% formic acid (FA, Fluka)/50% ACN three times and finally with ACN. Tryptic peptides were lyophilized in a SpeedVac (Thermo Savant) and resuspended in 5% ACN/0.1% FA solution. The samples were analyzed with a QExactive Orbitrap (Thermo Fisher Scientific, Bremen) that was coupled to an Ultimate 3000 (Dionex, Sunnyvale, CA, USA) HPLC (high-pressure liquid chromatography) system. Prior to sample analysis, a complex mixture of peptides was obtained from the reduction, alkylation and tryptic digestion of six proteins (Sciex iTRAQ standard mixture), namely bovine serum albumin (P02769), Escherichia coli β-galactosidase (P00722), bovine α-lactalbumin (P00711), bovine β-lactoglobulin (P02754), chicken lysozyme C (P00698) and human serotransferrin (P02787). This peptide mixture was routinely used to test the nanoLC-MS/MS system performance, showing a protein identification coverage between 70 and 80% for a 100 ng injection.

The trap (5 mm × 300 µm I.D.) and analytical (150 mm × 75 µm I.D.) columns used were C18 Pepmap100 (Dionex, LC Packings), the latter having a particle size of 3 µm. Peptides were trapped at 30 µL.min<sup>−1</sup> in 95% solvent A (0.1% FA/5% ACN v/v). Elution was achieved with solvent B (0.1% formic acid/100% acetonitrile v/v) at 300 nL.min<sup>−1</sup>. The 50 min gradient used was as follows: 0–3 min, 95% solvent A; 3–35 min, 5–45% solvent B; 35–38 min, 45–80% solvent B; 38–39 min, 80% solvent B; 39–40 min, 20–95% solvent A; 40–50 min, 95% solvent A. Nanospray was achieved using an uncoated fused silica emitter (New Objective, Cambridge, MA, USA; o.d. 360 µm; i.d. 50 µm, tip i.d. 15 µm) biased to 1.8 kV. The mass spectrometer was operated in the data dependent acquisition mode. A MS2 method was used with a FT survey scan from 375 to 1600 m/z (resolution 35,000; AGC target 3E6). The 10 most intense peaks were subjected to HCD fragmentation (resolution 17,500; AGC target 5E4, NCE 25%, max. injection time 120 ms, dynamic exclusion 35 s). Spectra were processed and analyzed using Proteome Discoverer (version 2.0, Thermo), with the MS Amanda search engine (version 2.1.4.3751, University of Applied Sciences Upper Austria, Research Institute of Molecular Pathology). The Uniprot (TrEMBL and Swiss-Prot) protein sequence database (version of May 2016) was used for all searches under Macrophomina phaseolina, Neofusicoccum parvum, Botryosphaeria dothidea, and L. theobromae. Database search parameters were as follows: carbamidomethylation and carboxymethylation of cysteine and oxidation of methionine as variable modifications, and the allowance for up to two missed tryptic cleavages. The peptide mass tolerance was 10 ppm and the fragment ion mass tolerance was 0.05 Da. To achieve a 1% false discovery rate, the Percolator (version 2.0, Thermo) node was implemented for a decoy database search strategy and peptides were filtered for high confidence and a minimum length of six amino acids, and proteins were filtered for a minimum number of peptide sequences of 2 and only rank 1 peptides.
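The gradient program above can be written as a piecewise-linear function of time. The sketch below is only an illustration using NumPy: the breakpoints are transcribed from the text, the percentage of solvent B is taken as 100 minus the percentage of solvent A wherever the text gives solvent A, and linear ramps between breakpoints are assumed.

```python
import numpy as np

# Breakpoints (min) and %B at each breakpoint, transcribed from the gradient
# described in the text. Where the text gives %A, %B is assumed to be 100 - %A.
TIME_MIN = [0, 3, 35, 38, 39, 40, 50]
PERCENT_B = [5, 5, 45, 80, 80, 5, 5]


def percent_b(t_min):
    """Percentage of solvent B at time t_min, assuming linear ramps."""
    return float(np.interp(t_min, TIME_MIN, PERCENT_B))


for t in (0, 19, 36.5, 38.5, 45):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```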

The subcellular localization of the identified proteins was deduced using Bacello (Pierleoni et al., 2006), as described before for Botryosphaeriaceae fungi (Fernandes et al., 2014) and function was obtained from Uniprot records.

### Cytotoxicity Assay

In vitro cytotoxicity evaluation was performed as described earlier (Cruz et al., 2013; Duarte et al., 2015) with slight modifications. Each strain was grown in PDB medium at 25, 30, and 37°C for 72, 96, and 120 h. The supernatants were filtered (0.20 µm pore size filter, Orange Scientific) and used to assess cytotoxicity. A Vero cell line (ECACC 88020401, African Green Monkey Kidney cells, GMK clone) was grown and maintained according to Ammerman et al. (2009). The microtiter plates were incubated at 37°C in 5% CO<sub>2</sub> for 24 h. After cell treatment, the medium was removed by aspiration and 50 µL of DMEM with 10% resazurin (0.1 mg.mL<sup>−1</sup> in PBS) was directly added to each well. The microtiter plates were incubated at 37°C in 5% CO<sub>2</sub> until reduction of resazurin (Al-Nasiry et al., 2007). The absorbance was read at 570 and 600 nm in a microtiter plate spectrophotometer (Thermo Scientific, Multiskan Spectrum).
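The text does not state how the two absorbance readings were converted into a viability value. One widely used convention for resazurin assays corrects the 570 nm signal for the oxidized dye using the 600 nm reading and expresses treated wells relative to untreated control wells. The sketch below follows that convention as an assumption; the extinction coefficients are those commonly quoted in manufacturer protocols and should be verified against the reagent datasheet, and the absorbance values are hypothetical.

```python
# Molar extinction coefficients of oxidized resazurin commonly quoted in
# manufacturer protocols (assumed here; verify against the reagent datasheet).
EPS_OX_570 = 80586.0
EPS_OX_600 = 117216.0


def reduction_signal(a570, a600):
    """Absorbance at 570 nm corrected for oxidized dye via the 600 nm reading."""
    return EPS_OX_600 * a570 - EPS_OX_570 * a600


def percent_of_control(a570_test, a600_test, a570_ctrl, a600_ctrl):
    """Resazurin reduction in treated wells as a percentage of control wells."""
    return 100.0 * reduction_signal(a570_test, a600_test) / reduction_signal(
        a570_ctrl, a600_ctrl
    )


# Hypothetical absorbance readings, not data from this study.
viability = percent_of_control(0.50, 0.35, 0.80, 0.30)
print(f"{viability:.1f}% of control")
```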

### RESULTS AND DISCUSSION

### Radial Growth and Biomass

Both strains were unable to grow at 5 and at 40°C and showed maximum radial growth at 30°C on PDA (considered the best growth conditions for these strains from this point forward). Czapek medium was the least suitable for the growth of L. theobromae (**Supplementary Figure S1**).

Lasiodiplodia theobromae biomass was determined (growth in liquid media; **Figure 1**). Both strains exhibit a biomass increase until 4 or 5 days of incubation; after this period, both strains start to degenerate with the consequent loss of biomass. The maximum biomass was obtained at 25°C. The biomass growth profiles at 25, 30, and 37°C were significantly different (**Figure 1**, two-way ANOVA, p < 0.001). Strain CBS339.90 exhibited a growth pattern similar to that of CAA019, with higher growth rates at 25 and 30°C, although its biomass values were significantly higher (two-way ANOVA, p < 0.001) than those of CAA019.

### Extracellular Enzymatic Activity

The extracellular enzymatic activity of L. theobromae was detected and quantified by plate assay and by zymography. Several extracellular enzymatic activities were tested at 25, 30, and 37°C by plate assay. The activities assayed are involved in the degradation of plant cell walls (as is the case of cellulases, xylanases, laccases, and pectinases), in the degradation of plant defenses (proteases) and in animal pathogenesis (gelatinases and ureases). Positive activities were evaluated by zymography.

Due to the nature of the hosts – C. nucifera for CAA019 and a human patient for CBS339.90 – the zymographies were performed at 25 and 37°C, to investigate the effect that human body temperature could have on the strains. Both strains were able to secrete all enzymes assayed (**Figure 2**). Nonetheless, CBS339.90 displayed higher enzymatic activity than strain CAA019 in most conditions tested (**Figure 2**). Only one exception was detected: at 37°C CAA019 had a higher cellulolytic activity than CBS339.90 grown at the same temperature (p < 0.001; **Figure 2D**). Zymography analysis confirms the data obtained by plate assay. One exception was the cellulolytic activity of L. theobromae, which was higher at 25°C when analyzed by zymography, but not by plate assay. The difference observed could be related to the short culture time of L. theobromae in the plate assays.

As seen in **Figures 2** and **3**, CAA019 and CBS339.90 express extracellular proteolytic, amylolytic, cellulolytic and xylanolytic enzymes. Most enzymes have very high (≈200 kDa) or very low (≈4.2 kDa) apparent MWs. These MWs could correspond to aggregation or multimeric forms of these enzymes or to degraded peptides with enzyme activity, rather than the MW of the discrete enzymes. In fact, the MWs of the enzymes identified by mass spectrometry (**Table 1**) are in the range of 26.9–58.7 kDa. We have observed a strain-, temperature-, and time-dependent expression pattern of many of these enzymes, particularly cellulases (**Figure 3**, M/N and O/P).

Fungal pathogens have a detrimental impact on plant production and the strategies they use to infect their hosts should be investigated to predict their behavior and protect the plants from fungal infections (Gonzalez-Fernandez and Jorrin-Novo, 2012; Fernandes et al., 2014). Fungal pathogenicity involves the synthesis of molecules, such as CWDEs, inhibitory proteins and toxins, that have been described as being involved in the infection mechanisms of fungi (Kikot et al., 2009). We expect that a plant-adapted pathogen will secrete enzymes able to degrade plant-specific substrates (as is the case of CWDEs), while an animal-adapted strain will secrete a lower number (or exhibit a lower activity) of these enzymes. On the other hand, we expect that a strain adapted to an animal environment will express a higher number of enzymes able to interact with mammalian-specific substrates.

Cellulose, hemicellulose, and lignin are the major components of the plant cell wall, cellulose being the most abundant (Gibson et al., 2011). The ability of fungi to produce CWDEs facilitates fungal penetration (Gibson et al., 2011). There is some evidence that plant pathogens may produce different amounts of specific CWDEs depending on the plant host (van der Does and Rep, 2007; King et al., 2011). In this context it is curious that both strains have such distinct cellulolytic activity profiles. It is plausible that strain CBS339.90 may have developed some type of adaptation to colonize human hosts. For example, the high protein-content matrix (without complex carbohydrates like cellulose) may have contributed to the high secretion of proteases by CBS339.90.

### Secretome Analysis

The secretome of both strains of L. theobromae grown at 25, 30, and 37°C (**Figure 4**) was analyzed by SDS-PAGE/LC/MS/MS. Approximately 77% of the selected proteins were identified (10 proteins were identified for strain CAA019 and 11 for strain CBS339.90); most of the proteins are extracellular enzymes (87.5%) and only 12.5% are extracellular proteins with non-enzymatic functions (**Table 1** and **Supplementary Table S1**).

Due to the lack of a genome sequence for L. theobromae, all proteins were identified based on their homology with the proteins of M. phaseolina MS6 and N. parvum UCRNP2, two pathogenic members of the family Botryosphaeriaceae whose genomes are sequenced and integrated into UniProtKB (Islam et al., 2012).

The strains showed different protein profiles, which were affected by growth temperature, suggesting different interactions with the environment. For each strain some unique proteins were found.

For CAA019 we identified four strain-specific proteins – a glucose-methanol-choline oxidoreductase (K2RRJ6), a putative choline dehydrogenase protein (R1E7Q5), a β-galactosidase (K2SSA3) and a putative f5/8 type c domain protein (R1GH64). Of these, the three enzymes are known to be involved in cellulose degradation (Yoshida et al., 2002; Van den Brink and de Vries, 2011), which is expected, since this is a phytopathogenic strain. However, the putative f5/8 type c domain protein, found only at 30°C, is a coagulation factor. The coagulation factor expressed by CAA019 possesses a functional domain that promotes binding to anionic phospholipids (Hunter et al., 2012), known to be involved in pathogenesis for some species of fungi (Chaffin et al., 1998).

For CBS339.90, four strain-specific proteins were also identified – a putative tripeptidyl-peptidase 1 protein (R1GTC8), a phosphoesterase (K2RUW5), a putative glucan endo-α-glucosidase agn1 protein (R1GU94) and a glycoside hydrolase family 71 (K2R498). It is important to highlight that the putative tripeptidyl-peptidase 1 protein was found only at 25°C. This protease has a serine-type endopeptidase activity (Hunter et al., 2012). In Aspergillus fumigatus, it is part of a set of proteases (sedolisins) that have the ability to degrade proteins at acidic pH values. This allows the generation of assimilable nitrogen in decomposing organic matter and composts (Reichard et al., 2006). It is also responsible for acidifying the culture supernatant of this species in vitro, which can be related to the acidification of its microenvironment in the living host to facilitate nutrition and proliferation of the hyphae during the infection process (Reichard et al., 2006). The glycoside hydrolases are typically produced by phytopathogenic fungi to degrade cellulose and xylans of the plant cell wall and penetrate into the host tissue (Murphy et al., 2011). These enzymes act by hydrolyzing the glycosidic bonds between two or more carbohydrates or between a carbohydrate and a non-carbohydrate moiety. The GH family 71 was expressed only by CBS339.90 and comprises the α-1,3-glucanases (Hunter et al., 2012).

Other GH families were found in both strains, such as GH family 10, which includes xylanases and cellobiohydrolases, and GH family 17, which comprises enzymes such as endo-1,3-β-glucosidases, lichenases and exo-1,3-glucanases (Hunter et al., 2012). Glucoamylases, whose function is also related to the degradation of the plant cell wall by hydrolyzing 1,4-α-glucose (Hunter et al., 2012; Kubicek et al., 2014), are also expressed by both strains.

Several aspartic proteases are expressed by L. theobromae. Aspartic proteases from the family A1 and a putative a chain endothiapepsin are expressed by both strains. Besides their involvement in physiological cellular functions, aspartic proteases play a crucial role as virulence factors and in dissemination and host evasion (Rawlings and Bateman, 2009). These enzymes have been related to human pathogenesis (Monod et al., 2002; Yike, 2011), which is consistent with the fact that we identified these enzymes at 37°C. In these cases, aspartic proteases are probably involved in several processes such as the degradation of the extracellular matrix (mainly composed of collagens and other proteins; Duarte et al., 2007, 2016), leading to the progression of the pathogen.

Thus, some of the proteins seem to be involved in plant pathogenesis processes, as is the case of glycoside hydrolases (Murphy et al., 2011), but also in animal pathogenesis processes, as is the case of proteases or aspartic proteases. The plate assay confirmed the presence of these enzymes for both isolates, and the zymography analysis confirmed the presence of multiple endoglucanases, xylanases, and proteases (**Figure 3**).

Overall, both strains showed a similar trend in the number of detected bands, which decreased with increasing temperature. However, strain CBS339.90 presented more detected bands than strain CAA019 (**Figure 4**; **Table 2**).

The subcellular localization was deduced with Bacello (Pierleoni et al., 2006): all proteins were inferred to be extracellular. Only proteins with two unique peptides identified were included in the table. The number of peptides (3) is the number of peptides identified per band.

A Permutational Multivariate Analysis of Variance was performed using the R package 'vegan' and the RStudio v 0.98.1103 interface (Oksanen et al., 2015; R Core Team, 2015). For the identified proteins (computed through a presence-or-absence matrix), about 45% of the variance is explained by the strain factor (R<sup>2</sup> = 0.4467). In contrast, the differences found for fungal growth temperature were not statistically significant, suggesting that the strain has a large influence on the protein profile in this study (p-value < 0.01), which could be related to host adaptation of these strains.
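As a rough illustration of how such a test operates on a presence-or-absence matrix, the sketch below implements a one-way PERMANOVA in Python with NumPy, rather than with the 'vegan' R package used in this study. The toy matrix, the choice of Jaccard distance, the function names and the number of permutations are all assumptions made for the example; they are not the data or settings of the study.

```python
import numpy as np


def jaccard_distances(presence):
    """Pairwise Jaccard distances between the rows of a 0/1 matrix."""
    x = np.asarray(presence, dtype=bool)
    inter = (x[:, None, :] & x[None, :, :]).sum(axis=2)
    union = (x[:, None, :] | x[None, :, :]).sum(axis=2)
    # Two samples with no proteins at all are treated as identical.
    with np.errstate(invalid="ignore", divide="ignore"):
        similarity = np.where(union > 0, inter / union, 1.0)
    return 1.0 - similarity


def permanova(dist, groups, n_perm=999, seed=0):
    """One-way PERMANOVA on a distance matrix: returns (R2, pseudo-F, p)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    groups = np.asarray(groups)
    n = len(groups)
    labels = np.unique(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n

    def ss_within(g):
        total = 0.0
        for label in labels:
            idx = np.flatnonzero(g == label)
            sub = d2[np.ix_(idx, idx)]
            total += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
        return total

    def pseudo_f(g):
        ssw = ss_within(g)
        ssa = ss_total - ssw
        return (ssa / (len(labels) - 1)) / (ssw / (n - len(labels)))

    f_obs = pseudo_f(groups)
    r2 = 1.0 - ss_within(groups) / ss_total
    # Null distribution: shuffle the group labels and recompute pseudo-F.
    rng = np.random.default_rng(seed)
    hits = sum(pseudo_f(rng.permutation(groups)) >= f_obs for _ in range(n_perm))
    return r2, f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: rows are samples (strain x temperature),
# columns are proteins, 1 = detected, 0 = not detected.
presence = np.array([
    [1, 1, 1, 1, 0, 0, 0, 1],  # strain A, 25 degrees C
    [1, 1, 1, 1, 0, 0, 1, 0],  # strain A, 30 degrees C
    [1, 1, 1, 0, 0, 0, 0, 0],  # strain A, 37 degrees C
    [1, 1, 0, 0, 1, 1, 1, 0],  # strain B, 25 degrees C
    [1, 1, 0, 0, 1, 1, 0, 0],  # strain B, 30 degrees C
    [1, 1, 0, 0, 1, 1, 0, 1],  # strain B, 37 degrees C
])
strain = np.array(["A", "A", "A", "B", "B", "B"])

dist = jaccard_distances(presence)
r2, f_obs, p_value = permanova(dist, strain)
print(f"R2 = {r2:.3f}, pseudo-F = {f_obs:.2f}, p = {p_value:.3f}")
```

With only three samples per group, few distinct relabelings exist, so the permutation p-value in this toy case cannot become very small. A real analysis would rely on replicated samples and an established implementation.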

### Cytotoxicity

We investigated the influence of temperature on the cytotoxic potential of L. theobromae strains CAA019 and CBS339.90 to the Vero cell line. Both strains were cytotoxic under all conditions tested, with the exception of the environmental strain (CAA019) grown at 37°C (**Figure 5**). Interestingly, temperature had opposite effects on the two strains: the cytotoxic effect of CAA019 was detected mostly when cultured at 25°C, decreasing at 30°C and being absent at 37°C (**Figure 5A**). The cytotoxic effect of CBS339.90 (**Figure 5B**) increased with temperature (p < 0.001), reaching 90% of cell mortality when grown at 37°C. Interestingly, CBS339.90 also showed higher growth rates (**Figure 1**) and extracellular enzymatic activity (**Figure 2**) at 37°C when compared with CAA019. Proteins typically related to infection mechanisms were found in the secretome of CBS339.90 (aspartic proteases), suggesting that they may be involved in the high level of cell mortality found for this strain.

TABLE 2 | Summary of the proteins identified in L. theobromae strains CAA019 and CBS339.90 at 25, 30, and 37°C.

The symbols represent the presence (+) or the absence (−) of the proteins.

Host specificity in plant-pathogenic fungi impacts their host range. Although it has been suggested that for most Botryosphaeriaceae species there is no host specificity (Esteves et al., 2014), the differences between the secretome profiles of both strains at 25 and 37°C may be related to adaptation to specific host conditions. While strain CAA019 was isolated from a coconut tree, CBS339.90 was isolated from a human, at approximately 37°C. Since the optimal growth of this species is between 27 and 33°C, we can argue that the ability to infect humans may be the result of an adaptation to an increased temperature. This agrees with the fact that only strain CBS339.90 was cytotoxic to Vero cells at 37°C and that it produces more biomass and extracellular enzymes at this temperature than CAA019.

### CONCLUSION

We showed that temperature modulates the extracellular protein production of strains of L. theobromae found in different ecological niches: a tropical fruit tree and a hospitalized patient. The temperature growth range was wide, between 15 and 37°C, a feature common among species in the family Botryosphaeriaceae. The extracellular enzymatic activity also varied with fungal growth temperature.

Both strains were able to induce cytotoxic effects in mammalian cells. However, CBS339.90 is more cytotoxic than CAA019, especially at 37°C, where cell mortality reached 90%.

The ability to grow at 37°C and the secretion of hydrolytic enzymes, namely of aspartic proteases, at this temperature are typical characteristics of human pathogenic fungi that may be related to virulence (Karkowska-Kuleta et al., 2009). Our data suggest that the colonization of different hosts may lead to strain specificity.

## AUTHOR CONTRIBUTIONS

AA, AC, AE, AD, and PD conceived and designed the experiments. CF, AD, AG, and RV performed the experiments. AE, CF, AD, AG, and RV analyzed the data. AE, AD, and CF wrote the paper. AA, AC, AE, AD, and PD edited and approved the manuscript. All authors approved the final version of the manuscript.

### ACKNOWLEDGMENTS

Thanks are due to FCT/MEC, through national funds, for the financial support of CESAM (UID/AMB/50017 – POCI-01-0145-FEDER-007638), of QOPNA research Unit (UID/QUI/00062/2013), of iBiMED (UID/BIM/04501/2013), of UIDC (UID/IC/00051/2013) and RNEM (REDE/1504/REM/2005 that concerns the Portuguese Mass Spectrometry Network), and the co-funding by the FEDER, within the PT2020 Partnership Agreement and Compete 2020. This study was partially supported by FEDER funding through COMPETE program and by national funding through FCT within the research project ALIEN (PTDC/AGR-PRO/2183/2014 – POCI-01-0145-FEDER-016788). The authors acknowledge FCT financial support to AA (IF/00835/2013), to AE and to AD (BPD/102572/2014 and BPD/46290/2008) and to CF (BD/97613/2013).

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01096

FIGURE S1 | Effect of temperature and culture medium on the growth of Lasiodiplodia theobromae. Radial growth of L. theobromae, CAA019 (A) and CBS339.90 (B), grown at different temperatures (between 5 and 40°C) and in different culture media was determined after 48 h of incubation. Data are presented as average ± standard error. Two-way ANOVA, Bonferroni test, was used to determine the statistical significance compared to 30°C (∗p < 0.05 and ∗∗∗p < 0.001, for OMA; Up < 0.05 and UUUp < 0.001, for CMA; \$p < 0.05 and \$\$\$p < 0.001, for PDA; #p < 0.05 and ###p < 0.001, for Czapek).

TABLE S1 | List of identified proteins by Orbitrap LC-MS/MS using Proteome Discoverer 2.0 software. The table indicates the protein accession number, the information on protein group master (a master protein is the representative of a group of homologous proteins), protein name, the sequence coverage achieved in percentage, number of peptides identified (at least two peptides), number of peptide-spectrum matches (PSM), number of unique peptides (for a certain species), number of protein groups to which the protein belongs, molecular weight (MW) in kDa, the matched peptide sequences obtained by MS/MS (Sequence) and the protein score as given by MS Amanda algorithm, respectively.

### REFERENCES


with endoprotease or tripeptidyl-peptidase activity at acidic pHs. Appl. Environ. Microbiol. 72, 1739–1748. doi: 10.1128/AEM.72.3.1739-1748.2006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Félix, Duarte, Vitorino, Guerreiro, Domingues, Correia, Alves and Esteves. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# **The battle in the apoplast: further insights into the roles of proteases and their inhibitors in plant–pathogen interactions**

*Mansoor Karimi Jashni<sup>1,2</sup>, Rahim Mehrabi<sup>1,3</sup>, Jérôme Collemare<sup>1,4</sup>, Carl H. Mesarich<sup>1,5</sup> and Pierre J. G. M. de Wit<sup>1</sup>\**

*<sup>1</sup> Laboratory of Phytopathology, Wageningen University and Research Centre, Wageningen, Netherlands, <sup>2</sup> Department of Plant Pathology, Tarbiat Modares University, Tehran, Iran, <sup>3</sup> Cereal Research Department, Seed and Plant Improvement Institute, Karaj, Iran, <sup>4</sup> UMR1345, IRHS-INRA, Beaucouzé, France, <sup>5</sup> Bioprotection Technologies, The New Zealand Institute for Plant and Food Research Limited, Mount Albert Research Centre, Auckland, New Zealand*

#### *Edited by:*

*Maryam Rafiqi, Computomics, Germany*

#### *Reviewed by:*

*Shunyuan Xiao, University of Maryland, USA; David Jones, The Australian National University, Australia*

#### *\*Correspondence:*

*Pierre J. G. M. de Wit, Laboratory of Phytopathology, Wageningen University and Research Centre, Droevendaalsesteeg 9, Wageningen 6708 PB, Netherlands; pierre.dewit@wur.nl*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 07 May 2015; Accepted: 13 July 2015; Published: 03 August 2015*

#### *Citation:*

*Karimi Jashni M, Mehrabi R, Collemare J, Mesarich CH, and de Wit PJGM (2015) The battle in the apoplast: further insights into the roles of proteases and their inhibitors in plant–pathogen interactions. Front. Plant Sci. 6:584. doi: 10.3389/fpls.2015.00584*

Upon host penetration, fungal pathogens secrete a plethora of effectors to promote disease, including proteases that degrade plant antimicrobial proteins, and protease inhibitors (PIs) that inhibit plant proteases with antimicrobial activity. Conversely, plants secrete proteases and PIs to protect themselves against pathogens or to mediate recognition of pathogen proteases and PIs, which leads to induction of defense responses. Many examples of proteases and PIs mediating effector-triggered immunity in host plants have been reported in the literature, but little is known about their role in compromising basal defense responses induced by microbe-associated molecular patterns. Recently, several reports appeared in literature on secreted fungal proteases that modify or degrade pathogenesis-related proteins, including plant chitinases or PIs that compromise their activities. This prompted us to review the recent advances on proteases and PIs involved in fungal virulence and plant defense. Proteases and PIs from plants and their fungal pathogens play an important role in the arms race between plants and pathogens, which has resulted in co-evolutionary diversification and adaptation shaping pathogen lifestyles.

#### **Keywords: cysteine protease, metalloprotease, serine protease, protease inhibitor, chitinase, defence**

### **Introduction**

For successful infection of host plants and establishment of disease, fungal pathogens need weaponry to facilitate penetration, host colonization and uptake of nutrients for growth and reproduction, and at the same time to protect themselves against host defense responses. On the other hand, plants have developed surveillance systems to recognize and defend themselves against invading pathogens. Plant immune receptors recognize conserved microbe-associated molecular patterns (MAMPs) like chitin oligomers released from fungal cell walls during infection. This recognition leads to MAMP-triggered immunity (MTI) and initiates basal defense responses including the activation of structural and (bio)chemical barriers (Jones and Dangl, 2006; Spoel and Dong, 2012). However, adapted plant pathogens have gained the ability to overcome MTI by producing effector molecules that suppress or compromise MTI responses, thereby facilitating effector-triggered susceptibility (ETS; Stergiopoulos and de Wit, 2009). In response, plants have developed an additional layer of defense that enables them to recognize pathogen effectors or effector-modified host targets leading to effector-triggered immunity (ETI; Jones and Dangl, 2006).

Proteases and protease inhibitors (PIs) secreted by pathogens or their host plants have been extensively studied and have been demonstrated to play an important role in ETS and ETI (van der Hoorn, 2008). However, little is known about their role in MTI and related plant basal defense responses. Plant basal defense responses include the induction of pathogenesis-related proteins (PRs) such as antimicrobial chitinases, β-1,3-glucanases and proteases that hydrolyse the fungal cell wall components chitin, glucans, and polypeptides, respectively. The induction of these PR proteins upon plant infection, their antifungal activity, as well as their exploitation in engineering resistance in transgenic plants are very well documented (Wubben et al., 1996; Sels et al., 2008; Balasubramanian et al., 2012; Cletus et al., 2013). An early report in the literature suggested that pathogens might overcome the deleterious effects of plant chitinases by secreting proteases that modified them (Lange et al., 1996; Sela-Buurlage, 1996). This was further supported by recent studies, which indicate that chitinases are targeted by pathogen proteases and protected by PIs (Naumann et al., 2011; Slavokhotova et al., 2014). This encouraged us to review the recent advances on proteases and PIs that play a role in the arms race between plants and their fungal and oomycete pathogens.

### **Plant Proteases and Protease Inhibitors Involved in Basal Defense**

Most PR proteins exhibit direct antimicrobial activities, such as chitinases that degrade chitin present in fungal cell walls. PR proteins play a role in both constitutive and induced basal defense responses (Avrova et al., 2004; Shabab et al., 2008; van Esse et al., 2008). For example, tomato and potato contain basal levels of proteases in their apoplast, including serine proteases like P69, and papain-like cysteine proteases (PLCPs) like Rcr3, which are required for resistance of tomato against *Cladosporium fulvum* (Song et al., 2009), as well as Pip1 (*Phytophthora* inhibited protease 1; Tian et al., 2007; Shabab et al., 2008) and C14, which play a role in the resistance of potato against *Phytophthora infestans* (Kaschani et al., 2010; Bozkurt et al., 2011). After being challenged by pathogens, proteases are induced both locally (Tian et al., 2005) and systemically in the apoplast (Tian et al., 2007; Shabab et al., 2008; Song et al., 2009), which suggests that their activity affects pathogen growth directly or indirectly. Deletion or silencing of genes encoding these proteases enhanced the susceptibility of plants to pathogens, supporting their role in defense responses. Deletion of *Rcr3* increased the susceptibility of tomato to the late blight pathogen *P. infestans* (Song et al., 2009), to the leaf mold pathogen *C. fulvum* (Dixon et al., 2000), and also to the potato cyst nematode *Globodera rostochiensis* (Lozano-Torres et al., 2012). Likewise, silencing of *C14* in *Nicotiana benthamiana* significantly increased its susceptibility to *P. infestans* (Kaschani et al., 2010). These findings suggest that proteases have a determinative role in the execution of defense against plant pathogens.

Plant PIs have also been reported to play a role in plant immunity, through the inhibition of pathogen proteases, or the regulation of endogenous plant proteases (Ryan, 1990; Mosolov et al., 2001; Valueva and Mosolov, 2004; Kim et al., 2009). This has been shown for PIs from barley (*Hordeum vulgare*) against proteases from *Fusarium culmorum* (Pekkarinen et al., 2007), as well as for PIs from broad bean (*Vicia faba*), which inhibited the mycelial growth of several pathogens (Ye et al., 2001). The *A. thaliana* unusual serine protease inhibitor (UPI) was shown to play a role in defense against the necrotrophic fungi *Botrytis cinerea* and *Alternaria brassicicola* (Laluk and Mengiste, 2011). The UPI protein strongly inhibited the serine protease chymotrypsin but also affected the cysteine protease papain (Laluk and Mengiste, 2011). Plants harboring a loss-of-function *UPI* allele displayed enhanced susceptibility to *B. cinerea* and *A. brassicicola*, but not to the bacterium *Pseudomonas syringae*. Also, hevein-like antimicrobial peptides from wheat (WAMPs) were shown to inhibit class IV chitinase degradation by fungalysin, a metalloprotease secreted by *Fusarium verticillioides* (Slavokhotova et al., 2014). WAMPs bind to fungalysin, but are not cleaved by the enzyme due to the presence of a Ser residue between the Gly and Cys residues where cleavage of class IV chitinase by fungalysin normally takes place (Naumann et al., 2011; Slavokhotova et al., 2014). Adding equimolar quantities of WAMP and chitinase to fungalysin was sufficient to completely inhibit fungalysin activity, suggesting a higher affinity of the protease for the WAMP than for the chitinase.

Interestingly, some pathogens can also manipulate the transcription of plant PIs to inhibit deleterious effects of plant proteases in their favor. For example, production of maize cysteine proteases is induced during infection by *Ustilago maydis*, but at the same time the fungus induces the production of maize cystatin CC9 that inhibits cysteine proteases to facilitate infection (van der Linde et al., 2012b; Mueller et al., 2013). This suggests an evolutionary arms race in which the infection strategy of the pathogen benefits from the host's antimicrobial defense to suppress its defense responses.

### **Fungal Proteases Targeting Host Defense Proteins**

The arms race between pathogens and their hosts is often explained by recognition of MAMPs or effectors through pattern recognition receptors or resistance proteins, which results in MTI or ETI (Jones and Dangl, 2006). However, several components of basal defense are both constitutive and induced after interaction between MAMPs/effectors and immune receptors. PR proteins provide an excellent example of this. PR proteins are generally stable proteins that often exhibit a basal level of expression, but are also strongly induced after infection (Sels et al., 2008). PR proteins and their antifungal activity have been exploited to improve broad-spectrum resistance in plants. Plants such as tobacco, tomato, potato, peanut, and cacao have been engineered to over-express chitinases alone (Schickler and Chet, 1997; de las Mercedes Dana et al., 2006; Maximova et al., 2006; Iqbal et al., 2012; Cletus et al., 2013) or in combination with other PR proteins in pea and rice (Sridevi et al., 2008; Amian et al., 2011), and showed enhanced resistance to fungal pathogens.

Plant chitinases and especially chitin-binding domain (CBD) containing chitinases play an important role in defense against pathogenic fungi (Iseli et al., 1993; Suarez et al., 2001). Some fungal pathogens such as *C. fulvum* secrete chitin-binding effector proteins like CfAvr4 into the colonized extracellular space of tomato leaves to protect themselves against the antifungal activity of apoplastic plant chitinases (van den Burg et al., 2006). Indeed, CfAvr4 binds to chitin of fungal cell walls, making chitin inaccessible to plant chitinases, thereby preventing hydrolysis by these enzymes (van den Burg et al., 2006). Functional homologs of CfAvr4 have been identified in other Dothideomycete plant pathogens, in which they likely also protect the fungal cell wall against plant chitinases (Stergiopoulos et al., 2010; de Wit et al., 2012; Mesarich et al., 2015). However, many fungal pathogens do not carry homologs of the *CfAvr4* gene in their genome. It appears that some fungi secrete proteases that cleave CBD-chitinases. For example, *F. solani* f. sp. *phaseoli* is able to modify chitinases during infection of bean to facilitate host colonization (Lange et al., 1996). Also an extracellular subtilisin protease from *F. solani* f. sp. *eumartii* was reported to modify chitinases and β-1,3-glucanases present in intercellular washing fluids of potato (Olivieri et al., 2002). More recently, it was shown that *F. verticillioides* and other maize pathogens, including *Bipolaris zeicola* and *Stenocarpella maydis*, secrete two types of proteases that truncate maize class IV CBD-chitinases (Naumann, 2011). A fungalysin metalloprotease of *F. verticillioides* was found to cleave within the CBD domain between conserved Gly and Cys residues (Naumann et al., 2011), while a novel polyglycine hydrolase present in many fungi belonging to the family of *Pleosporineae* cleaved within the polyglycine linker present in the hinge domain of class IV chitinases (Naumann et al., 2014, 2015). In another recent study it was shown that the fungal tomato pathogens *B. cinerea*, *V. dahliae*, and *F. oxysporum* f. sp. *lycopersici* secrete proteases that modify tomato CBD-chitinases (Karimi Jashni et al., 2015). For *F. oxysporum* f. sp. *lycopersici*, the synergistic action of a serine protease, FoSep1, and a metalloprotease, FoMep1 (the ortholog of fungalysin from *F. verticillioides*), was required for cleavage and removal of the CBD from two tomato CBD-chitinases (Karimi Jashni et al., 2015). Removal of the CBD from two tomato CBD-chitinases by these two enzymes led to a reduction of their chitinase and antifungal activity. In addition, mutants of *F. oxysporum* f. sp. *lycopersici* lacking both *FoSep1* and *FoMep1* exhibited reduced virulence on tomato, confirming that secreted fungal proteases are important virulence factors by targeting CBD-chitinases to compromise an important component of plant basal defense (Karimi Jashni et al., 2015).

Collectively, the activity of fungal proteases might explain why overexpression of plant chitinases in transgenic plants has not become an effective strategy to obtain durable resistance against fungal pathogens. Secretion of proteases and PIs by pathogens to modify, degrade, or inhibit basal defense proteins might have played an important role during co-evolution with their host plants (Hörger and van der Hoorn, 2013). Therefore, overexpression of chitinases from a heterologous source in transgenic plants might be a more efficient approach to obtain durable resistance against pathogens, as they have not co-evolved with these "foreign" defense proteins.

### **Fungal Protease Inhibitors Targeting Host Proteases**

Plant pathogens also secrete PI effectors to inhibit plant defense proteases and promote disease development. These effectors are targeted to various host compartments (Tian et al., 2009). One such effector, Avr2, secreted by *C. fulvum* during infection, is required for full virulence of this fungus on tomato (Rooney et al., 2005). Avr2 inhibits the tomato apoplastic PLCPs Rcr3 and Pip1 to support growth of *C. fulvum* in the apoplast. Also, plants expressing Avr2 showed increased susceptibility to other pathogenic fungi, including *B. cinerea* and *V. dahliae* (van Esse et al., 2008). Moreover, *A. thaliana* plants expressing Avr2 triggered global transcriptional reprogramming, reflecting a typical host response to pathogen attack (van Esse et al., 2008). Two other PI effectors are the cystatin-like proteins EPIC1 (extracellular proteinase inhibitor C1) and EPIC2B (extracellular proteinase inhibitor C2B), whose expression is strongly induced in the oomycete *P. infestans* during biotrophic growth on tomato leaves (Tian et al., 2007; Song et al., 2009). These PIs selectively target the plant PLCPs Rcr3, Pip1, and C14 in the apoplast of potato and tomato. The EPICs inhibit C14 and possibly other PLCPs over a wider pH range than that observed for Avr2, which only inhibits Pip1 and Rcr3 at pH values occurring in the apoplast where the pathogen grows. In addition, *P. infestans* secretes two serine PIs (EPI1 and EPI10) that target and inhibit the major apoplastic serine protease P69B, likely to decrease its role in defense (Tian et al., 2004, 2005). It was proposed that EPI1 protects EPIC1 and EPIC2B proteins from degradation by P69B (Tian, 2005). Furthermore, the maize pathogen *U. maydis* secretes the cysteine PI Pit2 that strongly inhibits three abundant defense-related maize cysteine proteases (CP2 and its two isoforms CP1A and CP1B; van der Linde et al., 2012a,b; Mueller et al., 2013). These findings indicate that cysteine and serine PIs secreted by different groups of filamentous fungal and oomycete pathogens, as well as their activity against plant proteases, can compromise plant basal defense responses. A schematic overview of different types of interactions between pathogen and host proteases and PIs at the plant–pathogen interface is presented in **Figure 1**.

### **Proteases, PI Effectors, and Their Role in Receptor-Mediated Host Defense Responses**

The plant immune system is able to recognize pathogen effectors to mount receptor-mediated defense responses. Although the intrinsic function of protease and PI effectors secreted by some pathogenic fungi is to promote disease through manipulation of host defense, proteases and PI effectors can also be recognized by host immune receptors mediating defense responses. This adaptation and counter-adaptation reflects the arms race between pathogens and their host plants. A clear example of such an evolutionary arms race is provided by the cysteine PIs Avr2 from *C. fulvum* and Gr-VAP1 (*Globodera rostochiensis* Venom Allergen-like Protein) from *G. rostochiensis*, which bind and inhibit the tomato cysteine protease Rcr3pim. The tomato immune receptor protein Cf-2 senses this interaction and mediates the induction of defense responses (Song et al., 2009; Lozano-Torres et al., 2012). Most likely, the interaction causes a conformational change in Rcr3, which is recognized by the Cf-2 receptor (Krüger et al., 2002; Rooney et al., 2005). This hypothesis is supported by the finding that a natural variant of Rcr3 is recognized by Cf-2 in an Avr2-independent manner (Dixon et al., 2000). Moreover, in tomato plants lacking the Cf-2 receptor, targeting of Rcr3 is not sensed and plants are more susceptible to *G. rostochiensis* (Lozano-Torres et al., 2012).

### **Co-evolution Between Plants and Their Pathogens is Reflected by the Numerous Variant Proteases and PIs in the Genomes of Both Organisms**

The genomes of fungal plant pathogens encode predicted proteases belonging to various subfamilies that vary in number between pathogens with different lifestyles. Generally, hemibiotrophs and saprotrophs contain higher numbers of secreted proteases than biotrophs (Ohm et al., 2012). However, these predictions are based on gene numbers and may not be supported by their transcription and translation profiles. For example, *C. fulvum*, which is a biotrophic fungus, has numbers of proteases that are comparable to the phylogenetically closely related hemibiotroph *Dothistroma septosporum* (de Wit et al., 2012). However, likely due to its adaptation to a different host and lifestyle, many *C. fulvum* protease genes are not expressed *in planta* and some have undergone pseudogenization (van der Burgt et al., 2014). Deletion and duplication of protease genes were reported to occur in the genome of the grass endophytic fungus *Epichloë festucae* (Bryant et al., 2009) but their biological implications have not yet been studied.

Adaptation of PI effectors from pathogens to inhibit different host proteases has been observed in several cases. The Avr2 PI of *C. fulvum*, for example, has a high affinity for the host proteases Rcr3 and Pip1 and a low affinity for C14 (Shabab et al., 2008; Hörger et al., 2012). *P. infestans* EPICs have a high affinity for C14 and a low affinity for Rcr3 and Pip1 (Kaschani et al., 2010). Furthermore, *U. maydis* Pit2 inhibits the maize cysteine proteases CP1, CP2, and XCP2, but does not inhibit cathepsin CatB (Mueller et al., 2013). Different types of selection pressure may lead to the circumvention of protease inhibition by PIs. For example, purifying or diversifying selection has been reported for the proteases Rcr3, C14, and Pip1, and has been shown to act at their PI binding sites. Sequencing of the tomato proteases Rcr3 and Pip1 across different wild tomato species has shown that these proteins are under strong diversifying selection imposed by Avr2. For instance, one of the variant residues in the binding site of Rcr3 prevented inhibition by Avr2, indicating selection for evasion from recognition by this inhibitor (Shabab et al., 2008). C14 from solanaceous plants is also the target of EPICs secreted by *P. infestans* and is under diversifying selection in potato and under conservative selection in tomato. This demonstrates that C14 plays an active role in host immunity against this pathogen and variations in the sequence of C14 in natural hosts of *P. infestans* highlight the co-evolutionary arms race at the plant–pathogen interface (Kaschani et al., 2010).

Evolutionary diversification may vary from point mutation to gene deletion or insertion. EPIC1 and EPIC2 are PIs present in *P. infestans*; however, their orthologs were lost in *P. sojae* and *P. ramorum* (Tian et al., 2007). *P. mirabilis*, a species closely related to *P. infestans*, is a pathogen of *Mirabilis jalapa*, and secretes the PI PmEPIC1, an ortholog of EPIC1 that inhibits C14 but not Rcr3 (Dong et al., 2014). However, *M. jalapa* secretes MRP2, a PLCP homolog of Rcr3, that is more effectively inhibited by PmEPIC1 than by EPIC1 (Dong et al., 2014). Substitution of one amino acid residue in PmEPIC1 and EPIC1 restored the inhibitory function of PmEPIC1 on Rcr3 and of EPIC1 on MRP2, respectively. These results show that proteases and PIs have played important roles in adaptation of the two *Phytophthora* species to their respective host plants, although the two species diverged only about 1000 years ago (Dong et al., 2014). This is an excellent example of the role of a protease and a PI in the arms race between a plant and its pathogen, and exemplifies how diversification and adaptation of a protease-PI complex may work at the molecular level.

### **Conclusion and Perspective**

The recent advances reviewed here exemplify determinative roles of proteases and PIs in shaping plant–pathogen interactions. Analyses of genome databases of both plants and pathogens show that these organisms encode numerous proteases and PIs, of which we are just beginning to understand some of their roles. Advanced transcriptome and proteome tools such as RNA sequencing and protease profiling will facilitate identification of important proteases and PIs for further functional analysis. The redundancy of proteases in pathogens is a technical challenge that has so far hampered defining their biological functions. Targeted deletion of one or even two protease genes failed to change virulence of the plant pathogenic fungi *Glomerella cingulata* (Plummer et al., 2004) and *B. cinerea* (ten Have et al., 2010), respectively. Karimi Jashni et al. (2015) could only show decreased virulence of a double protease mutant of the tomato pathogen *F. oxysporum* by a combined biochemical and genetic approach, and using a defined plant enzyme (CBD-chitinase) as a substrate that was presumed to be involved in plant defense. This indicates that multi-gene targeting of protease and PI genes to identify their role in virulence or avirulence remains a challenge in filamentous fungi. Targeting multiple protease and PI genes might also be hampered by lack of sufficient numbers of selection markers for targeted gene replacement. In the latter case multiple protease and PI genes might be targeted by targeted gene silencing.

### **Author Contributions**

MJ and PdW conceived and wrote the review; JC, RM, and CM critically reviewed the manuscript.

### **Acknowledgments**

MJ was financed by the Ministry of Science, Research and Technology (MSRT) of Iran; PdW, JC, RM, and CM were financed by Wageningen University and the Royal Netherlands Academy of Arts and Sciences.

### **References**

resistance to biotic and abiotic stress agents. *Plant Physiol.* 142, 722–730. doi: 10.1104/pp.106.086140


Jones, J. D. G., and Dangl, J. L. (2006). The plant immune system. *Nature* 444, 323–329. doi: 10.1038/nature05286


Kim, J. Y., Park, S. C., Hwang, I., Cheong, H., Nah, J. W., Hahm, K. S., et al. (2009). Protease inhibitors from plants with antimicrobial activity. *Int. J. Mol. Sci.* 10, 2860–2872. doi: 10.3390/ijms10062860


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Karimi Jashni, Mehrabi, Collemare, Mesarich and de Wit. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Extracellular peptidases of the cereal pathogen *Fusarium graminearum*

Rohan G. T. Lowe\*, Owen McCorkelle, Mark Bleackley, Christine Collins, Pierre Faou, Suresh Mathivanan and Marilyn Anderson

*Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia*

The plant pathogenic fungus *Fusarium graminearum* (Fgr) creates economic and health risks in cereal agriculture. Fgr causes head blight (or scab) of wheat and stalk rot of corn, reducing yield, degrading grain quality, and polluting downstream food products with mycotoxins. Fungal plant pathogens must secrete proteases to access nutrition and to break down the structural protein component of the plant cell wall. Research into the proteolytic activity of Fgr is hindered by the complex nature of the suite of proteases secreted. We used a systems biology approach comprising genome analysis, transcriptomics and label-free quantitative proteomics to characterize the peptidases deployed by Fgr during growth. A combined analysis of published microarray transcriptome datasets revealed seven transcriptional groupings of peptidases based on *in vitro* growth, *in planta* growth, and sporulation behaviors. A high resolution mass spectrometry-based proteomics analysis defined the extracellular proteases secreted by *F. graminearum*. A meta-classification based on sequence characters and transcriptional/translational activity *in planta* and *in vitro* provides a platform to develop control strategies that target Fgr peptidases.

#### *Edited by:*

*Delphine Vincent, Department of Environment and Primary Industries, Australia*

#### *Reviewed by:*

*Roger W. Innes, Indiana University Bloomington, USA; Zonghua Wang, Fujian Agriculture and Forestry University, China*

> *\*Correspondence: Rohan G. T. Lowe*

*r.lowe@latrobe.edu.au*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 17 August 2015; Accepted: 22 October 2015; Published: 06 November 2015*

#### *Citation:*

*Lowe RGT, McCorkelle O, Bleackley M, Collins C, Faou P, Mathivanan S and Anderson M (2015) Extracellular peptidases of the cereal pathogen Fusarium graminearum. Front. Plant Sci. 6:962. doi: 10.3389/fpls.2015.00962*

Keywords: *Fusarium graminearum*, peptidase, protease, orbi-trap, proteomics, secretome, fungi, plant-pathogen

### INTRODUCTION

The first stages of the plant fungal-pathogen interaction occur on epidermal cells, followed by intercellular spaces such as the apoplastic space (Jones and Dangl, 2006). Proteins secreted by pathogens may dictate the interaction on several levels: (1) Degrading enzymes break down host macromolecules to provide nutrition for the pathogen (Brunner et al., 2013). (2) Toxin proteins actively disrupt cellular function of the host and kill cells (Ciuffetti et al., 2010). (3) Immune modulator proteins may inadvertently alert the host to pathogen attack, preventing colonization, or conversely they may camouflage non-protein elicitors such as chitin to allow continued growth (Dodds and Rathjen, 2010). Indeed, the size and complexity of a fungal secretome are shaped by its lifestyle and ecological niche (Lowe and Howlett, 2012).

Plant pathogens must extract nutrition from the host plant during colonization, and this aspect of the interaction will be the basis of our study. Two of the key nutrients for fungal growth are carbon and nitrogen. Carbon, most often in the form of carbohydrate, is required as a source of cellular energy as well as for growth and remodeling. Nitrogen is required for synthesis of proteins and nucleic acids. Host-derived protein provides a major source of both carbon and nitrogen. These host proteins must be digested into short peptide fragments before import, and this digestion is performed by a suite of proteases that are secreted into the environment. Therefore, a key aspect of the host-pathogen interaction is this interplay between peptidases secreted by the pathogen and the host substrates.

The host is not a passive partner. It actively responds to signs of infection, such as damage-associated molecular patterns (DAMPs). These DAMPs may be created by the action of secreted proteins, for example when peptidase activity releases hydrophobic peptides that are normally sequestered in the native protein (Seong and Matzinger, 2004). Responses include deployment of a range of defense molecules that have evolved to prevent the fungus from establishing an infection. Defense strategies in the plant have been studied from a number of different aspects and there is an abundance of literature on the subject (Jones and Dangl, 2006). Less is known about the proteins that the fungus produces to invade the plant tissue, evade the immune response and utilize plant material as a source of carbon, nitrogen and other essential nutrients.

Research on the proteins secreted by the fungus as virulence factors has focused on small proteins described as fungal effectors, for example, avirulence proteins reviewed in Stergiopoulos and de Wit (2009). The role of peptidases in plant-pathogen interactions has been dominated by classical nutrient acquisition, or catabolic activities, but there are instances of peptidases determining the outcome of a plant-pathogen interaction by other mechanisms. The bacterial effector HopN1, from Pseudomonas syringae, is a cysteine peptidase that once secreted into the plant cell will cleave PsbQ, an essential photosynthesis enzyme, and block programmed cell death (Rodríguez-Herva et al., 2012). In addition, the AvrPphB, ORF4 and NopT effectors also from P. syringae cleave themselves following delivery into plant cells to expose peptides containing fatty acid acylation motifs (Dowen et al., 2009). Acylation of these sites controls targeting to the plant plasma membrane and their avirulence activity. Among the oomycetes, the soybean pathogen, Phytophthora sojae, secretes a class of endoglucanase inhibitor proteins (GIPs) (Rose et al., 2002). These proteins share sequence similarity with serine peptidases, yet lack proteolytic activity due to mutation of the catalytic triad. A final example is the Avr-Pita avirulence protein from the rice blast fungus Magnaporthe oryzae, which has sequence similarity to metallopeptidases in the M35 clan. This peptidase-like protein is secreted and subsequently detected by the rice receptor protein Pi-ta (Jia et al., 2000). This interaction triggers a host defense response and leads to host resistance. Studies on the human pathogen Candida albicans have revealed that proteomic analysis leads to the identification of secreted proteins that were not identified by other methods (Gil-Bona et al., 2015).
Similarly, studies in the model fungus Saccharomyces cerevisiae revealed that many of the proteins present in the secretome lacked the typical signal sequences required for annotation as secreted molecules (Giardina et al., 2014). From this evidence, it is clear that a multi-faceted approach is needed to describe the full secretome of a fungal pathogen and define the role it plays in plant-pathogen interactions.

Fusarium graminearum (also known as Gibberella zeae) is a widespread pathogen of cereals such as wheat, barley, and corn, which infects the floral tissues and stems of plants and causes major economic losses to growers (Goswami and Kistler, 2004). As well as reducing grain yields, F. graminearum produces mycotoxins such as deoxynivalenol in the infected grain that pollute food and feed supplies (Sobrova et al., 2010). Mycotoxin levels in grain are strictly monitored and the presence of mycotoxins severely restricts market options for growers. The burden of this disease was made starkly apparent by an epidemic of Fusarium head blight in the Northern Great Plains and central USA region over the 1998–2000 seasons, which caused an estimated 2.7 billion dollars of economic impact (Nganje et al., 2002).

For nutrition to be accessible, high molecular weight molecules must be degraded with a variety of extracellular hydrolases to produce low molecular weight products for import into the cell. A full understanding of the proteins secreted by F. graminearum will provide key targets for control of this fungus. Indeed, peptidase inhibitors have already been reported to be upregulated in two wheat cultivars that are resistant to Fusarium head blight (Gottwald et al., 2012). Specifically, a Bowman-Birk type peptidase inhibitor (gene ID Ta.21350.2.S1, MEROPS family I12) was reported that is predicted to inhibit serine peptidases of the MEROPS S1 family. In addition, a gene encoding a subtilisin-like serine protease inhibitor (Ta.22614.1.S1) was specifically upregulated in the resistant cultivars "Dream" and "Sumai-3" during head blight infection. Knowledge of the fungal peptidase targets for these inhibitors may improve the selection of resistant wheat cultivars.

Over 20 studies have identified proteins from Fusarium as well as proteins of Triticum species and barley produced during infection of floral tissues, as reviewed by Yang et al. (2013). Among these, the most exhaustive range of conditions for protein production was reported by Paper et al. (2007). They identified 289 secreted F. graminearum proteins using a linear ion-trap quadrupole mass-spectrometer analysis of in vitro secreted proteins as well as extracts from infected plant tissues (Paper et al., 2007). Many studies were limited by either low protein abundance in planta, low sensitivity of 2D-gel formats, or low sample replication.

We set out to use a systems biology approach to define the extracellular proteome of F. graminearum to facilitate discovery of critical targets for control of diseases such as Gibberella stalk rot of corn, or Fusarium head blight of wheat. Here, we combined public gene expression data for F. graminearum with our own high sensitivity LTQ Orbitrap MS/MS analysis to extend the known secretome of this agriculturally important pathogen. As nitrogen is one of the key nutrients the fungus must release from the host plant, three different nitrogen sources were compared to reveal an extended range of secreted peptidases tailored to each condition.

### METHODS

### Growth and Maintenance of Fungal Cultures

F. graminearum 73B1A (kindly provided by Dr Gusui Wu, Pioneer Dupont, USA) was routinely cultured on synthetic nutrient poor agar (SNA) at 25 °C with a 14 h light:dark cycle.

### Culture Media for Proteomics

F. graminearum 73B1A was grown in four different culture media for proteomics studies. Half-strength Difco potato dextrose broth (1/2 PDB) was used as a complex medium, and three defined media were prepared with differing nitrogen sources. The defined media are derivatives of Czapek-dox medium (Czapek, 1902, 1903; Dox, 1910) with varied nitrogen sources. The basal composition was glucose (10 g/L), di-potassium phosphate (1 g/L), magnesium sulfate (0.5 g/L), potassium chloride (0.5 g/L), and ferrous sulfate (0.01 g/L) at pH 5.1. Nitrogen was added to ensure a molar carbon:nitrogen ratio of 20:1. Nitrate medium (NO3) contained sodium nitrate (16.5 mM, 1.4 g/L), glutamine medium contained glutamine (9 mM, 1.31 g/L), and minus N medium had no nitrogen. 1/2 PDB was sterilized by autoclaving, while defined media were filter sterilized and used within 2 days.
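The stated 20:1 molar carbon:nitrogen ratio can be checked with a short calculation. The sketch below is illustrative only: the helper name `cn_ratio` is hypothetical, the molar mass and atom counts are standard chemistry rather than values from the text, and it assumes that the carbon contributed by glutamine itself is counted alongside glucose carbon.

```python
# Illustrative check of the molar C:N ratio of the defined media.
# Assumption: glucose is the basal carbon source and any carbon in the
# nitrogen source (glutamine) also counts towards total carbon.

GLUCOSE_G_PER_L = 10.0       # from the basal medium composition
GLUCOSE_MOLAR_MASS = 180.16  # g/mol for C6H12O6 (standard value)
GLUCOSE_CARBONS = 6          # carbon atoms per glucose molecule


def cn_ratio(n_source_mM, carbons_per_molecule, nitrogens_per_molecule):
    """Return the molar C:N ratio for basal medium plus one nitrogen source."""
    glucose_mM = GLUCOSE_G_PER_L / GLUCOSE_MOLAR_MASS * 1000.0
    carbon_mM = glucose_mM * GLUCOSE_CARBONS + n_source_mM * carbons_per_molecule
    nitrogen_mM = n_source_mM * nitrogens_per_molecule
    return carbon_mM / nitrogen_mM


# Sodium nitrate: no carbon, one nitrogen per molecule.
nitrate_ratio = cn_ratio(16.5, carbons_per_molecule=0, nitrogens_per_molecule=1)
# Glutamine (C5H10N2O3): five carbons, two nitrogens per molecule.
glutamine_ratio = cn_ratio(9.0, carbons_per_molecule=5, nitrogens_per_molecule=2)

print(round(nitrate_ratio, 1), round(glutamine_ratio, 1))
```

Under these assumptions both media come out close to the nominal 20:1 ratio.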

### Culture Conditions for Proteomics

Samples for secreted proteomics analysis were created as follows. F. graminearum 73B1A conidia were added to 1/2 PDB (500 mL at 1.5 × 10<sup>4</sup> spores/mL) and grown at 20 °C and 85 rpm for 1 day. This master culture was split into 12 aliquots of 40 mL. The hyphae were collected by centrifugation (3220 g, 15 min), and were washed three times with the growth medium. Washed hyphae were resuspended in 50 mL of medium and grown for 1 day at 20 °C, 85 rpm. The culture was filtered with a 0.22 µm low-bind VWSP filter disc (Millipore) to separate hyphae from culture medium. The filtrate was concentrated down to 4 mL with a 3 kDa ultrafiltration column (Amicon ultra, Millipore) prior to protein precipitation. Three biological replicates were prepared in parallel. A single cellular proteome control sample was prepared in the same way as the secreted proteome samples, except that hyphae were retained from the 1 day 1/2 PDB culture and were washed 3× with sterile water before they were snap frozen in liquid nitrogen and freeze-dried. Dried hyphae (20 mg) were added to 400 µL of urea extraction buffer (1% SDS, 8 M urea, 10% glycerol, 25 mM Tris HCl pH 6.8, 1 mM EDTA, 0.7 M mercapto-ethanol) and glass beads (0.5 mm dia.) equal to 1/4 the hyphal volume before homogenization at 30 Hz for 30 s in a mixer mill (Qiagen). The sample was boiled for 2 min and then cooled on ice. The recovered supernatant formed the cellular proteome crude extract.

### Precipitation of Secreted Proteins

Trichloroacetic acid (1 mL of 6.1 N) was added to 4 mL of crude secretome filtrate, mixed, and incubated at 4 °C overnight. It was then centrifuged at 13,000 g for 20 min at 4 °C to pellet the proteins. The supernatant was removed and the pellet was washed with 800 µL of ice-cold acetone, by vortexing, centrifugation (13,000 g, 20 min) and removal of the acetone. The acetone wash was repeated twice. Each acetone-washed pellet was agitated in 100 µL resuspension buffer (8 M urea, 10 mM dithiothreitol) at 30 °C until fully dissolved, before protein levels were compared by image analysis after SDS-PAGE and SYPRO ruby staining. The cellular proteome was processed with the same method as the secretome except acetone was used in the initial precipitation instead of trichloroacetic acid.

### In Solution Trypsin Digest for Proteomics

Protein was precipitated from extracts with 5 volumes of acetone, washed in acetone, and resuspended in digest buffer (8 M urea, 50 mM ammonium bicarbonate, 10 mM DTT) before incubation at 37 °C for 30 min. Iodoacetamide was then added to 55 mM to alkylate thiol groups (45 min, dark, and 20 °C). The alkylated preparation was diluted to 1 M urea with 25 mM ammonium bicarbonate (pH 8.5) before sequencing grade trypsin (Promega) was added to 5 µM final concentration. Digests were performed overnight (37 °C) with shaking to produce tryptic peptides. Tryptic peptides were acidified with 1% formic acid (v/v).

### Solid Phase Extraction Clean-up of Tryptic Peptides

Tryptic peptides in 1% (v/v) formic acid were centrifuged at 18,000 rcf for 2 min before application to a solid phase extraction column (1 cc Oasis HLB, Waters) that had been conditioned with 800 µL of buffer A [80% (v/v) acetonitrile, 0.1% (w/v) trifluoroacetic acid], followed by 1000 µL of buffer B (0.1% trifluoroacetic acid). After application of the tryptic peptides, the column was washed in buffer B, before the peptides were eluted in 800 µL of buffer A, and concentrated to 100 µL final volume for mass spectrometry analysis.

### ESI–LC–MS/MS

Peptides (2 µL) were diluted to 30 µL in 0.1% trifluoroacetic acid and 2% acetonitrile (buffer A) and loaded onto a trap column (C18 PepMap 100 µm i.d. × 2 cm trapping column, Thermo-Fisher Scientific) at 5 µL/min for 6 min and washed for 6 min before switching the precolumn in line with the analytical column (Easy-Spray 75 µm i.d. × 50 cm, Thermo-Fisher Scientific). The separation of peptides was performed at 250 nL/min using a linear acetonitrile gradient of buffer A and buffer B (0.1% formic acid, 80% acetonitrile), running from 5 to 60% buffer B over 300 min. Data were collected on an Orbitrap Elite (Thermo-Fisher Scientific) in Data Dependent Acquisition mode using m/z 300–1500 as MS scan range. CID MS/MS spectra were collected for the 20 most intense ions. Dynamic exclusion parameters were set as follows: repeat count 1, duration 90 s; the exclusion list size was set at 500 with early expiration disabled.

Other instrument parameters for the Orbitrap were as follows: MS scan at 120,000 resolution, maximum injection time 150 ms, AGC target 1 × 10<sup>6</sup>, CID at 35% energy for a maximum injection time of 150 ms with AGC target of 5000. The Orbitrap Elite was operated in dual analyser mode with the Orbitrap analyser being used for MS and the linear trap being used for MS/MS.

### Proteomics Database Searches

All searches were made against the F. graminearum PH-1 (FG3) predicted proteome annotation from the Broad Institute ("Fusarium Comparative Sequencing Project, Broad Institute of Harvard and MIT")<sup>1</sup>. Protein sequences from the cRAP database of common lab contaminants (www.thegpm.org/crap) were added to the database. Decoy sequences were included for all searches.

<sup>1</sup>Fusarium Comparative Sequencing Project, Broad Institute of Harvard and MIT [WWW Document], n.d. URL http://www.broadinstitute.org/.

Label-free quantitation (LFQ) of protein abundance was performed with MaxQuant software and the Andromeda search engine (Cox et al., 2011, 2014). Default settings were used, with "Match between runs" and "requantify" turned on. Both PSM and protein false discovery rates were set to 0.01. The variable modifications were oxidized methionine and N-terminal acetylation; the fixed modification was carbamidomethylation of cysteine. Precursor ion mass tolerance was 20 ppm (initial search) and 10 ppm (second search), and fragment ion mass tolerance was 0.5 Da.
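The role of the decoy sequences in setting a 0.01 false discovery rate can be illustrated with a generic target-decoy estimate. This is a minimal sketch under stated assumptions, not MaxQuant's or Andromeda's actual procedure: the function name, the `(score, is_decoy)` record layout, and the toy scores are all hypothetical, and the FDR is approximated as the number of decoy hits divided by the number of target hits above a score threshold.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score at which estimated FDR stays <= max_fdr.

    psms: iterable of (score, is_decoy) tuples; higher scores are better.
    FDR at a threshold is estimated as decoys / targets among all
    matches scoring at or above that threshold. Returns None if no
    threshold satisfies the limit.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the cutoff while the running estimate is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy data: (search score, matched a decoy sequence?)
toy_psms = [
    (10, False), (9, False), (8, False), (7, False),
    (6, True), (5, False), (4, True), (3, True),
]

# A loose limit is used here only because the toy list is tiny.
cutoff = score_cutoff_at_fdr(toy_psms, max_fdr=0.22)
accepted = [s for s, is_decoy in toy_psms if not is_decoy and s >= cutoff]
print(cutoff, accepted)
```

With real data the same idea is applied at a 0.01 limit, which needs hundreds of matches before a single decoy hit can be tolerated.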

High sensitivity qualitative searches of the cellular control MS/MS spectra were performed using SearchGUI and PeptideShaker (Vaudel et al., 2015) within the Galaxy environment (Boekel et al., 2015). Input MGF peak lists were processed by X!Tandem, MS-GF+, and OMSSA using the same parameters as described above for MaxQuant. Searches were performed and combined with SearchGUI before being passed to PeptideShaker to process the output and produce a single combined analysis. PeptideShaker was run with a false discovery rate of 1% at the protein, peptide and PSM level. mzIdentML files were created and protein reports exported from PeptideShaker with the final protein identifications. Proteomics spectra files and protein identifications were deposited at the EBI PRIDE archive (http://www.ebi.ac.uk/pride/archive/) under project accessions PXD002786 and PXD002840.

### Bioinformatics

Microarray transcriptome datasets were downloaded from the PLexDB database (www.plexdb.org). Experiments FG01, FG02, FG05, FG06, FG07, FG10, FG11, FG12, FG13, FG14, FG15, FG16, FG18, and FG19 were used in this analysis; see **Table 1** for a summary of the microarray details. RMA gene expression values were log10 transformed and imported into the R statistical software environment. Clustering and heat map plots were performed using the heatmap.2 R function. Euclidean-distance complete-linkage trees were produced for each axis of the heat map.
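The clustering described above was done in R. As a language-neutral illustration of what Euclidean-distance complete-linkage clustering does, the following is a minimal pure-Python sketch. The function names, the gene labels, and the expression values are all hypothetical, and the sketch stops at a fixed number of clusters instead of building a full dendrogram.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(profiles, n_clusters):
    """Agglomerate profiles (dict: name -> values) until n_clusters remain.

    Under complete linkage, the distance between two clusters is the
    largest pairwise distance between their members.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = max(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical RMA expression values for four genes across three arrays.
raw = {
    "pepA": [100.0, 120.0, 90.0],
    "pepB": [110.0, 115.0, 95.0],
    "pepC": [5000.0, 4000.0, 6000.0],
    "pepD": [4500.0, 4200.0, 5500.0],
}
log_profiles = {g: [math.log10(v) for v in vals] for g, vals in raw.items()}

groups = complete_linkage(log_profiles, n_clusters=2)
print(groups)
```

In this toy case the two low-expression genes and the two high-expression genes form separate groups, mirroring the low and high expression groups discussed in the Results.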

Sequence-based prediction of secretion was performed using a three-stage process similar to a previously published example (Brown et al., 2012). Firstly, SignalP 4.1 (Petersen et al., 2011) was used to predict signal peptide presence, secondly TMHMM 2.0 (Krogh et al., 2001) was used to identify trans-membrane regions, and thirdly WoLF PSORT (Horton et al., 2007) was used to predict likely cellular location. To be included in our "high confidence secretion" cohort a sequence had to include a signal peptide, have no trans-membrane regions outside of the signal peptide, and score 17 or more for the "extracellular" classification on WoLF PSORT.
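The three criteria above amount to a simple conjunctive filter. The sketch below shows that logic only; it does not call or parse output from any of the prediction tools. The `Prediction` record, its field names, and the protein identifiers are hypothetical placeholders for values that would be extracted from each tool's output.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Per-protein summary of the three predictors (hypothetical layout)."""
    protein_id: str
    has_signal_peptide: bool         # stage 1: signal peptide predicted
    tm_regions_outside_signal: int   # stage 2: TM regions beyond the signal peptide
    extracellular_score: float       # stage 3: "extracellular" localization score


def is_high_confidence_secreted(pred, min_extracellular=17):
    """Apply all three criteria; every one must pass."""
    return (
        pred.has_signal_peptide
        and pred.tm_regions_outside_signal == 0
        and pred.extracellular_score >= min_extracellular
    )


# Toy records with made-up identifiers and values.
predictions = [
    Prediction("protA", True, 0, 22.0),   # passes all three stages
    Prediction("protB", True, 2, 25.0),   # membrane protein
    Prediction("protC", False, 0, 19.0),  # no signal peptide
    Prediction("protD", True, 0, 9.0),    # localization score too low
]

secreted = [p.protein_id for p in predictions if is_high_confidence_secreted(p)]
print(secreted)
```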

TABLE 1 | *F. graminearum* microarray transcriptomics resources.

Principal components analysis (PCA) was performed using R software and the prcomp function. Briefly, MaxQuant label-free quantitation abundances (LFQ) for each replicate of the four secretome treatments were extracted from the MaxQuant proteinGroups.txt output file and used as input data. The input data matrix was log2 transformed and quantile normalized in R before PCA was performed on only the 134-protein high-confidence secretome. Missing values were substituted with the minimum value for that sample prior to PCA. Optional settings were left as the default settings, including: rotated variables, zero centered, no scaling.
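The preprocessing applied before PCA (log2 transformation, minimum-value substitution, quantile normalization) can be sketched in a few lines. The original analysis was done in R; this pure-Python version is illustrative only. The function names and toy values are hypothetical, the order of imputation and normalization is an assumption, and missing values are assumed to be recorded as `None` or zero.

```python
import math


def log2_with_min_imputation(matrix):
    """Log2-transform a proteins x samples matrix, then fill each missing
    value with the minimum observed log2 value of that sample (column).
    Assumption: None or non-positive entries mean 'not quantified'."""
    logged = [
        [math.log2(v) if v is not None and v > 0 else None for v in row]
        for row in matrix
    ]
    n_cols = len(logged[0])
    col_min = [
        min(row[j] for row in logged if row[j] is not None) for j in range(n_cols)
    ]
    return [
        [row[j] if row[j] is not None else col_min[j] for j in range(n_cols)]
        for row in logged
    ]


def quantile_normalize(matrix):
    """Give every sample the same distribution: each value is replaced by
    the mean, across samples, of the values sharing its rank.
    Ties are broken by row order in this simplified version."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(sc[r] for sc in sorted_cols) / n_cols for r in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result


# Toy LFQ intensities: three proteins (rows) in two samples (columns).
lfq = [
    [8.0, 16.0],
    [2.0, None],  # not quantified in the second sample
    [4.0, 4.0],
]
imputed = log2_with_min_imputation(lfq)
normalized = quantile_normalize(imputed)
print(imputed)
print(normalized)
```

The normalized matrix would then be centered and passed to a PCA routine, which is omitted here.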

Significance testing was performed using the limma package (Linear Models for Microarray data) within R (Ritchie et al., 2015). Briefly, MaxQuant LFQ abundance values for each protein were log<sub>2</sub> transformed, then quantile normalized, before a limma model was fitted. All possible treatment contrasts were performed and the eBayes function was used to calculate a moderated F-statistic of overall significance for each protein. P-values were also calculated and Benjamini–Hochberg correction for multiple-testing applied.
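The Benjamini–Hochberg step mentioned above can be written out explicitly. The following is a minimal sketch of the standard step-up adjustment, not the limma implementation; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (m = number of tests, rank = its
    position when sorted ascending), then a running minimum taken from
    the largest p-value downwards keeps the adjusted values monotonic.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adj_p])
```

Proteins whose adjusted value falls below the chosen threshold would be reported as significantly different between treatments.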

### RESULTS AND DISCUSSION

The MEROPS peptidase catalog for the Fusarium genome was utilized to provide the potential complement of peptidase genes. Four hundred peptidase-encoding genes were recorded in the F. graminearum genome by MEROPS (Rawlings et al., 2013). The majority of peptidases had serine (43%), metallo (28%), or cysteine (17%) nucleophiles. Threonine (6%), aspartic acid (5%), and glutamic acid (<1%) nucleophiles comprised the remainder of the peptidases (**Table 2**).

### An Aggregated Transcriptome Profile for Peptidases

The F. graminearum Affymetrix microarray platform and associated expression data were mined for a range of conditions capturing in planta disease, in vitro growth, sporulation, or mycotoxin production. Fourteen different experiments including 183 microarrays were combined to produce a transcriptomic profile for a total of 389 peptidases present on the array. A heatmap and dendrogram were calculated to group peptidases on the basis of their transcriptional expression profile (**Figure 1**, **Table S1**).

TABLE 2 | Peptidases identified from cellular and secreted proteomics data.


Seven clusters of microarrays and seven clusters of peptidases were formed and average expression values calculated (**Figure 2**). The microarray experiments clustered largely as expected: array cluster A1 contained only in vitro mycelial growth arrays; A2 contained only barley floral infection arrays from 3 to 6 dpi; A3 contained mock inoculated controls and early stage floral infections for wheat and barley; A4 captured conidia germination in vitro, wheat head blight (2–8 dpi), and all wheat coleoptile infection arrays; A5 contained entirely in vitro growth arrays, including sexual development on carrot agar and CMC medium, plus carbon starvation; A6 contained all of the wheat crown rot arrays; and A7 contained only wheat floral infection arrays, including those tracking sexual development and all 4 dpi time points. These groupings were biologically relevant and distinguished growth on wheat vs. growth on barley, as well as sporulation and mycelial stages of growth, and growth during infections of crown and flowers.

The transcriptome profiles revealed two major groups of peptidases: a high expression group and a low expression group. Within those major types more refined clustering revealed peptidase genes that were regulated by environmental conditions, such as those repressed during in planta growth and those induced during mycotoxin biosynthesis. Peptidase cluster P1 had generally low expression, cluster P2 had higher expression during infection of flowers, and cluster P3 contained peptidases with medium-low expression. Clusters P4 and P5 had high expression. Expression in cluster P6 was generally low during growth in planta, and cluster P7 was high in both in vitro and in some in planta studies.

Peptidase nucleophile abundance varied between clusters (**Table 3**). Serine peptidases were over-represented in cluster P3 and to a lesser extent in clusters P1 and P2. Cysteine peptidases were enriched in clusters P6 and P7, while threonine and metallo peptidases were enriched in cluster P5. Aspartic peptidases were found in the smallest cluster, P4. Enrichment was not calculated for other rare nucleophiles due to insufficient numbers of peptidases in these categories. Serine peptidases were most commonly expressed at low to medium levels, while aspartic peptidases were generally transcribed constitutively at a high level.
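Enrichment of a nucleophile type within a cluster can be expressed as the difference between its observed frequency in that cluster and its expected frequency in the genome. The sketch below illustrates that comparison with made-up counts; the function name and data are hypothetical, and the exact formula used for Table 3 is assumed to be a simple difference in percentages.

```python
from collections import Counter


def nucleophile_enrichment(cluster_types, genome_types):
    """Return {type: observed % in cluster minus expected % in genome}.

    Positive values indicate over-representation in the cluster.
    """
    observed = Counter(cluster_types)
    expected = Counter(genome_types)
    return {
        nucleophile: 100.0 * observed[nucleophile] / len(cluster_types)
        - 100.0 * expected[nucleophile] / len(genome_types)
        for nucleophile in expected
    }


# Made-up example: a ten-peptidase "genome" and a four-peptidase cluster.
genome = ["Ser"] * 4 + ["Met"] * 3 + ["Cys"] * 2 + ["Asp"] * 1
cluster = ["Ser"] * 3 + ["Met"] * 1

diff = nucleophile_enrichment(cluster, genome)
print(diff)
```

In this toy case serine peptidases make up 75% of the cluster against 40% of the genome, an excess of 35 percentage points.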

The microarray analysis revealed that there was regulation of peptidase expression in response to growth in planta. We sought to further refine our analysis with proteomics analysis of the secreted proteome to determine the most abundant peptidases used for nutrient acquisition.

## A Shotgun Proteomics Approach to Identify Secreted Peptidases

A shotgun proteomics approach was used to identify extracellular peptidases of F. graminearum in culture. Using an ultra-high resolution linear ion-trap Orbitrap mass spectrometer, an initial control sample of cellular proteins from a PDB culture produced over 2000 protein identifications via PeptideShaker software. This number reduced to 1743 robust identifications, with the requirement that each protein was matched with at least two validated peptides (**Table 4**, **Table S2**). The secretome was then queried through replicated analysis of four in vitro culture conditions: one complex plant-derived medium consisting of half-strength PDB and three defined media based on Czapek dox medium. A complete medium using peptone or tryptone was not included because we wanted to avoid pre-hydrolysis of proteins in the test medium. The defined media were identical except for substitution of the nitrogen source to one of nitrate (NO3), glutamine, or a nitrogen-free (Minus N) composition. These three sole nitrogen sources were chosen to examine an inorganic N source (NO3), an amino nitrogen source (glutamine) that provided both carbon and nitrogen, and a nitrogen-free medium to induce a nitrogen starvation response. An early-growth stage was selected to minimize cellular auto-lysis and mimic initial infection processes.

LFQ of the secretome samples revealed large changes in protein abundance between treatments, which would have been unrecognized using more traditional qualitative assessments. MaxQuant LFQ identified 874 unique proteins across all secretome samples, with 676 present in a minimum of 2 of 3 replicates (**Tables S3**, **S4**). Of those, 261 were absent from the cellular control sample. We used a three-stage bioinformatics prediction to identify protein sequences with the correct signatures of secretion: a signal peptide, lack of trans-membrane regions, and an extracellular location. Our bioinformatics prediction identified 668 proteins as likely to be secreted (**Table S5**). This number is higher than the 574 previously reported (Brown et al., 2012) due to our omission of a GPI-anchor prediction. The presence of a GPI anchor was considered insufficient evidence to exclude a protein from the secretome for two reasons. Firstly, the entirety of a GPI-anchored protein would still be external to the plasma membrane under our criteria, and secondly, GPI anchors may be cleaved to fully release proteins to the environment. When we added the bioinformatics filter to our 261 secretome proteins, just 134 remained (**Figure 3A**).
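The successive filters described above (replicate reproducibility, absence from the cellular control, and predicted secretion) reduce to a few set operations. The following sketch uses hypothetical protein identifiers and toy intensities, and it assumes that the 2-of-3 replicate rule is applied within each treatment and that a protein passes if any single treatment satisfies it.

```python
def passes_replicate_filter(reps_by_treatment, min_reps=2):
    """True if any treatment has a non-zero LFQ value in >= min_reps replicates."""
    return any(
        sum(1 for value in reps if value > 0) >= min_reps
        for reps in reps_by_treatment.values()
    )


def high_confidence_secretome(secretome_lfq, cellular_ids, predicted_secreted_ids):
    """Apply the three filters in sequence and return the surviving protein IDs."""
    reproducible = {
        pid for pid, reps in secretome_lfq.items() if passes_replicate_filter(reps)
    }
    secretome_only = reproducible - set(cellular_ids)   # not in cellular control
    return secretome_only & set(predicted_secreted_ids)  # predicted to be secreted


# Toy data: LFQ intensities for three replicates in two treatments.
secretome_lfq = {
    "protA": {"PDB": [5.0, 6.0, 0.0], "NO3": [0.0, 0.0, 0.0]},  # reproducible in PDB
    "protB": {"PDB": [4.0, 0.0, 0.0], "NO3": [0.0, 3.0, 0.0]},  # never 2 of 3
    "protC": {"PDB": [7.0, 7.5, 8.0], "NO3": [6.0, 6.5, 7.0]},  # also cellular
    "protD": {"PDB": [0.0, 0.0, 0.0], "NO3": [2.0, 2.5, 3.0]},  # no secretion signal
}
cellular_ids = {"protC"}
predicted_secreted_ids = {"protA", "protB", "protC"}

final_set = high_confidence_secretome(secretome_lfq, cellular_ids, predicted_secreted_ids)
print(sorted(final_set))
```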

### A 134 Protein High-confidence Secretome

This strict filtering process should almost eliminate false positive secretome identifications. These 134 proteins formed the high-confidence secretome set. In total, 2025 unique proteins were found with high confidence in either the cellular sample or secretome samples. A previous study of a wide range of in vitro conditions and in planta apoplastic fluid identified 289 proteins (Paper et al., 2007). We have identified 153 of those 289 proteins including 43 previously considered in planta-only. By sampling the in vitro proteome to a deeper level, we have captured a significant proportion of proteins previously reported as in planta-specific.

The cellular proteome cohort had relatively fewer serine peptidases than the full genome prediction (22 vs. 44%), more metallo peptidases (38 vs. 28%), and roughly the same percentage of cysteine peptidases (**Table 2**). The complement of exclusively secreted peptidases contained a high proportion of metallo peptidases. It is likely that some low abundance peptidases are missing from the proteomics datasets, which could influence the apparent cellular localization.

The PDB culture medium is complex and was likely to contain peptides of potato origin. An additional database search on a PDB culture filtrate was performed including the Solanum tuberosum predicted protein set v1.0.1 (The Potato Genome Sequencing Consortium, 2011), with the specific aim of assessing the residual potato peptide content in PDB medium. We identified three potato proteins in PDB grown samples: a starch synthase, a lipid transfer protein and a cytochrome b5 protein. We estimated that approximately 1% of the total detected peptides were of potato origin, and that these did not affect our analyses of F. graminearum.

Most of the secretome proteins were produced in more than one condition and 49 proteins were produced under all four conditions (**Figure 3B**). The defined medium with glutamine as the nitrogen source had the most proteins that were unique to one condition, whereas the Minus N condition had no exclusive proteins. Peptidases comprised 19 of the 134 proteins in the high-confidence secretome, including five subtilisins, three aspartyl peptidases, three metallo-peptidases, and a trypsin peptidase.

#### TABLE 3 | Peptidase nucleophile type is enriched according to gene expression profile.

*The peptidase nucleophile content of each gene expression cluster (Figure 1) was calculated and compared to the frequency in the genome. The percentage difference between observed and expected frequency is shown for the five most common nucleophiles: Asp, aspartate; Cys, cysteine; Met, metallo; Ser, serine; Thr, threonine.*

#### TABLE 4 | Proteins discovered in cellular and secreted treatments.

### Nitrogen-responsive Peptidases

To determine how protein secretion adapted in response to environmental nitrogen, the high-confidence secretome cohort of 134 proteins was subjected to PCA (**Figure 4**). Principal components 3 and 4 (PC3, PC4) captured 19.2 and 10.6% of the total variance, respectively, and resolved the four culture conditions tested. The loadings for PC3 and PC4 were examined to extract the most influential proteins for each culture condition, beginning with PDB. FGSG\_08196 and FGSG\_03072 were highly influential for the PDB treatment. FGSG\_08196 is an example of the MEROPS scytalidoglutamic peptidase, an acid-active peptidase found throughout the ascomycete fungi. Inhibition of related glutamic acid peptidases of the thermophilic fungus Talaromyces emersonii significantly retarded hyphal growth on the complex nitrogen source peptone (O'Donoghue et al., 2008). We believe this is the first identification of FGSG\_08196 in a proteomics study. A specific inhibitor of this peptidase may provide excellent control of F. graminearum and head blight disease.

FGSG\_03072 is a sedolisin, a member of the MEROPS S53 family of unassigned peptidases, a group containing both acid-active endopeptidases and tripeptidyl-peptidases. FGSG\_03072 aligned most closely (49% identity) with SedD, an acidic exopeptidase and one of four characterized Aspergillus fumigatus sedolisins (Reichard et al., 2006). BLAST searches for additional F. graminearum homologs of all the A. fumigatus Sed proteins revealed a SedA endopeptidase homolog (FGSG\_10343) that was not detectable in the proteome, but a second SedB/C/D exopeptidase (FGSG\_12142) was found in the high-confidence secretome at high abundance in all four treatments. Interestingly, FGSG\_12142 was not reported in previous proteomics studies on F. graminearum grown in vitro, in planta, or during mycotoxin induction (Paper et al., 2007; Taylor et al., 2008; Rampitsch et al., 2010, 2012).

The influential set of proteins for the minus-N culture medium contained four serine peptidases and one metallo-peptidase: FGSG\_11164 (a trypsin homolog), FGSG\_10982 (a dipeptidyl-peptidase), FGSG\_03315 and FGSG\_00806 (subtilisin-like peptidases), and FGSG\_01818 (an Ap1-like metallo-peptidase). All five of these peptidases were reported by Paper et al. (2007) from in vitro samples, and only FGSG\_10982 was not reported in planta.

The glutamine culture medium contained two influential peptidases: FGSG\_03954, a metallo-endopeptidase, and FGSG\_06572, a subtilisin-like serine peptidase. No peptidases were identified as being specific to the NO<sup>3</sup> treatment; this is consistent with nitrogen catabolite repression during growth on a favorable inorganic nitrogen source. We expected equal repression of peptidases for all treatments containing favored nitrogen sources (NO<sup>3</sup> and glutamine); this was the case for these two peptidases, as they were absent from the PDB treatment, where nitrogen catabolism should be de-repressed.

### Cross-referenced Transcriptome and Proteomics Datasets

The peptidase transcriptome data and the proteome were cross-referenced. Cluster P2 of the transcriptome heat map was enriched with peptidases containing extracellular secretion signals. Twenty-nine per cent of P2 peptidases were predicted to be secreted, which was 14% more than the average for the genome. We did not observe enrichment of P2 peptidases in the actual secretome, which was probably due to their expression level falling below the limit of detection for our proteomics analysis. Peptidases in cluster P1 had the lowest average gene expression and this was again reflected in a very low number of identifications, with only 7 of 97 P1 peptidases identified in the secretome or cellular proteome. This could be due to the presence of pseudogenes that are not translated or, more likely, to low-abundance proteins that fell below the limit of detection of MS/MS identification. The highly expressed clusters P4 and P5 were both enriched for peptidases that were identified in either the secretome or cellular proteome.

We hypothesized that peptidases upregulated during in planta growth would be enriched for signal-peptides and extracellular sequence signatures. Peptidases expressed during in planta growth were mostly found in clusters P2, P4, P5, while P6 contained peptidases that were mostly down regulated during growth in planta. We examined the peptidase gene expression heat map for clusters of peptidases that were likely to be secreted and also had similar gene expression profiles (**Figure 5**). Cluster P2 was almost uniformly predicted as secreted, while only a small subset of P5 was predicted to be secreted in planta. P2 peptidases were up-regulated during infection of barley and wheat flowers, crown rot of wheat, and sexual sporulation in wheat.

### Virulence Factors for Head Blight of Wheat and Barley

The five peptidases influential for minus-N treatment were all classified within cluster P5 on the basis of their microarray profiles, which indicated generally high gene expression. The FGSG\_08196 and FGSG\_03072 peptidases identified in the PDB treatment were present in cluster P1, indicating low expression on average. Closer inspection of the arrays for each gene revealed a few conditions with elevated expression. FGSG\_03072 was only highly expressed during growth on mycotoxin-inducing medium with agmatine or putrescine as a nitrogen source (FG7, FG14), and during sexual development in planta (FG16). FGSG\_08196 was similar, with selective high expression during nitrogen starvation and during in vitro growth in mycotoxin-inducing media, although there was an additional peak in expression during barley floral infection at 72 h that was not repeated during infection of wheat flowers. Both peptidase types are active at acidic pH; therefore, we hypothesize that their gene expression peaks in acidic environments. Mycotoxin induction is known to occur during floral infections in planta and only at acidic pH in vitro (Merhej et al., 2011). The wheat floral tissue may develop a more acidic pH faster than barley, explaining why expression of these two acidic peptidases is higher during wheat infections. The pH of PDB culture medium is 5.1; this mildly acidic environment induced additional acidic peptidases compared to the NO3, minus N, and glutamine conditions. The three defined media were based on Czapek Dox and have a neutral pH of 7.3. This may be sufficient to explain the increase in abundance of acid peptidases in the PDB treatment, and affirms the selection of PDB as an in vitro condition to approximate in planta growth.

FIGURE 5 | Composite analysis of peptidase secretion and *in planta* expression. Peptidase secretion parameters were calculated for each peptidase shown in Figure 1. Peptidases are ordered by their gene expression profile, as in Figure 1. The peptidase frequency for three peptidase parameters, (1) bioinformatic prediction of secretion "Predicted extracellular," (2) presence in the "Secreted proteome," and (3) presence in the "Cellular proteome," was plotted on the left-most Y-axis as a moving window (window size = 16); for example, a value of 10 means that 10 of the 16 peptidases in the window were predicted to be secreted based on sequence characters. The right-most Y-axis shows the resultant "Composite prediction" of secretion (black trace). Composite prediction was calculated as Log2 [mean[(1),(2)]/(3)], where a positive result indicates more likely secretion and a negative result less likely secretion. The gene expression clusters of peptidases determined in Figure 1 are shown above the plot.
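The composite prediction defined in the Figure 5 caption can be illustrated with a short Python sketch. This is not the code used in the study: the function names are hypothetical, the example flags are invented, and returning `None` where the ratio is undefined is an assumption, since the caption does not state how windows with zero counts were handled.

```python
import math
from typing import List, Optional

WINDOW = 16  # window size stated in the Figure 5 caption


def window_counts(flags: List[bool], window: int = WINDOW) -> List[int]:
    """Count True flags in each consecutive window along the ordered peptidase list."""
    return [sum(flags[i:i + window]) for i in range(len(flags) - window + 1)]


def composite(predicted: int, secreted: int, cellular: int) -> Optional[float]:
    """Log2[mean(predicted, secreted) / cellular] for one window.

    Returns None when the ratio is undefined (assumption; see lead-in).
    """
    numerator = (predicted + secreted) / 2
    if numerator == 0 or cellular == 0:
        return None
    return math.log2(numerator / cellular)


# Invented example: 16 peptidases predicted as secreted, followed by 4 that are not
example_flags = [True] * 16 + [False] * 4
example_counts = window_counts(example_flags)
```

A positive value (for example `composite(10, 6, 4)`) indicates that secretion is more likely for that window, and a negative value that it is less likely, matching the interpretation given in the caption.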

### Comparison of Head Blight of Wheat and Barley

F. graminearum is a floral pathogen of both wheat and barley. Although this study did not collect in planta secretome data, we considered relationships between the in vitro secretome and the in planta transcriptomics datasets. Transcriptome data for both wheat and barley was examined for differential expression of proteases. Expression group A7 contained exclusively wheat infection samples, indicating there may be co-regulation of peptidases specifically in planta.

We compared peptidase gene expression in barley and wheat microarrays to determine if peptidase expression was regulated by the species of host plant (**Figure 6**). A range of expression levels was identified, but in the vast majority of cases average peptidase expression was consistent between the two host plants, with a correlation coefficient (R<sup>2</sup>) of 0.8672 (**Figure 6A**). Such high correlation is impressive considering the independent nature of the experiments. To confirm that peptidase gene expression can be modulated to suit the environment, we compared wheat infection to in vitro growth on complete medium (**Figure 6B**). We observed large changes in peptidase expression when comparing in vitro with in planta growth, with a correspondingly weak correlation coefficient of 0.2218. This confirmed that peptidase transcription is responsive to the growth environment. Peptidases identified in the cellular proteome, the secreted proteome, or both were mapped onto the correlation charts. No overall patterns of secretion were revealed in relation to gene expression level on barley or wheat; however, secretome peptidases tended to be transcribed more during growth on complete medium than on wheat. The secreted peptidases FGSG\_08196 and FGSG\_10086 were revealed as slightly upregulated during barley infection compared to wheat infection. FGSG\_08196 was identified in the 134-protein high-confidence secretome, and was exclusively detected in the PDB medium secretome. FGSG\_10086 is a serine peptidase in the MEROPS S33 family in gene expression group P3, most highly expressed in mycotoxin induction medium.
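As an illustration of the comparison above, the following Python sketch computes a squared Pearson correlation between two paired lists of expression values. It assumes that the reported R<sup>2</sup> is the square of the Pearson coefficient, which the text does not state explicitly; the function name and the example values are hypothetical and are not data from the study.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between paired expression values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Invented values standing in for per-peptidase expression on two hosts
host_a = [1.0, 2.0, 3.0, 4.0]
host_b = [2.0, 4.0, 6.0, 8.0]
consistent = r_squared(host_a, host_b)
```

Values close to 1 correspond to consistent expression between the two conditions, and values close to 0 to largely unrelated expression.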

### Non-canonical Secretion

Non-canonical secretion mechanisms may account for unexpected extracellular location of proteins. This study focussed on secreted peptidases, some of which may have been incorrectly regarded as cellular proteins due to non-canonical secretion. We identified 127 proteins that are candidates for non-canonical secretion as they were only found in the secreted proteome but lacked the expected bioinformatic prediction of secretion (**Figure 3A**). This figure is likely an overestimate as our bioinformatics methodology was biased toward false-negative error when assigning classical secretion, but 47 of those 127 proteins have a SignalP score of less than 0.15 (the threshold for secretion is 0.45), comprising a more representative list for non-canonical secretion. Superoxide dismutase is a good candidate for non-canonical secretion as it has been reported to be released into culture medium by gentle washing of Claviceps purpurea hyphae, yet lacks a classical signal peptide (Moore et al., 2002). Non-canonical secretion of proteins via extracellular microvesicles, or exosomes, is gaining attention for its potential role in cell-to-cell communication and pathogenesis (Samuel et al., 2015). We identified three homologs of superoxide dismutase in the culture medium proteome: FGSG\_02051, FGSG\_04454, and FGSG\_08721. All three were absent from PDB medium and were present in the other three treatments. None contained recognizable signal peptide sequences and two were also found in the cellular proteome. FGSG\_08721 has been previously reported as secreted both in vitro and in planta (Paper et al., 2007). Nine of the 127 candidates for non-canonical secretion were peptidases, including four metallo-peptidases of the M28 family. Of these, FGSG\_01095 and FGSG\_11411 had extremely low SignalP scores of 0.131 and 0.103, respectively. Interestingly, the FGSG\_01095 sequence scored higher using the prokaryotic SignalP algorithms, raising the question of whether proteins originating from mitochondria have retained aspects of prokaryotic protein transport.

FIGURE 6 | Peptidase gene expression in wheat, barley and complete medium. *F. graminearum* peptidase gene expression (RMA values) from microarrays of barley and wheat head blight disease (A) were plotted to reveal the degree of transcriptional gene co-regulation on the two different host plants. Two peptidases that were differentially regulated between the two *in planta* conditions are indicated with an arrow and label. For comparison, gene expression (RMA values) from *in vitro* growth on complete medium were compared to wheat head blight (B). The secretion status determined from proteomics analyses is displayed in the shading of each peptidase value: yellow for cellular proteins, green for proteins found in both the cellular and secreted proteomes, and blue for proteins only found in the secreted proteome.
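The score-based shortlisting of non-canonical secretion candidates described above can be sketched as follows. The two cutoffs (0.45 and 0.15) and the scores for FGSG\_01095 and FGSG\_11411 are taken from the text; the function names, the category labels, the treatment of scores exactly at a cutoff, and the two `hypothetical_*` entries are assumptions added for illustration.

```python
from typing import Dict, List

SECRETION_THRESHOLD = 0.45  # SignalP threshold for secretion quoted in the text
STRINGENT_CUTOFF = 0.15     # stricter cutoff used to shortlist candidates


def classify(score: float) -> str:
    """Assign an illustrative category to a SignalP score (labels are hypothetical)."""
    if score >= SECRETION_THRESHOLD:
        return "classical signal peptide predicted"
    if score < STRINGENT_CUTOFF:
        return "non-canonical candidate"
    return "borderline"


def shortlist(scores: Dict[str, float], cutoff: float = STRINGENT_CUTOFF) -> List[str]:
    """Return identifiers scoring below the stringent cutoff, sorted by name."""
    return sorted(pid for pid, score in scores.items() if score < cutoff)


scores = {
    "FGSG_01095": 0.131,      # score reported in the text
    "FGSG_11411": 0.103,      # score reported in the text
    "hypothetical_A": 0.30,   # invented value
    "hypothetical_B": 0.52,   # invented value
}
candidates = shortlist(scores)
```

In this sketch only proteins already found exclusively in the secreted proteome would be passed in, so a low score marks a protein that is present in the medium without any sign of a classical signal peptide.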

### Mycotoxin Biosynthesis and Protein Secretion

Brown et al. (2012) suggested that there may be a link between symptomless growth in planta and the co-secretion of trichothecenes (including deoxynivalenol, or DON) and virulence proteins. Both their study and this one identified the deoxynivalenol biosynthetic enzyme TRI8 (trichothecene 3-O esterase, FGSG\_03532) as a predicted secreted protein on the basis of its sequence characters. However, we did not identify it in our proteomics survey. This was unsurprising, as deoxynivalenol biosynthesis by F. graminearum requires a low pH and either a permissive nitrogen source, such as a polyamine, or N-starvation. The transcriptional regulator AREA governs genes required for nitrogen metabolism and is also required for full DON biosynthesis in F. graminearum (Hou et al., 2015). The key biosynthetic and regulatory genes, TRI5 and TRI6, respectively, are induced under nitrogen starvation conditions, and suppressed by a preferred nitrogen source, such as ammonia. The culture period of our study was very short and would not have resulted in significant DON induction under permissive conditions. Taylor et al. (2008) specifically targeted mycotoxin induction for their iTRAQ proteomics analysis of cellular proteins, and identified three TRI proteins, FGSG\_03534, FGSG\_03535, and FGSG\_03543. We did not find any of the TRI proteins in our cellular controls or in our secreted protein samples, which may have been due to our early-stage sampling of cultures. We expect that there may have been low levels of TRI proteins in our Minus N sample but they were insufficient for MS/MS detection from culture medium. Deoxynivalenol-permissive conditions would also result in vigorous expression of secreted peptidases, which we observed in the Minus N proteomics samples. It may be possible to use the presence of certain secreted peptidases as a highly sensitive enzymatic reporter of deoxynivalenol risk in cereal grains.

### CONCLUSION

This is the first proteomics study to focus on the peptidases of F. graminearum. Degrading enzymes are considered diverse and redundant, and therefore unlikely targets for control of plant pathogens. However, our characterization of the secreted peptidases of F. graminearum revealed deployment of a greatly reduced peptidase subset of limited diversity. We identified a Fusarium homolog of a peptidase required for hyphal growth of a thermophilic fungus; this homolog presents a target for further study to determine its contribution to the overall fitness of F. graminearum, and a possible control target.

We have brought together public transcriptomics resources and an in vitro secreted proteomics dataset to extend our knowledge of peptidase production of F. graminearum both in planta and in vitro. A focussed peptidase gene expression analysis revealed seven clusters of peptidases with similar expression profiles during in vitro and in planta conditions. Over 2000 proteins were identified, with 890 of those released into culture medium. A high-confidence secretome cohort of 134 proteins was derived that satisfied a three-stage selection process: firstly, presence in the culture medium; secondly, absence from the cellular proteome; and thirdly, satisfaction of three bioinformatics analyses for secretion characters. High-sensitivity mass-spectrometry analysis of in vitro extracts allowed extension of the known proteome of F. graminearum. The majority of the high-confidence secretome had not been reported in previous proteomics studies and includes proteins previously thought to be restricted to in planta growth. We anticipate this dataset will also allow future refinement of the F. graminearum genome annotation, confirming post-transcriptional processing and N-termini of mature proteins.
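The three-stage selection summarized above maps naturally onto set operations. The sketch below is an illustration under stated assumptions, not the pipeline used in the study: the function name, the protein identifiers, and the analysis labels (`analysis_1` to `analysis_3`) are hypothetical placeholders for the three bioinformatics analyses.

```python
from typing import Dict, Set


def high_confidence_secretome(
    medium: Set[str],
    cellular: Set[str],
    predictions: Dict[str, Set[str]],
) -> Set[str]:
    """Apply the three selection stages as set operations.

    Stage 1: present in the culture medium.
    Stage 2: absent from the cellular proteome.
    Stage 3: predicted as secreted by every bioinformatics analysis supplied.
    """
    secreted_only = medium - cellular
    if not predictions:
        return set()
    predicted_by_all = set.intersection(*predictions.values())
    return secreted_only & predicted_by_all


# Invented identifiers for illustration only
medium = {"p1", "p2", "p3", "p4"}
cellular = {"p2"}
predictions = {
    "analysis_1": {"p1", "p3", "p4", "p5"},
    "analysis_2": {"p1", "p3"},
    "analysis_3": {"p1", "p3", "p4"},
}
cohort = high_confidence_secretome(medium, cellular, predictions)
```

Requiring agreement between all analyses makes the cohort conservative, which fits the stated bias of the methodology toward false-negative error when assigning classical secretion.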

## AUTHOR CONTRIBUTIONS

RL, MB, MA drafted the manuscript. OM, CC, PF, RL performed experiments. MB, RL, MA, SM conceived the experiments. All authors edited and reviewed the manuscript.

## FUNDING

This work was supported by the Australian Research Council with Discovery Projects to MA (DP150104386), and SM (DP130100535), plus a Discovery Early Career Researcher Award (DE150101777) to SM, and a La Trobe University Understanding Disease Research Focus Area Grant to MB.

### ACKNOWLEDGMENTS

We thank Ira Cooke for discussions and assistance with the proteomics label-free quantitation methodology.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00962

Table S1 | Peptidase microarray data.

Table S4 | Limma statistical analysis.

Table S5 | Signal peptide predictions.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Lowe, McCorkelle, Bleackley, Collins, Faou, Mathivanan and Anderson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evaluation of Secretion Prediction Highlights Differing Approaches Needed for Oomycete and Fungal Effectors

Jana Sperschneider<sup>1</sup>\*, Angela H. Williams<sup>1,2</sup>, James K. Hane<sup>3,4</sup>, Karam B. Singh<sup>1,2</sup> and Jennifer M. Taylor<sup>5</sup>

#### Edited by:

*Marc-Henri Lebrun, Institut National de la Recherche Agronomique, France*

#### Reviewed by:

*Guus Bakkeren, Agriculture & Agri-Food Canada, Canada Gregor Langen, University of Cologne, Germany Marc-Henri Lebrun, Institut National de la Recherche Agronomique, France*

> \*Correspondence: *Jana Sperschneider jana.sperschneider@csiro.au*

#### Specialty section:

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

Received: *05 June 2015* Accepted: *07 December 2015* Published: *23 December 2015*

#### Citation:

*Sperschneider J, Williams AH, Hane JK, Singh KB and Taylor JM (2015) Evaluation of Secretion Prediction Highlights Differing Approaches Needed for Oomycete and Fungal Effectors. Front. Plant Sci. 6:1168. doi: 10.3389/fpls.2015.01168*

*<sup>1</sup> CSIRO Agriculture Flagship, Centre for Environment and Life Sciences, Perth, WA, Australia, <sup>2</sup> The Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia, <sup>3</sup> Department of Environment and Agriculture, CCDM Bioinformatics, Centre for Crop and Disease Management, Curtin University, Perth, WA, Australia, <sup>4</sup> Curtin Institute for Computation, Curtin University, Perth, WA, Australia, <sup>5</sup> CSIRO Agriculture, Black Mountain Laboratories, Canberra, ACT, Australia*

The steadily increasing number of sequenced fungal and oomycete genomes has enabled detailed studies of how these eukaryotic microbes infect plants and cause devastating losses in food crops. During infection, fungal and oomycete pathogens secrete effector molecules which manipulate host plant cell processes to the pathogen's advantage. Proteinaceous effectors are synthesized intracellularly and must be externalized to interact with host cells. Computational prediction of secreted proteins from genomic sequences is an important technique to narrow down the candidate effector repertoire for subsequent experimental validation. In this study, we benchmark secretion prediction tools on experimentally validated fungal and oomycete effectors. We observe that for a set of fungal SwissProt protein sequences, SignalP 4 and the neural network predictors of SignalP 3 (*D*-score) and SignalP 2 perform best. For effector prediction in particular, the use of a sensitive method can be desirable to obtain the most complete candidate effector set. We show that the neural network predictors of SignalP 2 and 3, as well as TargetP were the most sensitive tools for fungal effector secretion prediction, whereas the hidden Markov model predictors of SignalP 2 and 3 were the most sensitive tools for oomycete effectors. Thus, previous versions of SignalP retain value for oomycete effector prediction, as the current version, SignalP 4, was unable to reliably predict the signal peptide of the oomycete Crinkler effectors in the test set. Our assessment of subcellular localization predictors shows that cytoplasmic effectors are often predicted as not extracellular. This limits the reliability of secretion predictions that depend on these tools. We present our assessment with a view to informing future pathogenomics studies and suggest revised pipelines for secretion prediction to obtain optimal effector predictions in fungi and oomycetes.

Keywords: signal peptide prediction, effectors, protein secretion, fungi, oomycetes, plant pathogens

### INTRODUCTION

The growing number of sequenced fungal and oomycete plant pathogen genomes has enabled detailed reverse genetics studies into molecular pathogen-host interactions (Dean et al., 2012; Kamoun et al., 2014). Though fungi and oomycetes belong to phylogenetically distinct microbial taxa, they both use a diverse class of molecules, termed effectors, to promote pathogenicity through subversion of host defenses or impairment of normal host-cell function (Dodds and Rathjen, 2010; Lo Presti et al., 2015). Effector molecules may be the products of either secondary metabolite or protein synthesis, however the majority of effectors identified in fungi and oomycetes are the latter. Proteinaceous effectors are initially synthesized intracellularly and require relocation to the extracellular space (apoplastic effectors) or subsequent import into the host cell cytoplasm or specific organelles (cytoplasmic effectors). The classical endoplasmic reticulum (ER)/Golgi-dependent secretion pathway in eukaryotes is well-defined and involves recognition of an N-terminal signal peptide that is cleaved off as the protein is translocated across the membrane (Von Heijne, 1990). Classical signal peptides can be predicted computationally with high accuracy (Menne et al., 2000; Klee and Ellis, 2005; Choo et al., 2009; Min, 2010; Melhem et al., 2013), and the majority of experimentally verified fungal and oomycete effectors are predicted to be secreted in this manner. However, reports are emerging for yet unknown, non-classical secretion pathways to also play a role in fungal and oomycete effector externalization (Ridout et al., 2006; Liu et al., 2014). Numerous eukaryotic plant pathogen effectors have been found to be active inside the host cell cytoplasm; however the knowledge of how effectors are delivered into the plant cells after secretion is fragmentary. 
In oomycete effectors, conserved amino acid motifs such as RXLR, CHXC, or LFLAK are positioned in N-terminal domains and define oomycete effector superfamilies (Petre and Kamoun, 2014). Although mechanisms have been proposed as to how the RXLR motif may facilitate cell entry through the host cell membrane phospholipid bilayer, the results are still controversial (Tyler et al., 2013; Wawra et al., 2013). Conserved sequence motifs associated with translocation have thus far not been found for fungal effector proteins, which makes their computational prediction from secretomes challenging (Sperschneider et al., 2015). A conserved Y/F/WxC-motif has been identified in the N-terminus of effector candidates in the barley powdery mildew fungus (Godfrey et al., 2010), however the role of this motif in cell entry or pathogenicity remains undetermined.

Several studies have exploited proteomics to experimentally identify secreted proteins involved in pathogenicity. For example, an early proteomics study of extracellular proteins of the wheat-infecting fungus Fusarium graminearum identified 120 candidates secreted in planta, of which only 56% possessed a predicted signal peptide motif (Paper et al., 2007). A later study in the same species identified only 69 secreted proteins, following growth in barley or wheat flour-based liquid cultures to mimic host-pathogen interactions (Yang et al., 2012). Of these, 70% possessed a predicted signal peptide. A recent study in the oomycete potato pathogen Phytophthora infestans predicted 80% of its extracellular proteome to contain a signal peptide (Meijer et al., 2014). Thus, there appears to be wide variability (both between species and experiments) in the number of extracellular proteins identified through experimental proteomics that are also predicted to be secreted in silico. A high percentage of proteins lacking a classical signal peptide may be due to contamination of extracellular samples with intracellular proteins, due to rupture of the fungal cells during the protein extraction procedure. Furthermore, protein extraction may be complicated in species where there is a low or variable pathogen biomass relative to the host, or that selectively secrete different proteins when grown in different in vitro or infection-mimicking cultures. Computational limitations also have the potential to complicate proteomics experiments. This may come from variability between species in their use of non-classical secretion mechanisms, which cannot yet be accurately predicted. Gene annotation is also an important determining factor for the reliability of both experimental proteomics and computational prediction of secretion. 
Proteomic identification of genes is dependent on the completeness and accuracy of translated gene annotations that are used to generate a searchable database of predicted trypsin-digested proteins, to which peptide mass spectra are matched (Bringans et al., 2009). Thus, missing or incorrect gene annotations may exclude or confuse identification of extracellular proteins. Prediction of secretion also relies strongly on the presence and accurate annotation of the 5′ exons of genes, which encode N-terminal signal peptides. Due to these technical difficulties, deriving accurate computational predictions of secreted proteins from whole genome sequences remains an important pursuit in plant pathology, with a view toward efficient identification of secreted proteins for subsequent effector prediction.

The apparent ease of secretion prediction has led to its common use in pathogenomic studies as a first pass filter in narrowing down a whole proteome dataset into a short-list of potential effector candidates (Kämper et al., 2006; Raffaele et al., 2010; Rouxel et al., 2011; Hane et al., 2014; Nemri et al., 2014). A variety of software tools exists for eukaryotes that can predict whether proteins are secreted into the extracellular environment (Emanuelsson et al., 2007). Typically, this involves recognition of the N-terminal secretory signal peptide motif that directs proteins through the classical ER/Golgi-dependent pathway using tools such as SignalP (Petersen et al., 2011). Whilst this is a robust approach for defining a set of potential effector candidates, typically far more candidates are predicted for experimental validation than is feasible. Furthermore, proteins that are predicted to be secreted via a classical pathway might be retained in the ER/Golgi or fulfill roles as part of the cell wall. Therefore, subcellular localization prediction is an important tool that can point toward the functional role or interaction partners of a protein based on its amino acid sequence and can be used to assess if a protein is indeed secreted into the extracellular space (Emanuelsson, 2002). Transmembrane proteins are also commonly predicted and removed from the secretome as these are likely to fulfill functions in the pathogen cell wall. Whilst in silico methods for secretome prediction are under active development and show robust performance, their reported predictive accuracy strongly depends on the selection of the test set and independent benchmarking studies are important for an unbiased tool evaluation. For example, a comprehensive benchmark of secretion prediction tools found that predictive accuracy was in many cases lower than those initially reported by the developers (Klee and Ellis, 2005). 
Although an evaluation on a large test set covering a wide taxonomic spectrum gives a good indication of a tool's performance, it provides limited insight into its expected performance on a specialized set of proteins, such as effector proteins of fungal and oomycete pathogens.

This study set out to reveal the strengths and weaknesses of existing protein secretion and subcellular localization prediction methods, as applied to the identification of effector proteins produced by fungi or oomycete plant pathogens. Prediction pipelines that have been used in previous studies for defining secretomes and subsequently effector candidates of eukaryotic plant pathogens are diverse and highly parameterized, as exemplified in **Table 1**. For example, SignalP (Nielsen et al., 1997; Nielsen and Krogh, 1998; Bendtsen et al., 2004b; Petersen et al., 2011) or Phobius (Käll et al., 2004) are utilized by the majority of pipelines to extract proteins that are likely to be secreted via a classical pathway. Despite the availability of the latest version of SignalP 4, which was designed to discriminate between signal peptides and N-terminal transmembrane (TM) regions, previous versions (2 and 3) are still frequently used due to their increased sensitivity. Phobius was designed to predict secretion and N-terminal TM domains separately, predicting both the presence of a signal peptide and the number and location of TM helices.

Furthermore, there are also discrepancies in how tools are used and how thresholds for secretion are set (**Table 1**). For example, some studies have used the neural network scores from SignalP 2 and 3 with custom thresholds, whereas others rely on the hidden Markov model probability for predicting the presence of a signal peptide. SignalP 2 and 3 employ predictions from both a neural network (SignalP-NN) and a hidden Markov model (SignalP-HMM), whilst the latest version SignalP 4 is purely based on neural networks. SignalP 2 returns three neural network scores for each position in the sequence: a raw cleavage site score (C-score), the signal peptide score (S-score), and the combined cleavage site score (Y-score). For each sequence, it reports the maximal C-, S-, and Y-scores as well as the mean S-score between the N-terminus and the predicted cleavage site, which is used to assess whether a sequence contains a signal peptide. Furthermore, it returns two hidden Markov model scores, the C-score as well as the probability that the sequence contains a signal peptide (S-probability). SignalP 3 replaces the previously used mean S-score for classification with the D-score, which is calculated as the average of the mean S-score and the maximal Y-score. It still uses both neural network scores and calculates the signal peptide probability with a hidden Markov model. SignalP 4 is a neural network based method designed to discriminate between signal peptides and transmembrane regions. Prediction of signal peptides is based entirely on the D-score. For all scores, Boolean flags are provided which are either "Y" for a signal peptide or "N" for no signal peptide.
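The D-score definition given above for SignalP 3 (the average of the mean S-score and the maximal Y-score) can be written out as a small worked example. This Python sketch is illustrative only and does not reproduce SignalP itself: the function names and example scores are hypothetical, the cutoff must be supplied by the caller because the appropriate value depends on the SignalP version and settings, and treating a score equal to the cutoff as "Y" is an assumption.

```python
from statistics import mean
from typing import Sequence


def d_score(s_scores: Sequence[float], cleavage_index: int, y_max: float) -> float:
    """Average of the mean S-score and the maximal Y-score.

    s_scores holds per-position S-scores; the mean is taken from the
    N-terminus up to (not including) cleavage_index.
    """
    if cleavage_index < 1 or cleavage_index > len(s_scores):
        raise ValueError("cleavage_index must lie within the sequence")
    mean_s = mean(s_scores[:cleavage_index])
    return (mean_s + y_max) / 2


def flag(score: float, cutoff: float) -> str:
    """Return 'Y' for a predicted signal peptide and 'N' otherwise."""
    return "Y" if score >= cutoff else "N"


# Invented per-position S-scores with a predicted cleavage site after position 3
example = d_score([0.9, 0.8, 0.7, 0.1, 0.1], cleavage_index=3, y_max=0.6)
```

Here the mean S-score over the first three positions is 0.8, so the D-score is (0.8 + 0.6) / 2 = 0.7, which would be flagged "Y" at any cutoff of 0.7 or below.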

Subcellular localization tools such as TargetP (Emanuelsson et al., 2000), WoLF PSORT (Horton et al., 2007), or ProtComp are frequently used to complement the predictions made by SignalP or Phobius, either through a union or intersection of predictions made by these methods (**Table 1**). This can serve to filter proteins that may be predicted to contain a signal peptide, yet that might not be fully secreted into the extracellular space due to being retained within the ER/Golgi. TargetP predicts if a protein is secreted or localized to the mitochondria, chloroplast, or another unknown location. It reports reliability class scores from 1 to 5, where 1 corresponds to the strongest prediction. Another tool WoLF PSORT, an updated version of PSORT II, has been trained separately on fungi, animal, and plant data. It reports predicted subcellular locations (nuclear, mitochondria, cytosol, cytoskeleton, endoplasmic reticulum, plasma membrane, extracellular, chloroplast, peroxisome, Golgi apparatus, lysosome, and vacuolar membrane) in terms of respective scores based on a weighted k-nearest neighbor classifier. The output format is similar to a sequence similarity search, with scores assigned for each predicted localization site based on the number of nearest neighbors to the query protein. In most studies that employ WoLF PSORT, proteins have been predicted as secreted where extracellular predictions score higher than other locations (**Table 1**). Less commonly, the prediction of non-classically secreted proteins has been reported using SecretomeP, which has been trained on a very small set of verified non-classically secreted proteins derived from mammalian and bacterial sequences (Bendtsen et al., 2004a). Consequently, the relevance of SecretomeP to fungal and oomycete proteins is questionable. Finally, ProtComp is a web-server based tool combining several methods for protein localization, ranging from neural networks to sequence homology searches. 
Its lack of a publicly distributed version for local installation precludes it from routine use for whole-genome analysis. Predicted transmembrane proteins are typically removed from the set of predicted extracellular proteins using programs such as TMHMM (Krogh et al., 2001) or Phobius. However, most pipelines allow for the presence of one transmembrane domain in the N-terminus, as this can correspond to the signal peptide as both are predicted based on the presence of hydrophobic residues. Additionally, TargetP is often employed to eliminate proteins predicted to be targeted to mitochondria or chloroplasts. In some fungal studies, predicted GPI-anchored proteins are also removed from the set of secreted effector candidates.

The diversity of prediction pipelines shown in **Table 1** illustrates an overall lack of consensus on how to predict extracellular pathogen proteins, in particular for effector candidates, and presents difficulties when comparing secretome sizes across different species. Herein we benchmark the performance of individual secretion prediction tools on experimentally verified fungal and oomycete effectors and use the best-performing tools to predict extracellular proteins across fungal and oomycete pathogens. In particular, we show that for cytoplasmic effector proteins that are first secreted into the extracellular space and subsequently translocated to the host cell, protein subcellular localization predictors suffer from poor accuracy. We highlight differences in performance for secretion prediction between fungal effectors and oomycete effectors and conclude by providing practical recommendations for the computational secretion prediction for effector candidate mining from eukaryotic pathogen genomes.

#### TABLE 1 | Examples for approaches used in eukaryotic plant pathogen genomic studies that predict secreted proteins.

*If provided in the original paper, version numbers of prediction tools are given.*

### MATERIALS AND METHODS

Various datasets were chosen for the purpose of comparing the performance of secretion prediction software tools, in the context of plant pathogenomics. Experimentally validated fungal and oomycete effector protein sequences were collected from PHIbase version 3.6 (Urban et al., 2015) and from manual literature searches (Supplementary Data Sheet 1, 2, Supplementary Table 1). For further benchmarking, representative datasets for both extracellular and intracellular proteins of the fungi were obtained by searching SwissProt database records created between 2011 and 2015 for: (1) fungal proteins that have been manually annotated as secreted (taxonomy:"Fungi [4751]" locations:(location:"Secreted [SL-0243]" evidence:manual) created:[20110101 TO 20150101]) (Supplementary Data Sheet 3); and (2) fungal proteins that have been manually annotated as localized to the nucleus (taxonomy:"Fungi [4751]" locations:(location:"Nucleus [SL-0191]" evidence:manual) created:[20110101 TO 20150101]) (Supplementary Data Sheet 4). Sequences that did not start with "M" or were shorter than 30 aas were removed. Both sets only cover proteins for which entries were created after 2011, to avoid an overlap with the training sets used for secretion prediction tools. We could not extract an equivalent set for oomycete proteins from SwissProt due to the very low number of entries for manually curated secreted proteins (four entries). Secretion prediction tools were run on a local machine, or using web servers where indicated, as in **Table 2** (Results given in Supplementary Data Sheet 5). Sensitivity was calculated as TP/(TP + FN) and specificity as TN/(TN + FP), where TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives. The Matthews correlation coefficient (MCC) was calculated as

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.$$
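The three performance measures defined above can be computed from a confusion matrix as in the following sketch. The counts in the example are hypothetical and are not results from this study; returning 0.0 when the MCC denominator is zero is a common convention adopted here as an assumption.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP)."""
    return tn / (tn + fp)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 if the denominator is zero (assumed convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only.
tp, fn, tn, fp = 90, 10, 95, 5
print(sensitivity(tp, fn), specificity(tn, fp), round(mcc(tp, tn, fp, fn), 3))
```

Unlike sensitivity and specificity, the MCC uses all four cells of the confusion matrix, which makes it a useful single summary when the positive and negative sets differ in size, as they do for the SwissProt sets used here.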


TABLE 2 | Software tested in this study and the parameters under which proteins were predicted to be secreted.

*All tools were run with default parameters and settings.*

\**Run using web-server.*

### RESULTS AND DISCUSSION

### SignalP 2, 3 and 4 Show the Best Performance for Secretion Prediction on a Set of Fungal Protein Sequences

Several independent benchmark analyses have been published that compare the accuracy of secretion prediction tools. For example, Klee and Ellis (2005) evaluated a range of secretion prediction methods (SignalP 3.0, SignalP 2.0, TargetP 1.01, PrediSi, Phobius, and ProtComp 6.0) on 372 proteins from five vertebrate organisms and found that TargetP, the SignalP 3 maximum S-score and SignalP 3 D-score were the most accurate single scores. Choo et al. (2009) found that most of the tested tools were capable of reliably distinguishing secreted from non-secreted proteins, as indicated by the high specificities that were achieved. SignalP 4 has been reported by the authors to outperform previous versions of SignalP for a test set spanning eukaryotic and bacterial sequences (Petersen et al., 2011).

Min (2010) evaluated eukaryotic secretion prediction using Phobius, SignalP 3.0, TargetP, and WoLF PSORT individually and in combination with TMHMM and PS-Scan and found that for fungi the most reliable individual predictor of secretion was WoLF PSORT, but a combination of tools produced the most accurate predictions. A follow-up study including SignalP 4.0 reported WoLF PSORT as the best individual tool for fungal data and also made the general recommendation of using SignalP 4.0 over SignalP 3.0 (Melhem et al., 2013). However, the authors assign a protein as predicted to be secreted by WoLF PSORT if it features "extracellular" in the ranked localization list whereas other studies (**Table 1**), including ours, have used this tool quite differently requiring more stringently that the "extracellular" score is higher than that of all other sub-cellular locations. Notably, WoLF PSORT stands out amongst the tools compared in that it has been trained on a relatively extensive set of fungal proteins. However, while it performs well for fungal secreted proteins overall, when restricted to known secreted effectors its performance is markedly poorer.
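The two ways of interpreting WoLF PSORT output that are contrasted above can be sketched as follows. The sketch assumes that the per-location scores for one protein have already been parsed into a dictionary; the key name `"extracellular"`, the function names, and the example scores are hypothetical and do not reproduce the tool's actual output format.

```python
from typing import Dict


def secreted_lenient(scores: Dict[str, float], key: str = "extracellular") -> bool:
    """Lenient rule: 'extracellular' merely appears in the ranked localization list."""
    return key in scores


def secreted_strict(scores: Dict[str, float], key: str = "extracellular") -> bool:
    """Stringent rule: the 'extracellular' score is higher than that of every other location."""
    if key not in scores:
        return False
    return all(scores[key] > value for loc, value in scores.items() if loc != key)


# Hypothetical scores for one protein.
protein = {"nuclear": 14.0, "extracellular": 8.0, "cytosol": 5.0}
print(secreted_lenient(protein), secreted_strict(protein))
```

A protein such as the hypothetical one above would be counted as secreted under the lenient rule but not under the stringent one, which is why benchmark results obtained under the two rules are not directly comparable.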

For the evaluation of secretion prediction performance we utilized two data sets from the SwissProt database: one that contained fungal proteins that were manually annotated as secreted (409 proteins) and the other that contained non-secreted fungal proteins that were manually annotated as nuclear (1113 proteins). We could not extract an equivalent set for oomycete proteins from SwissProt due to the very low number of entries for manually curated secreted or nuclear proteins. All tools tested achieved high specificity in the range of 97.2–99.8%, whereas sensitivity varied more dramatically (**Table 3**). All versions of SignalP, Phobius, and TargetP achieved high sensitivity of more than 94.9%. In contrast, the extracellular predictions of WoLF PSORT and ProtComp showed lower sensitivity, at 88 and 63.3%, respectively. In terms of the Matthews correlation coefficient (MCC), SignalP 4, SignalP-NN 3 (D-score), and SignalP-NN 2 perform best (MCC = 0.96), with SignalP 2 and 3 showing slightly higher sensitivity than SignalP 4, which in turn achieves marginally higher specificity. These results confirm the strong predictive performance of SignalP for secreted fungal proteins.

### Differences in Sensitivity of Secretion Prediction Tools for Effectors from Fungi and Oomycetes

In line with previous studies (Menne et al., 2000; Klee and Ellis, 2005; Choo et al., 2009; Min, 2010), we found that all tools tested achieved high specificity in secretion prediction. For effector prediction in particular, the use of a sensitive method can be desirable to obtain the most complete candidate effector set. To test the sensitivity of secretion prediction tools for effector proteins from eukaryotic plant pathogens, we collected two sets of experimentally verified fungal and oomycete effectors from the literature. In total, the test sets of fungal and oomycete effectors contain 69 and 53 proteins, respectively (Supplementary Table 1). Interestingly, the sensitivity of secretion prediction tools varied between the fungal and oomycete effector sets (**Figure 1**). The neural network predictors of SignalP 3 and SignalP 2 (SignalP-NN 2, SignalP-NN 3) as well as TargetP ("S" for secreted with RC scores ranging from 1 to 5) were found to be the most sensitive for fungal effectors (95.7%). In contrast, the hidden Markov model predictors of SignalP 2 and SignalP 3 (SignalP-HMM 2, SignalP-HMM 3) achieved highest sensitivity for oomycete effectors (98.1%). In general, neural networks and hidden Markov models have different strengths in pattern recognition tasks. Whereas neural networks are powerful for correlating features over a longer range, hidden Markov models are advantageous for modeling sequential regions or patterns found in signal peptides (Nielsen et al., 1999). How this could relate to the prediction of signal peptides in fungal and oomycete effectors remains to be determined.

TABLE 3 | Performance of secretion prediction tools applied to secreted fungal proteins sourced from SwissProt.

*Sensitivity, specificity and the Matthews correlation coefficient (MCC) are shown for evaluating the performance of secretion prediction tools. All tools were run with the settings and parameters given in* Table 2*. The best performance in terms of MCC is marked in bold.*

From the fungal effector set, all secretion predictors, including the best-performing tools SignalP-NN 2, SignalP-NN 3, and TargetP, were consistently unable to predict a signal peptide for only three effectors: Avra10, Avrk1, and Vdlsc1 (**Table 4**). Similarly for the oomycete effector set, all secretion predictors including the best-performing tools SignalP-HMM 2 and SignalP-HMM 3 were unable to predict a signal peptide for only a single oomycete effector (Pslsc1). These four effector proteins have been demonstrated to be secreted via non-classical pathways (Ridout et al., 2006; Liu et al., 2014). This suggests that the most sensitive methods are only likely to fail to predict the secretion of non-classically secreted effectors and that using a union of multiple methods would not necessarily improve sensitivity for this test set. At this stage the computational identification of non-classically secreted effectors remains challenging and these types of effectors require experimental validation of their secretion. In the future, an increased understanding of non-classical secretion mechanisms of fungal and oomycete effectors might lead to improved computational prediction of these effectors. Protein tribe clustering with subsequent examination of high-priority effector candidate families (Saunders et al., 2012) or the presence of conserved protein domains has been effectively applied to identify related effector candidates lacking a predicted signal peptide. However, as the vast majority of fungal effectors share little sequence homology, the utility of this method is limited. Furthermore, orthologs of a secreted protein are not necessarily also secreted (Poppe et al., 2015). Therefore, secretomes predicted through the additional use of reciprocal BLASTs and/or tribe analysis are likely to include a high number of false positives.

TargetP predicted signal peptides with the highest reliability class (RC = 1) for only 63.8% of fungal effectors and for 56.6% of oomycete effectors (**Figure 2**). Without a restriction on the reliability class (RC from 1 to 5), TargetP predicted "secreted" as the localization for 95.6% of the fungal effectors (three effectors were predicted as "unknown"), whereas it returned "secreted" for 92.4% of the oomycete effectors (two effectors were predicted as "unknown" and two were predicted as "mitochondrial"). Therefore, a restriction on the predicted reliability class should not be used for predicting the secretion of effectors and the exclusion of proteins predicted to be localized to mitochondria has to be used with caution for oomycete effectors.

The relatively poor performance of SignalP 4 for oomycete effectors (**Figure 1**, sensitivity 83%) is surprising and suggests that previous versions of SignalP (SignalP 2, SignalP 3) should be used for effector mining in oomycete genomes instead. In particular, SignalP 4 does not predict a signal peptide for six out of seven Crinkler effectors in the test set (**Table 4**; CRN1, CRN2, CRN8, CRN15, CRN16, CRN63, CRN115). Crinkler effectors are a large family of modular proteins that are translocated into host cells, featuring a signal peptide followed by a LXLFLAK sequence motif and C-terminal domains (Haas et al., 2009; Schornack et al., 2010). On the set of seven Crinkler effectors, SignalP 4 achieves the lowest sensitivity, whereas the hidden Markov model predictors of SignalP 2 and SignalP 3 (SignalP-HMM 2, SignalP-HMM 3) correctly predict the signal peptide in all seven Crinklers (**Table 4**). This exemplifies the substantial benefits of using previous versions of SignalP (SignalP 2, SignalP 3) for oomycete effector mining.

Signal peptide prediction tools such as SignalP return the set of proteins that are likely to carry a signal peptide for the classical pathway, but do not necessarily imply that a protein will be extracellular. Many proteins with a signal peptide are retained in various cellular compartments and thus, signal peptide prediction is often combined with additional evidence for extracellular protein secretion, such as the absence of transmembrane domains, GPI anchors or retention signals (**Table 1**). We found that no transmembrane regions outside the signal peptide region (first 60 aas) were predicted for any of the 69 fungal effectors using TMHMM or Phobius. For the 53 oomycete effectors, TMHMM and Phobius both return one transmembrane helix outside the signal peptide region for the RXLR effector PITG\_03192. This might be an indication that TMHMM and Phobius can be used as a preliminary filter to exclude proteins with multiple, non-N-terminal transmembrane domains for effector mining in fungi. However, these tools should be used with less stringent requirements for effector prediction in oomycetes.

prediction sensitivity are shown for the set of secreted fungal proteins taken from SwissProt as well as the sets of experimentally verified fungal and oomycete effectors.

TABLE 4 | Fungal and oomycete effectors that were not predicted to be secreted by the prediction tools tested.

### Subcellular Localization Prediction Tools Should not be used for Predicting Effector Secretion

Prediction of subcellular localization is important for inferring hints about a protein's function. In eukaryotes, a number of compartments exist to which proteins may be localized, e.g., the extracellular space, mitochondria, chloroplast, nucleus, peroxisome, cytosol or plasma membrane. Several plant pathogenomics studies have used the subcellular localization of "extracellular" as a criterion for predicting secretion, commonly using WoLF PSORT which has been trained separately on fungi, animal and plant data. However, we found that applying WoLF PSORT (fungi) to the sets of experimentally verified fungal and oomycete effectors returned 25 cytoplasmic effectors that are not predicted to be extracellular (34.2% of cytoplasmic effectors, **Figure 3**). This could be explained as follows. First, the estimated sensitivity and specificity of WoLF PSORT is fairly low at around 70% (Horton et al., 2007), which might lead to a high number of false predictions. However, we found that false predictions occurred in particular for non-apoplastic effectors (**Figure 3**). It is possible that WoLF PSORT may have predicted a signal for host cell localization in effectors rather than for the extracellular secretion of the effector from the pathogen cell. Thus, WoLF PSORT should be used with caution when predicting secretomes and its "extracellular" predictions should not be solely relied upon for effector prediction. An alternative approach is to impose a high level of stringency to WoLF PSORT predictions, as was the case for the F. graminearum secretome in which proteins were reported as secreted if the extracellular score was >17 (Brown et al., 2012). Whilst this practice is likely to drastically reduce the number of false positives in the secretome, it is prone to miss bona fide effectors that are not predicted to be extracellularly localized. In this study, of the oomycete Crinkler effectors CRN1, CRN2, CRN8, CRN15 and CRN16 which are known to localize to the host cell nucleus (Schornack et al., 2010), WoLF PSORT only predicted a nuclear localization for CRN16. Therefore, the predictions of subcellular localization tools may need to be used with caution in effector prediction studies.

### Practical Recommendations for Prediction of Extracellular Proteins in Fungi and Oomycetes

In this study, we have assessed the performance of various secretion and subcellular localization prediction tools, when applied to datasets derived from known fungal and oomycete effectors, as well as extracellular and intracellular fungal proteins. Based on our benchmarking, we deduce recommendations for extracellular protein prediction in fungal and oomycete pathogen genomes.

We observe that previous versions of SignalP (2, 3) demonstrate increased sensitivity over the latest version (4.1) for predicting signal peptides of oomycete effectors, with the HMM-based methods outperforming the NN-based methods. Indeed, this has formed the basis for the pipeline PexFinder (Phytophthora Extracellular Proteins Finder), which automates identification of oomycete extracellular proteins from EST data (Torto et al., 2003). PexFinder uses SignalP 2.0 but applies an additional logical filter that predicts a protein to be secreted only if both the hidden Markov model predicts a signal peptide and the neural network predicts a cleavage site between amino acids 10 and 40. Whilst this pipeline was proposed over a decade ago, it still retains its value for mining effectors from oomycete genomes.
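The logical filter attributed to PexFinder above can be written as a short predicate. This is a sketch of the rule as stated in the text, not the PexFinder implementation: the function name and argument names are hypothetical, the inputs are assumed to have been parsed from SignalP 2.0 output beforehand, and treating the bounds 10 and 40 as inclusive is an assumption.

```python
from typing import Optional


def pexfinder_like_filter(hmm_predicts_signal_peptide: bool,
                          nn_cleavage_site: Optional[int]) -> bool:
    """Secreted only if the HMM predicts a signal peptide AND the neural
    network places the cleavage site between amino acids 10 and 40
    (bounds treated as inclusive here, which is an assumption)."""
    if not hmm_predicts_signal_peptide or nn_cleavage_site is None:
        return False
    return 10 <= nn_cleavage_site <= 40


# Hypothetical predictions for three proteins.
print(pexfinder_like_filter(True, 21))   # both criteria met
print(pexfinder_like_filter(True, 55))   # cleavage site too far downstream
print(pexfinder_like_filter(False, 21))  # no HMM signal peptide
```

Combining the two predictors by intersection trades some sensitivity for specificity compared with using SignalP-HMM alone.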

In contrast with oomycete effectors, the NN predictors of SignalP 2 and 3, as well as TargetP, were observed to be the most sensitive for predicting signal peptides of fungal effector proteins. Unlike oomycete effectors, no TM domains were predicted outside the N-terminal signal peptide region using TMHMM or Phobius. Therefore, we propose that for fungal effector mining the requirement of a predicted signal peptide using either SignalP-NN 2 or 3, a TargetP localization prediction of "secreted" or "unknown" (with no restriction on the RC score) and a lack of transmembrane domains outside the signal peptide region (TMHMM/Phobius) would be a robust method. Applying this proposed pipeline to publicly available fungal genomes (some with secretome predictions given in **Table 1**) highlights the wide variability in the number of predicted secreted proteins produced by the different techniques used in previously published studies (**Figure 4**). In line with previous reports, we observe a higher percentage of proteins that are predicted to be secreted in pathogens with a biotrophic phase, compared to necrotrophs and saprophytes (Lowe and Howlett, 2012; Lo Presti et al., 2015). By our method, similar numbers of secreted proteins were predicted across multiple species of the same trophic class, whereas reported numbers were highly variable in genome survey publications for these species (**Figure 4**).
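The three criteria proposed above for fungal effector mining can be combined as in the following sketch. It assumes that the outputs of SignalP-NN, TargetP and TMHMM (or Phobius) have already been parsed into a simple record; the class, field and function names are hypothetical, and the 60-residue boundary for the signal peptide region follows the definition used in this study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolPredictions:
    """Parsed per-protein predictions (hypothetical structure for illustration)."""
    signalp_nn_signal_peptide: bool        # SignalP-NN 2 or 3 call
    targetp_location: str                  # e.g. "secreted", "unknown", "mitochondrial"
    tm_helix_starts: List[int] = field(default_factory=list)  # start positions (1-based)


def is_predicted_secreted(p: ToolPredictions, sp_region_end: int = 60) -> bool:
    """Apply the three criteria; no restriction is placed on the TargetP RC score."""
    if not p.signalp_nn_signal_peptide:
        return False
    if p.targetp_location not in {"secreted", "unknown"}:
        return False
    # Reject proteins with a transmembrane helix starting outside the signal peptide region.
    return all(start <= sp_region_end for start in p.tm_helix_starts)


# Hypothetical proteins.
candidates = {
    "protA": ToolPredictions(True, "secreted", [5]),
    "protB": ToolPredictions(True, "unknown", [5, 210]),
    "protC": ToolPredictions(True, "mitochondrial"),
}
print([name for name, p in candidates.items() if is_predicted_secreted(p)])
```

A helix that starts within the first 60 residues is tolerated because it may correspond to the hydrophobic core of the signal peptide itself.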

## CONCLUSION

Prediction of effector proteins is of vital importance to the field of plant pathology, and relies heavily on the strengths or weaknesses of secretion prediction software. In this study, we assess the performance of popular software tools against known effectors of both the fungi and oomycetes and offer recommendations on which may be better suited to specialized applications. However, such performance evaluations inevitably vary based on the test data sets used, and therefore, we advise readers to carefully consider the suitability of these recommendations to their own data. Based on the results discussed herein, we recommend the use of the neural network predictors of SignalP 2 or 3, a TargetP localization prediction of "secreted" as well as transmembrane protein removal using either TMHMM or Phobius as a robust choice for predicting the secretion of fungal effectors. In comparison, the hidden Markov model predictors of SignalP 2 and 3 perform best for predicting the signal peptide of oomycete effectors and automated pipelines such as PexFinder retain their value (Torto et al., 2003). However, the secretome includes many proteins unrelated to pathogenicity, and a number of additional conditions must be subsequently assessed in order to arrive at a subset that represents a potential set of effectors. In oomycetes, this can be achieved using motif enrichment analysis based on RXLR or Crinkler effector families (Petre and Kamoun, 2014), whereas in fungi this process is not feasible and alternative criteria such as small size, an enrichment in cysteines, genomic location, or signatures of diversifying selection can be used (Sperschneider et al., 2015).

While the reliability of secretion prediction is highly relevant to effector prediction, one must not overlook the potential for errors to arise from prior steps involved in the generation of sequence resources. The annotation of gene structure in effector genes can be particularly error-prone because of idiosyncrasies of their genomic context, for example their tendency to be associated with repetitive regions of the genome (Raffaele and Kamoun, 2012). There is potential for errors to occur in the assembled genome sequence, especially for those assembled from short-read data only, and the subsequent use of automated annotation pipelines can contribute to inaccurate or fragmented gene predictions. If this occurs in the 5′ region it can lead to misprediction of N-terminal signal peptides. We also note that, due to high gene density in fungi, transcript UTRs of adjacent gene loci frequently overlap (Guida et al., 2011; Wang et al., 2014), potentially resulting in gene annotations that are merged products of two or more adjacent loci. Therefore, the use of RNA-seq-based annotation methods specifically designed for fungi (Reid et al., 2014; Testa et al., 2015) can be beneficial to arrive at an optimal set of gene annotations for subsequent secretion prediction.

Our results showed that one of the areas that is currently suffering from poor accuracy is the prediction of subcellular localization for effector proteins that are first secreted from the fungus and then targeted to a host organelle. In particular, we recommend that the requirement of extracellular localization as predicted by WoLF PSORT should not be used for effector mining in secretomes. Re-training subcellular localization tools with updated data sets including experimentally validated effectors might help to improve accuracy. There are few well-studied fungal effectors with confirmed host-localization, one being the SP7 effector of the arbuscular mycorrhiza Glomus intraradices (Kloppholz et al., 2011). SP7 is initially secreted to the apoplast, then imported into the host cell, and then into its nucleus. This localization is determined by multiple motifs, including a signal peptide, nuclear localization domain and an array of imperfect tandem hydrophilic repeats possibly involved in membrane integration. Both TargetP and WoLF PSORT predicted that the complete version of SP7 was secreted; however, after removal of the signal peptide based on SignalP analysis, the TargetP prediction changed to "other" and WoLF PSORT (plant mode) predicted nuclear localization. Intriguingly, this suggests that subcellular localization prediction has the potential to become a powerful tool for providing insight into potential modes of action for candidate effectors based on their organelle targets. Additionally, there are currently no tools designed to predict proteins secreted in a non-classical manner that have been specifically trained on either fungal or oomycete sequences due to a lack of training data. Although tools like SecretomeP are able to predict some cases (Liu et al., 2014), in the future refined tools for non-classical secretion prediction could be a source of significant improvements in effector prediction.

provided in the literature, previously estimated secretome sizes are indicated with a vertical bar, as given in Table 1. We used the following pipeline for secretome prediction in fungi: SignalP 3.0 *D*-score, a TargetP "secreted" or unknown localization (no restriction on RC score) and no predicted transmembrane domains starting outside the first 60 aas using TMHMM. Genome and secretome size references are given in Table 1, additional genomes used are as follows: *Blumeria graminis* f. sp. *tritici* (Wicker et al., 2013); *Leptosphaeria maculans* (Rouxel et al., 2011); *Magnaporthe oryzae* (Dean et al., 2005); *Botrytis cinerea* (Amselem et al., 2011); *Parastagonospora nodorum* (Hane et al., 2007); *Auricularia subglabra*, *Dichomitus squalens*, *Fomitiporia mediterranea*, *Punctularia strigosozonata*, *Stereum hirsutum*, *Trametes versicolor*, *Coniophora puteana*, *Dacryopinax sp*., *Fomitopsis pinicola*, *Gloeophyllum trabeum*, *Tremella mesenterica*, *Wolfiporia cocos* (Floudas et al., 2012); *Laccaria bicolor* (Martin et al., 2008); *Agaricus bisporus* (Morin et al., 2012); *Aspergillus niger* (Andersen et al., 2011); *Aspergillus oryzae* (Machida et al., 2005); *Coprinus cinereus* (Stajich et al., 2010); *Alternaria brassicicola*, *Cochliobolus heterostrophus*, *Hysterium pulicare* (Ohm et al., 2012); *Neurospora crassa* (Galagan et al., 2003); *Trichoderma reesei* (Martinez et al., 2008); *Agaricus bisporus var. burnettii* (Morin et al., 2012); *Saccharomyces cerevisiae* S288C (Goffeau et al., 1996); *Aspergillus nidulans* (Galagan et al., 2005); *Phanerochaete chrysosporium* (Ohm et al., 2014).

In summary, whilst existing methods for signal peptide prediction achieve high accuracy, the main areas for improving eukaryotic effector secretion prediction will come from advances in subcellular localization prediction tools as well as from investigations of non-classical secretion pathways and improved gene prediction tools for pathogen genomes.

### AUTHOR CONTRIBUTIONS

JS conceived the study and all authors contributed to the design of the study. JS, AW, and JH acquired, analyzed and interpreted the data. All authors drafted the manuscript and approved the final version.

### FUNDING

JS was partially supported by the Australian Grains Research and Development Corporation.

### ACKNOWLEDGMENTS

We thank Dr. Louise Thatcher and Dr. Ian Dry for their constructive feedback on this work.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01168

## REFERENCES


genes both locally and ectopically in Saccharomyces cerevisiae. PLoS Genet. 10:e1004021. doi: 10.1371/journal.pgen.1004021


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Sperschneider, Williams, Hane, Singh and Taylor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Kingdom-Wide Analysis of Fungal Small Secreted Proteins (SSPs) Reveals their Potential Role in Host Association

Ki-Tae Kim<sup>1,2</sup>, Jongbum Jeon<sup>1,3</sup>, Jaeyoung Choi<sup>1,3†</sup>, Kyeongchae Cheong<sup>1,3</sup>, Hyeunjeong Song<sup>1,3</sup>, Gobong Choi<sup>1,3</sup>, Seogchan Kang<sup>4</sup> and Yong-Hwan Lee<sup>1,2,3,5\*</sup>

#### Edited by:

Marc-Henri Lebrun, Institut National de la Recherche Agronomique, France

#### Reviewed by:

Mahmut Tör, University of Worcester, UK; Raffaella Balestrini, Consiglio Nazionale delle Ricerche, Italy; Michael R. Thon, University of Salamanca, Spain

> \*Correspondence: Yong-Hwan Lee yonglee@snu.ac.kr

†Present Address:

Jaeyoung Choi, The Samuel Roberts Noble Foundation, Ardmore, OK, USA

#### Specialty section:

This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science

Received: 15 August 2015 Accepted: 03 February 2016 Published: 19 February 2016

#### Citation:

Kim K-T, Jeon J, Choi J, Cheong K, Song H, Choi G, Kang S and Lee Y-H (2016) Kingdom-Wide Analysis of Fungal Small Secreted Proteins (SSPs) Reveals their Potential Role in Host Association. Front. Plant Sci. 7:186. doi: 10.3389/fpls.2016.00186

<sup>1</sup> Fungal Bioinformatics Laboratory, Seoul National University, Seoul, South Korea, <sup>2</sup> Department of Agricultural Biotechnology, Seoul National University, Seoul, South Korea, <sup>3</sup> Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, South Korea, <sup>4</sup> Department of Plant Pathology and Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA, <sup>5</sup> Center for Fungal Genetic Resources, Center for Fungal Pathogenesis, Plant Genomics and Breeding Institute, Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, South Korea

The fungal secretome consists of various functional groups of proteins, many of which participate in nutrient acquisition, self-protection, or manipulation of the environment and neighboring organisms. The least characterized component of the secretome is small secreted proteins (SSPs). Some SSPs have been reported to function as effectors, but most remain to be characterized. The composition of major secretome components, such as carbohydrate-active enzymes, proteases, lipases, and oxidoreductases, appears to reflect the lifestyle and ecological niche of individual species. We hypothesize that many SSPs participate in manipulating plants as effectors. Obligate biotrophs likely encode more numerous and more diverse effector-like SSPs to suppress host defense compared to necrotrophs, which generally use cell wall degrading enzymes and phytotoxins to kill hosts. Because different secretome prediction workflows have been used in different studies, available secretome data are difficult to integrate for comprehensive comparative studies to test this hypothesis. In this study, SSPs encoded by 136 fungal species were identified from data archived in the Fungal Secretome Database (FSD) via a refined secretome workflow. Subsequently, compositions of SSPs and other secretome components were compared in light of taxa and lifestyles. Those species that are intimately associated with host cells, such as biotrophs and symbionts, usually have a higher proportion of species-specific SSPs (SSSPs) than hemibiotrophs and necrotrophs, but the latter groups displayed higher proportions of secreted enzymes. Results from our study establish a foundation for functional studies on SSPs and will also help understand genomic changes potentially underpinning different fungal lifestyles.

#### Keywords: fungi, lifestyle, secretome, small secreted proteins, effectors

**Abbreviations:** SSPs, small secreted proteins; SSSPs, species-specific SSPs; CSSPs, conserved SSPs.

### INTRODUCTION

Diverse groups of pathogenic fungi threaten plant health, whereas certain fungi, such as endophytes and mycorrhizal fungi, allow plants to explore new niches, manage biotic and abiotic stresses better, and/or efficiently acquire key nutrients. In both types of plant-fungus interactions, the outcome of the interaction is influenced heavily by fungal secretomes, the various proteins secreted or injected into plants (Girard et al., 2013). In the secretome, certain small secreted proteins (SSPs) are known to be responsible for disease development as virulence factors or to cause resistance (R)-gene mediated defense as avirulence factors (Rep, 2005; Deller et al., 2011; Hacquard et al., 2012). Such SSPs are termed effector proteins and modulate key defense signaling pathways and downstream responses, to attenuate microbe-associated molecular pattern (MAMP) triggered immunity (MTI; Jones and Dangl, 2006). Plants have evolved to activate effector-triggered immunity (ETI) by sensing specific effectors or molecular changes caused by effectors, mainly using the nucleotide binding, leucine-rich repeat class of R-gene products (Dodds and Rathjen, 2010).

Initially, effectors were considered as virulence factors secreted by pathogens (van Esse et al., 2007; Stergiopoulos and de Wit, 2009; Lo Presti et al., 2015). However, it has become apparent that effector-mediated manipulation of MTI is required even for symbiotic associations, because microbial partners also display MAMPs (Zamioudis and Pieterse, 2012; Gourion et al., 2015). Many SSPs have been identified as putative effectors in beneficial plant-associated bacteria (Soto et al., 2006) and mutualistic fungi, such as Glomus intraradices (Kloppholz et al., 2011) and Laccaria bicolor (Plett et al., 2011), expanding the definition of effectors as secreted microbial products that facilitate the establishment of various plant-microbe associations ranging from beneficial to detrimental. Furthermore, SSPs that resemble effector proteins of pathogenic fungi have been identified in saprotrophic fungi, suggesting additional roles of SSPs (Rovenich et al., 2014; Seidl et al., 2015). Driven by the discovery of diverse putative effector proteins in fungi representing different lifestyles, several studies have analyzed the repertoires of putative secreted proteins encoded by various fungi and the potential relationship between their secretomes and lifestyles (Lowe and Howlett, 2012; Krijger et al., 2014; Meinken et al., 2014; Lo Presti et al., 2015). Analysis of the size of secretome relative to the total proteome in 48 fungal species by Lowe and Howlett (2012) suggested its potential relationship with lifestyles. Another comparative study by Meinken et al. (2014) proposed that the secretome prediction of previous study may be overestimated because only SignalP was used for the prediction, but they drew the same conclusion. However, these studies did not consider individual components of the secretome. The study by Krijger et al. 
(2014), which analyzed 33 fungal species, suggested that phylogenetic position strongly influenced both the secretome size and its composition, but it did not include major secreted enzyme groups. In addition, species displaying different modes of pathogenesis (biotroph, hemibiotroph, and necrotroph) were combined as a single lifestyle in the last two analyses. Lastly, the review on fungal effector proteins by Lo Presti et al. (2015) only considered plant cell wall degrading enzymes in order to mine putative effector proteins. The secretome contains not only effector proteins but also groups of enzymes involved in the breakdown of cell walls, self-protection, or nutrient acquisition, such as carbohydrate-active enzymes (CAZymes), oxidoreductases, proteases, and lipases (Girard et al., 2013). Not surprisingly, biotrophs encode fewer CAZymes than hemibiotrophs and necrotrophs (Zhao et al., 2014). To investigate whether the composition and size of putative effectors correlate with different lifestyles, such enzymes should also be analyzed separately.

A wide range of validated and suspected protein effectors encoded by bacteria and oomycetes have been identified, which was facilitated by the conserved delivery machinery to plant cells (Cornelis and Van Gijsegem, 2000) and sequence motifs present in effectors (Whisson et al., 2007), respectively. Although a conserved IGY motif has been identified in a novel SSP family of Dikarya fungi (Cheng et al., 2014), known fungal effectors do not show conserved features, hampering their identification (Rafiqi et al., 2012; Giraldo and Valent, 2013). One or more of the following features have been used to predict candidate effector proteins in fungal secretomes: (a) presence of the signal peptide, but no transmembrane domain or GPI-anchor sites; (b) small proteins (usually fewer than 300 amino acids) that are present only in specific species or isolates; (c) expression in planta or during infection; (d) rich in cysteine residues; and (e) presence of a conserved motif within effector candidates, as in oomycete and fungal effectors (Birch et al., 2008; Godfrey et al., 2010; Zuccaro et al., 2011; Cheng et al., 2014). Using these features, candidate effector proteins have been identified in three types of plant pathogenic fungi: biotrophs such as rusts (Duplessis et al., 2011), smuts (Schirawski et al., 2010), and Blumeria graminis (Spanu et al., 2010), hemibiotrophs including Verticillium dahliae (Santhanam and Thomma, 2013) and Magnaporthe oryzae (Kim et al., 2010), and necrotrophs including Fusarium graminearum (Brown et al., 2012) and Sclerotinia sclerotiorum (Guyon et al., 2014). However, direct comparisons between these fungi are hampered because different bioinformatics approaches and criteria have been used for the prediction of effector proteins. 
The lack of robust pipelines that can be applied to mine candidate effector proteins from rapidly increasing genome sequences of phylogenetically diverse fungi and limited in planta expression data for the genes encoding SSPs also have hampered large-scale comparative analyses of putative effectors.

In this study, we refined multiple secretome components with the focus on SSPs for 136 fungal species archived in Fungal Secretome Database (FSD; Choi et al., 2010) via a data extraction pipeline consisting of multiple programs. This refined data set was analyzed in the context of the phylogenetic position and lifestyle of individual species. Secreted enzymes that likely play important roles in colonizing host plants, including CAZymes, oxidoreductases, which are likely secreted for protection against host-produced reactive oxygen species (Chi et al., 2009), and lipases and proteases, which participate in nutrient acquisition and manipulation of host defense, were also compared. Resulting data helped determine which secretome components might function as major lifestyle determinants. In addition, we mined candidate effector proteins for functional validation and also showed the pattern of evolutionary changes associated with several known effector proteins.

### MATERIALS AND METHODS

### Phylogenetic Analysis

The phylogenetic tree of the 136 fungal species shown in Supplementary Figure S2 was constructed using CVTree v4.2.1 with a k-tuple length of 7 (Xu and Hao, 2009). The tree shows only topology, and the ectopically positioned Taphrina deformans was manually curated based on NCBI Taxonomy. The lifestyle of each fungus was annotated based on a literature review.
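To make the alignment-free idea behind a k-tuple tree more concrete, the following is a minimal sketch of a composition-vector distance between two proteomes. It is an illustration only, not the CVTree code: the function names are hypothetical, only k-tuples actually observed in a proteome are scored (a simplification), and the tiny sequences and `k = 3` in the usage note are placeholders for whole proteomes and `k = 7`.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seqs, k):
    """Relative frequencies of all overlapping k-tuples in a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def composition_vector(seqs, k):
    """Deviation of each observed k-tuple frequency from a Markov-type expectation.

    The expectation is built from the (k-1)- and (k-2)-tuple frequencies:
    p0(a1..ak) = p(a1..ak-1) * p(a2..ak) / p(a2..ak-1). Requires k >= 3.
    """
    p_k = kmer_freqs(seqs, k)
    p_k1 = kmer_freqs(seqs, k - 1)
    p_k2 = kmer_freqs(seqs, k - 2)
    vec = {}
    for w, p in p_k.items():
        middle = p_k2.get(w[1:-1], 0.0)
        if middle == 0.0:
            continue
        p0 = p_k1[w[:-1]] * p_k1[w[1:]] / middle
        if p0 > 0.0:
            vec[w] = (p - p0) / p0
    return vec


def cv_distance(a, b):
    """Distance (1 - cosine similarity) / 2 between two composition vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("composition vector is empty or all zeros")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0
```

In use, one vector would be computed per species from its full predicted proteome, and the resulting pairwise distance matrix would be passed to a tree-building method such as neighbor joining.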

### Secretome Data Collection, Refinement, and Annotation

The SP, SP<sup>3</sup> and SL classes of secretory proteins, which include proteins carrying a classical signal peptide, were downloaded from FSD (Choi et al., 2010). Hence, only the proteins secreted via the canonical pathway were considered in this analysis. In order to predict SSPs, a two-step mining pipeline was employed. The first step, adopted from Brown et al. (2012), involves refining the secretome by selecting proteins predicted to be secreted by both ProtComp v9.0 (detected as secreted) and WoLF PSORT v0.2 (extr => 10; Horton et al., 2007), which are protein localization prediction programs trained with fungal data. Proteins that may be secreted but are probably membrane bound were filtered out using Phobius v1.01 (TM = 1; SP = N; Käll et al., 2004), a program that detects signal peptides and transmembrane helices, and UTProt (GPI-anchored = Y), a fungal-specific GPI-anchor prediction tool (Cao et al., 2009). These programs were run on local Linux computers, and the parameter settings were determined with various fungal effector proteins listed in the review by Stergiopoulos and de Wit (2009). The second step was grouping proteins within individual refined secretomes based on their predicted functions. To identify CAZymes, relevant HMM profiles from dbCAN release 3.0 (Yin et al., 2012) were employed. Oxidoreductases, lipases, and proteases were identified using BLASTP (E-value cutoff of 0.001) with individual refined secretomes as queries against BLAST databases of these enzyme sets. Fungal oxidoreductases were downloaded from the Fungal Peroxidase Database (Choi et al., 2014). Sources for the lipase and protease datasets were the Lipase Engineering Database (Fischer and Pleiss, 2003) and MEROPS (Rawlings et al., 2014), respectively.
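The first refinement step can be sketched as a simple filter over per-protein predictions. This is a hedged illustration, not the authors' scripts: the `Prediction` record and its field names are hypothetical, the tool outputs are assumed to have been parsed already, the WoLF PSORT setting is read here as an extracellular score of at least 10, and the Phobius setting is read literally as exactly one predicted transmembrane helix with no signal peptide.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Parsed per-protein tool outputs (hypothetical layout)."""
    protein_id: str
    protcomp_secreted: bool  # ProtComp call: detected as secreted
    wolfpsort_extr: float    # WoLF PSORT "extr" score
    phobius_tm: int          # number of transmembrane helices from Phobius
    phobius_sp: bool         # Phobius signal-peptide call
    gpi_anchored: bool       # GPI-anchor prediction


def passes_refinement(p: Prediction, extr_min: float = 10.0) -> bool:
    """Keep proteins called secreted by both tools and not flagged as membrane bound."""
    secreted = p.protcomp_secreted and p.wolfpsort_extr >= extr_min
    # Assumed reading of "TM = 1; SP = N": one helix and no signal peptide.
    membrane_bound = (p.phobius_tm == 1 and not p.phobius_sp) or p.gpi_anchored
    return secreted and not membrane_bound


def refine(predictions):
    """Return the IDs of proteins retained in the refined secretome."""
    return [p.protein_id for p in predictions if passes_refinement(p)]
```

Keeping the thresholds as function arguments makes it easy to check how sensitive the refined secretome size is to each cutoff.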

### Mining and Annotation of SSPs

Protein length and cysteine content were analyzed using in-house Python scripts, and putative species-specific proteins were identified by running BLASTP against all other species, followed by BLASTP against the NR database excluding the query species itself to reduce false positives caused by the limited phylogenetic coverage of certain taxa. Both species-specific SSPs (SSSPs) and conserved SSPs (CSSPs) were annotated using pre-computed InterPro terms, which were retrieved from Comparative Fungal Genomics Platform 2.0 (CFGP 2.0; Choi et al., 2013), and mapped to PHI-base effector proteins using BLASTP (Urban et al., 2015).
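The SSSP/CSSP split described above can be illustrated with a small helper. This is a sketch under stated assumptions rather than the in-house scripts: `classify_ssp` is a hypothetical name, the 300 aa limit is the threshold used elsewhere in this article, and `hit_species` is assumed to already pool the species of significant BLASTP hits from both the 136-species search and the NR search.

```python
def classify_ssp(length, own_species, hit_species, max_len=300):
    """Classify a secreted protein of undefined function.

    Returns "SSSP" if the protein is short and has no hit outside its own
    species, "CSSP" if it is short and has a hit in at least one other
    species, and None if it is longer than max_len.
    """
    if length > max_len:
        return None
    other_species = set(hit_species) - {own_species}
    return "CSSP" if other_species else "SSSP"
```

For example, `classify_ssp(120, "M. oryzae", {"M. oryzae"})` would be labeled an SSSP, whereas adding a second species to the hit set would make it a CSSP.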

### Clustering of CSSPs and Molecular Evolutionary Analysis

All-by-all BLASTP analyses were performed with CSSPs and PHI-base effector proteins prior to Markov Cluster Algorithm (MCL) clustering. Resulting data were clustered via MCL with the inflation option 1.4 for high granularity (Enright et al., 2002), which produced the fewest singletons. The predicted protein families containing PHI-base effectors were analyzed further. Proteome data from CFGP 2.0 were used to construct species trees, and proteins in each of the analyzed families were used to construct gene trees. CVTree v4.2.1 was used for the fungal species tree with a k-tuple length of 7 (Xu and Hao, 2009), ClustalW in MEGA 6.06 was used for alignment of proteins, and maximum-likelihood gene trees were constructed using the default settings (Hall, 2013). After reconciling the species and gene trees using Notung 2.6 (Chen et al., 2000), potential gene duplication and loss events were annotated.

## RESULTS

### Analyzed Species Cover Diverse Taxa and Lifestyles

Genome sequences of the 136 fungal species used in this study (Supplementary Table S1) are publicly available. The taxa covered include Microsporidia, Zygomycota, Glomeromycota, Ascomycota, and Basidiomycota (**Table 1**). The lifestyles represented include animal pathogens, biotrophic, hemibiotrophic and necrotrophic plant pathogens, symbionts, and saprotrophs (**Table 1**). The necrotrophs were further divided into crop-infecting and wood-decaying types. The symbionts are fungi that associate with plants and confer beneficial/mutualistic effects in the interactions. These include both ecto- and arbuscular mycorrhizal fungi, endophytes, and a plant growth promoting fungus with symbiotic activity (Vargas et al., 2009).

TABLE 1 | Taxonomic distribution and lifestyle of the species analyzed in this study.


C-necrotroph, Crop-infecting necrotroph; W-necrotroph, Wood-decaying necrotroph.

## Refined Fungal Secretomes Show High Degree of Size Variance

Since fungal effectors are expected to be secreted into the host apoplastic space or cytoplasm, mining fungal secretomes is the first step for their identification. Secretome data for most sequenced fungi have already been archived in FSD (Choi et al., 2010), an online platform that was built to identify and archive secreted proteins via six different programs detecting extracellular proteins and a transmembrane helix detection program. The predicted secretomes were then categorized into three different classes. To predict bona fide SSPs, the total secretome data from FSD were further refined via additional filtrations (**Figure 1**). Putatively membrane-bound proteins were eliminated using additional programs that detect transmembrane helices and GPI-anchors (see Materials and Methods).

After this refinement, the size of secretome was reduced by 33.7% on average compared to that predicted using only SignalP 3.0 and by 70.9% compared to the total secretome predicted by the pipeline used for building FSD (Supplementary Figure S1). The number of proteins in refined secretomes ranged from 19 (Pneumocystis jirovecii, an opportunistic human pathogen) to 1940 (Auricularia subglabra, a wood-decaying necrotroph; Supplementary Table S1). The refined secretome accounted for 5.5% of the total proteome on average, with the lowest being 0.5% (P. jirovecii) and the highest being 11.0% (Magnaporthe oryzae, a hemibiotrophic plant pathogen). Both P. jirovecii and M. oryzae belong to the phylum Ascomycota, illustrating high degrees of variance within individual phyla (Supplementary Figure S1).

### Patterns Observed Among Refined Secretomes in the Context of Phylogenetic Positions and Lifestyles

Several general patterns associated with the size of the refined secretome were observed (**Figures 2A,B**). On average, fungi belonging to Pucciniomycotina encode the largest refined secretomes, whereas Microsporidia code for the smallest ones (**Figure 2C**). However, the size of the refined secretome also varied widely within individual taxa, as illustrated in Supplementary Figure S2. For example, in Ascomycota, the species belonging to Pezizomycotina have much larger secretomes than those of Saccharomycotina and Taphrinomycotina, and in Basidiomycota, the species belonging to Pucciniomycotina have larger secretomes than those of Ustilaginomycotina.

We analyzed the composition of refined secretomes among groups that represent different lifestyles to investigate their relationship. On average, the refined secretomes of plant-associated species were larger than those of saprotrophs and animal pathogens (**Figure 3A**). Among the plant-associated species, pathogens encode larger refined secretomes than symbionts. Among the pathogens, crop-infecting necrotrophs code for the largest refined secretomes, followed by hemibiotrophs, wood-decaying necrotrophs, and biotrophs. Proportions of CAZymes, proteases, lipases, and oxidoreductases in the 136 species were analyzed in relation to their lifestyles (**Figure 3B**; Supplementary Table S1). Although

FIGURE 1 | Pipeline used to refine secretomes and mine small secreted proteins (SSPs). Chosen secretomes were downloaded from Fungal Secretome Database (FSD) and refined. The refined secretomes were then divided into four classes of enzymes, including CAZymes, proteases, lipases, and oxidoreductases, and proteins of undefined function. To predict functions of the latter group of proteins, InterPro analysis was performed, and the presence of signatures frequently associated with effectors, including short length and taxon-specific distribution, was also analyzed. To reduce false species-specific proteins due to the limited phylogenetic coverage of certain taxa, BLASTP against NCBI NR database was performed. SSPs were divided into species-specific SSPs (SSSPs) and conserved SSPs (CSSPs) (see Materials and Methods).

these enzymes facilitate nutrient acquisition and defense against reactive oxygen species from the host (Rogers et al., 1994; Sreedhar et al., 1999; Chi et al., 2009; Blümke et al., 2014), they were not considered for mining effector-like SSPs, as in Rep (2005). On average, 46.6% of the refined secretome corresponded to these enzymes. Biotrophs display the lowest proportion for all four types of enzymes, and the highest proportion for all types, except oxidoreductases, was observed in crop-infecting necrotrophs. Wood-decaying necrotrophs exhibit the highest proportion of oxidoreductases. Animal pathogens have a higher proportion of proteases than saprotrophs, symbionts, and plant pathogens except necrotrophs, suggesting the importance of proteases in animal pathogenesis and necrosis of plants.

### Patterns Associated with Effector-Like SSPs

Most effector-like SSPs belong to the proteins of other functions. Accordingly, we first removed the four groups of enzymes from the refined secretomes before identifying SSPs (**Figure 1**). Subsequently, three features, including short length (≤300 aa), species-specific distribution pattern, and cysteine enrichment, were used to identify effector-like SSPs in the proteins of other functions.

Proteins shorter than 300 aa are abundant in all species (**Figure 4A**) with the exception of four saprophytic fungi, including Schizosaccharomyces pombe, Pichia pastoris, Spathaspora passalidarum, and Ophiostoma piceae (Supplementary Figure S3). In general, biotrophs have the most abundant SSPs. On the other hand, proportions of species-specific proteins in the proteins of other functions varied between individual species (Supplementary Figure S4). In general, species that are intimately associated with living plant tissues, such as biotrophs and symbionts, have greater numbers of species-specific proteins than those with different lifestyles (**Figure 4B**). This suggests that species-specific presence of effectors probably arose via co-evolution with the hosts (Stergiopoulos et al., 2012). The species-specific proteins accounted for 25–50% of the proteins of other functions in 27 species and for over 50% in six species (**Table 2**). Many wood-decaying necrotrophs have large proportions (but not exceeding 50%) of species-specific secreted proteins. Proportions among symbionts typically ranged from 25 to 50%, but the proportion exceeded 50% in Rhizophagus irregularis, which is the only Glomeromycota symbiont (Tisserant et al., 2012, 2013). Among the animal pathogens, three species belonging to Microsporidia (Enterocytozoon bieneusi, Nosema ceranae, and Antonospora locustae) and one Ascomycota species (P. jirovecii) displayed large proportions (>50%; **Table 2**).

The proteins of other functions and the four enzyme groups in the refined secretomes were classified into three groups based on their cysteine content (0.0 to <3.0%, 3.0 to <5.0%, and ≥5.0%), and average percentages of these groups in four lifestyles were compared (**Figure 4C**). Proteins with 3% or more cysteine were considered cysteine-rich in earlier studies (Stergiopoulos and de Wit, 2009; Saunders et al., 2012). However, a more stringent criterion, over 5%, was also used (Brown et al., 2012; Krijger et al., 2014). The general trend was that both classes of cysteine-rich proteins were more abundant among the proteins of other functions than among the enzymes, regardless of lifestyle.
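The three cysteine-content classes can be expressed as a short function. This is a minimal sketch, not the in-house script: the function names and class labels are hypothetical, and cysteine content is taken as the percentage of residues that are `C` in the full predicted protein sequence, which is an assumption (for example, whether the signal peptide was included is not stated).

```python
def cysteine_percent(seq):
    """Percentage of residues in a protein sequence that are cysteine."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * seq.count("C") / len(seq)


def cysteine_class(seq):
    """Assign a protein to one of the three cysteine-content classes."""
    pct = cysteine_percent(seq)
    if pct >= 5.0:
        return ">=5.0%"
    if pct >= 3.0:
        return "3.0-<5.0%"
    return "<3.0%"
```

Under the 3% criterion the two upper classes together count as cysteine-rich, whereas under the more stringent criterion only the top class does.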

A total of 8275 SSSPs were found in 133 species. The FASTA-formatted SSSP sequences are available in Supplementary Data Sheet 1. Three species, including two animal pathogens (Malassezia sympodialis and Nematocida parisii) and a saprotroph (Kluyveromyces lactis), did not have any SSSPs. The remaining SSPs were encoded by at least two species and were termed conserved SSPs (CSSPs), which may correspond to general fungal proteins or elicitors.

## Functional Annotation of SSPs using InterPro Terms and the Known Effector Proteins Rarely Reveals Their Functions

InterPro domain analysis was performed to predict potential functions of both SSSPs and CSSPs. Most SSPs, 7960 out of 8275 (96.2%), displayed no defined InterPro terms with only 315 SSPs being annotated with 97 different terms (Supplementary Table S2). Among the annotated SSSPs, the most commonly found InterPro term is membrane insertase YidC (IPR019998), which was found in 145 proteins. Other terms that are potentially related to pathogenicity and were found at least twice include IPR008427 (extracellular membrane protein, CFEM domain), IPR003172 (MD-2-related lipid-recognition domain), and IPR016191 (ribonuclease/ribotoxin).

Since the InterPro terms did not suggest any specific functions in association with pathogenicity, 49 known effector proteins from 11 fungi were retrieved from the PHI-base (Urban et al., 2015) and mapped to SSSPs (Supplementary Table S3). Nine effector proteins were mapped to four species with a bit score >50 and e-value <1.e-3 (**Table 3**). Except MGG\_10556T0 with a C2H2-type zinc finger domain, which resembles M. oryzae avirulence factor AVR-Pii, the others contained no known domains.
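The mapping thresholds above (bit score >50 and e-value <1e-3) can be applied to BLASTP results with a few lines of Python. This is a sketch under assumptions: the hits are assumed to be in the standard 12-column tabular BLAST output, in which columns 11 and 12 hold the e-value and bit score, and `filter_hits` is a hypothetical helper name. The article does not state which output format was actually used.

```python
def filter_hits(lines, min_bits=50.0, max_evalue=1e-3):
    """Keep hits with bit score > min_bits and e-value < max_evalue.

    Each line is assumed to hold 12 tab-separated fields: query, subject,
    identity, length, mismatches, gap opens, q.start, q.end, s.start, s.end,
    e-value, bit score.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bits = float(fields[10]), float(fields[11])
        if bits > min_bits and evalue < max_evalue:
            kept.append((query, subject, evalue, bits))
    return kept
```

Applying both cutoffs together guards against short, high-identity matches that pass an e-value filter but carry little alignment score.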

CSSPs were clustered with 49 PHI-base effectors via MCL clustering analysis, and 2786 families containing 19,342 proteins were identified. Among them, 13 families contained at least one PHI-base effector (**Table 4**). Of these families, only one is associated with InterPro term related to fungal pathogenicity. For example, proteins in the family containing M. oryzae effector MgSM1 carry a well-known domain (cerato-platanin, IPR010829; Chen et al., 2013). In total, 22 out of 49 PHI-base effectors were either matched to a single protein or clustered within a protein family (Supplementary Table S3). The small number of matches may be due to the strain-specific presence of many effectors.

### Genomic Contexts of SSSP-Coding Genes

The genomic regions containing known host-specific virulence genes have been shown to be gene-sparse and AT-rich (Schmidt and Panstruga, 2011). We examined the genomic contexts of SSSP-coding genes in 59 species whose genome assemblies contain fewer than 500 contigs (Supplementary Table S4). The average number of genes and the mean AT-content across the 100 kb segments of their genomes were calculated as references. The corresponding data for each of the 100 kb windows containing SSSP-coding gene(s) were compared with the reference data.
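The window-based comparison can be sketched as follows. This is an illustrative outline, not the authors' code: `window_stats` is a hypothetical name, windows are taken as consecutive non-overlapping segments of one contig, genes are assigned to windows by a 0-based start coordinate, and ambiguous bases are ignored when computing AT-content. All of these are assumptions.

```python
def window_stats(sequence, gene_starts, window=100_000):
    """Gene count and AT-content (%) for consecutive windows of one contig.

    Returns a list of (window_start, n_genes, at_percent) tuples; at_percent
    is None for a window that contains no A, C, G, or T bases.
    """
    seq = sequence.upper()
    stats = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        at = chunk.count("A") + chunk.count("T")
        acgt = at + chunk.count("G") + chunk.count("C")
        at_percent = 100.0 * at / acgt if acgt else None
        n_genes = sum(1 for g in gene_starts if start <= g < start + window)
        stats.append((start, n_genes, at_percent))
    return stats
```

Averaging the per-window values over a whole genome would give the reference gene density and AT-content, against which the windows that contain SSSP-coding genes can then be compared.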

The gene density around SSSP-coding genes was too variable to establish a clear trend in the context of lifestyles.

Effector-like SSPs were divided into two classes. Proteins of 300 aa or shorter that appeared to be encoded by only one species were considered species-specific SSPs (SSSPs).

secreted proteins (SSPs). C-necrotroph and W-necrotroph correspond to crop-infecting necrotroph and wood-decaying necrotroph, respectively. (B) Distribution patterns of species-specific and conserved proteins in individual lifestyles are shown. (C) Distribution of cysteine content in the four groups of enzymes (E) and the proteins of other functions (OF). Proteins with a cysteine content equal to or greater than 3% of the aa residues are considered cysteine-rich.


#### TABLE 2 | List of species with high proportion of species-specific secreted proteins in the proteins of other functions.

C-necrotroph, Crop-infecting necrotroph; W-necrotroph, Wood-decaying necrotroph.

#### TABLE 3 | Species-specific effector proteins found among SSSPs.


#Locus ID of SSSPs matched to effectors from PHI-base by BLASTP.

IPR007087: Zinc finger, C2H2-type, IPR015880: Zinc finger, C2H2-like.

However, many SSSP-coding genes in a mycorrhizal symbiont L. bicolor, a plant growth promoting fungus Trichoderma virens and most wood-decaying necrotrophs were often located in regions with low gene density (Supplementary Table S4). Overall, the SSSP-coding genes did not appear to be concentrated within specific genomic region(s). On the other hand, the AT-content around most SSSP-coding genes was clearly lower than the total AT-content of the genomes of these species, except that two animal pathogens (Cryptococcus neoformans and P. jirovecii), one biotroph (Mixia osmundae), and one saprotroph (Wallemia sebi) had higher AT-contents at 52, 72, 44, and 60%, respectively (Supplementary Table S4).

### The Number of SSSPs Correlates with the Proteome Size, Lifestyle, and Taxonomic Position

Several studies have reported that the size of fungal secretome correlates with lifestyle (Lowe and Howlett, 2012; Meinken et al., 2014; Lo Presti et al., 2015) and that even stronger correlation exists with phylogenetic position (Krijger et al., 2014). The former studies reported that animal pathogens and saprotrophs have similarly-sized secretomes, but their secretomes are smaller than those in plant pathogens. We assessed the size of SSSPs in individual species to determine whether similar patterns exist. The relationships between the predicted proteomes and refined secretome components are shown in Supplementary Figure S5. The pattern was similar to that observed in a previous study (Lowe and Howlett, 2012). However, the relationship between SSSPs and the total proteome was markedly different (**Figure 5A**). Biotrophs, symbionts, and some hemibiotrophs usually have larger numbers of SSSPs than animal pathogens, saprotrophs, and necrotrophs. The larger numbers of SSSPs in the former group, which are intimately associated with plants, support the hypothesis that these proteins are important for manipulating plant hosts.

The number of SSSPs within species ranged from 0 to 466. Although not all plant-associated fungi have higher numbers of SSSPs than saprotrophs and animal pathogens (**Figure 5B**),


#Species from which the PHI-base effectors originated. IPR009009: Barwin-related endoglucanase, IPR010829: Cerato-platanin, IPR017853: Glycoside hydrolase, superfamily, IPR001509: NAD-dependent epimerase/dehydratase and IPR016040: NAD(P)-binding domain.

in general, animal pathogens and saprotrophs had fewer SSSPs than plant pathogens. Within plant pathogens, numbers of SSSPs encoded by crop-infecting necrotrophs are similar to those encoded by saprotrophs, and biotrophs generally have larger numbers than these groups. Within biotrophs, four Pucciniomycotina species (Puccinia graminis, P. striiformis, Melampsora laricis-populina, and M. osmundae) and B. graminis (Ascomycota) have the highest numbers. In contrast, two species in Ustilaginomycotina (Ustilago maydis and Sporisorium reilianum) and three Ascomycota species (Cladosporium fulvum, T. deformans, and Ashbya gossypii) encode small numbers of SSSPs, similar to those in saprotrophs. In addition, wood-decaying necrotrophs and hemibiotrophs have similar numbers of SSSPs.

The range of SSSP numbers was also analyzed with regard to taxonomic positions (**Figure 5C**). The mean number within Basidiomycota is 100, which is higher than the mean in Ascomycota (43). Within Basidiomycota, Pucciniomycotina encodes the most SSSPs. Within Ustilaginomycotina, two biotrophs (U. maydis and S. reilianum) contained greater numbers of SSSPs compared to other members. In Ascomycota, the range was widest in Pezizomycotina, with many outliers being present on both sides of the mean. Three hemibiotrophs, including M. oryzae, Leptosphaeria maculans, and Mycosphaerella graminicola, one necrotroph Stagonospora nodorum, one biotroph B. graminis, and one saprotroph Pyronema confluens encode many more SSSPs than the mean. Saccharomycotina and Taphrinomycotina encode noticeably fewer SSSPs than Pezizomycotina.

The biotrophs in Pezizomycotina code for relatively large numbers of SSSPs. Similarly, T. deformans, the only biotroph in Taphrinomycotina, also shows a high number of SSSPs. However, A. gossypii, the only biotroph in Saccharomycotina, has a lower

FIGURE 5 | Number of SSSPs in relation to the lifestyles and taxonomic positions of the analyzed species. (A) The number of SSSPs and the size of predicted total proteome for individual species are shown. (B) The number of SSSPs among species representing different lifestyles. (C) The number of SSSPs encoded by the species in different taxa. The subphyla Glomeromycota and Zygomycota, represented by only one species each, are not included. Pez, Pezizomycotina; Sac, Saccharomycotina; Tap, Taphrinomycotina; Aga, Agaricomycotina; Puc, Pucciniomycotina; Ust, Ustilaginomycotina. C-necrotroph and W-necrotroph correspond to Crop-infecting necrotroph and Wood-decaying necrotroph, respectively.

number and proportion of SSSPs compared to other members (**Figure 5C**).

## Evolution of Known Effector Proteins Belonging to CSSP Families Suggests Their Other Roles

We predicted 13 families of CSSPs containing PHI-base effectors (**Table 4**). Among them, families containing M. oryzae effector MgSM1 and the B. graminis effectors were found in the most taxa, and the evolution of these families of CSSPs was analyzed. Genes encoding proteins with a cerato-platanin domain (IPR009009) were conserved in 91 species, all belonging to Pezizomycotina or Agaricomycotina (**Figure 6**). In total, 23 gene duplications and 49 gene losses were observed in the family, but the gene duplications and losses were skewed toward Agaricomycotina, which consisted mostly of wood-decaying necrotrophs. However, these genes were not found in Ustilaginomycotina and Pucciniomycotina, the other subphyla of Basidiomycota that mostly consisted of biotrophic species. The genes encoding proteins that carry the ancestral cerato-platanin domain have undergone at least five duplication events, but many of the genes also have been lost in multiple lineages. As a consequence, 33 species in Dothideomycetes and Eurotiomycetidae contained only one such gene. However, copies in the necrotrophic Sordariomycetes, Botrytis cinerea and Fusarium spp. have undergone duplication events. The wide distribution of the genes encoding cerato-platanin domain proteins in species with pathogenic lifestyles suggested that their products play an important role in pathogenesis. However, members of this group were also found in 4 symbionts, 2 nematophagous fungi, and 19 saprotrophs, suggesting other roles associated with them. The gene family identified using the B. graminis effector candidate BEC1040 has gone through 16 duplications and 12 losses, and its members were present in 50 species in Ascomycota and Punctularia strigosozonata in Basidiomycota (Supplementary Figure S6A). Only one duplication event occurred in the family identified with BEC1019 (Supplementary Figure S6B), but both duplication and loss events occurred in the family identified with BEC1005 (Supplementary Figure S6C). Overall, many CSSPs that resemble known effectors were identified in non-plant pathogens, raising the possibility that they are remnants of degenerated genes or play roles other than facilitating plant infection.

### DISCUSSION

Rapid progress in sequencing fungal genomes, in combination with various "omics" tools, has facilitated large-scale comparative genomic analyses to uncover the genetic and evolutionary basis of various traits or functions of fundamental and practical significance. In this study, we developed a pipeline for mining SSPs as effector candidates from fungi with different lifestyles and taxonomic positions in order to conduct a kingdom-wide comparative analysis. It has been commonly hypothesized that biotrophs and symbionts secrete more effectors than necrotrophs, as biotrophic associations require the modulation of the host defense system to keep host cells alive for nutrient acquisition while preventing the launch of strong defense responses. Since necrotrophs utilize CAZymes and toxins to kill host cells to obtain nutrients, such manipulations of host defense likely play less critical roles. Because only a small number of fungal proteins in selected plant pathogens have been identified as effectors, we tested this hypothesis by comparing SSSPs as effector candidates.

We refined secretomes to identify SSSPs as previously reported in analyzing the secretome of F. graminearum (Brown et al., 2012). In addition, we mined and compared other components within the refined secretomes to investigate any lifestyle-associated genomic adaptations. Three previous studies examined potential relationships between the secretome and lifestyle (Lowe and Howlett, 2012; Krijger et al., 2014; Lo Presti et al., 2015). However, Lowe and Howlett (2012) and Lo Presti et al. (2015) used only one signal peptide prediction program for mining secretomes, and Krijger et al. (2014) did not eliminate putative membrane-bound proteins, which likely inflated the size of secretomes. Our refined secretome pipeline identified secretory proteins more rigorously via two additional protein localization detection programs and the elimination of transmembrane and GPI-anchor proteins. For comparison, the number of SSSPs and the number of effector candidates in powdery mildew predicted by Spanu et al. (2010) are the same, and the refined secretomes predicted for the two corn pathogens S. reilianum and U. maydis are similar to those reported by Schirawski et al. (2010). The corn pathogen study found that many effector candidate genes from both pathogens were orthologous; consequently, the numbers of SSSPs for them were drastically reduced in our study. When the refined secretome of F. graminearum was compared to the data of Brown et al. (2012), 539 of their 574 proteins were included in our secretome of 961 proteins. Although we predicted a larger secretome than their refined secretome, this is due to the less stringent parameter settings for prediction, as our parameters were determined based on the effectors listed in Stergiopoulos and de Wit (2009) for kingdom-wide analysis. We also used a greater number of species to cover more diverse taxa and further divided plant pathogens into four groups to perform comprehensive analyses.

The roles of fungal effector proteins in relation to lifestyles were previously discussed by Lowe and Howlett (2012) and Lo Presti et al. (2015). These studies suggested that fungi with the same lifestyle have similar secretome proportions. The secreted CAZymes also showed a similar pattern, in that biotrophs encode fewer CAZymes than hemibiotrophs and necrotrophs (Zhao et al., 2014; Lo Presti et al., 2015). Although our overall conclusions on refined secretomes and CAZymes are similar to those of the previous studies, the numbers of SSSPs show lifestyle adaptation that differs from the patterns seen in the secretome and CAZyme analyses. In addition, secretomes contain not only CAZymes for cell wall degradation and utilization of its components as nutrients but also proteases, lipases, and oxidoreductases for the breakdown of other macromolecules, self-protection, and/or pathogenesis. For example, AVR-pita of M. oryzae is a zinc metalloprotease that acts as an avirulence factor in its host rice (Zhang and Xu, 2014). The lipase effector FGL1 in F. graminearum suppresses callose formation in wheat and is required for host infection (Blümke et al., 2014). Oxidoreductases have been investigated in phytopathogenic fungi for their roles in scavenging plant reactive oxygen species and pathogenicity (Chi et al., 2009). Since their importance in pathogenesis has been established or suggested, we examined the proportions of these enzymes relative to the whole proteome in light of the lifestyle and taxonomic position of individual species. Proteases, lipases, and oxidoreductases seem to be more abundant in plant pathogens, especially in hemibiotrophs and necrotrophs, than in species with other lifestyles. Since the majority of known fungal effector proteins do not possess enzymatic activity, we excluded the above enzyme sets prior to mining SSPs, which include short proteins (≤300 aa) with a signal peptide but no transmembrane domain or GPI anchor. Species-specific presence was also analyzed to classify SSPs into SSSPs, which likely act as host-specific effectors. Overall, we found that the sizes of refined secretomes and the numbers of SSSPs vary widely between species, but some patterns are associated with lifestyles.
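The SSP definition given above can be read as a simple filter. The following is a minimal sketch, assuming each protein already carries boolean predictions produced by upstream tools; the `Protein` record, its field names, and the example entries are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass

MAX_SSP_LENGTH = 300  # aa, the upper bound stated in the text


@dataclass(frozen=True)
class Protein:
    # Hypothetical record: the boolean fields stand in for the outputs of
    # signal peptide, transmembrane, GPI-anchor, and enzyme predictions.
    name: str
    length: int
    has_signal_peptide: bool
    has_tm_domain: bool
    has_gpi_anchor: bool
    is_enzyme: bool  # CAZyme, protease, lipase, or oxidoreductase


def is_ssp(p: Protein) -> bool:
    """Return True if the protein matches the SSP definition in the text."""
    return (
        not p.is_enzyme
        and p.length <= MAX_SSP_LENGTH
        and p.has_signal_peptide
        and not p.has_tm_domain
        and not p.has_gpi_anchor
    )


# Illustrative entries only (not real data).
proteins = [
    Protein("candidate_a", 120, True, False, False, False),
    Protein("membrane_b", 150, True, True, False, False),
    Protein("large_c", 450, True, False, False, False),
    Protein("protease_d", 200, True, False, False, True),
]
ssps = [p.name for p in proteins if is_ssp(p)]
print(ssps)
```

Classifying SSPs further into SSSPs would require an additional cross-species comparison step, which is not sketched here.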

In general, phytopathogenic fungi tend to have larger secretomes than non-pathogens. Although Lowe and Howlett (2012) suggested that animal pathogens generally have a lower proportion of secreted proteins, the sizes of secretomes and the numbers of proteins with other functions in certain animal pathogens, such as nematophagous and entomopathogenic fungi, were similar to those of some necrotrophic plant pathogens. This is not too surprising considering that nematophagous and entomopathogenic fungi secrete diverse proteins to facilitate infection and consumption of hosts (Andersson et al., 2013; Staats et al., 2014). The host-specific animal pathogens that co-evolved with hosts for a longer period of time may have large secretomes, yet smaller than those of phytopathogens. Although symbionts intimately interact with plant hosts, their secretomes are smaller than those of pathogens, a pattern that was found in previous studies on symbionts such as L. bicolor (Martin et al., 2008) and Tuber melanosporum (Martin et al., 2010). This is due to the reduced number of CAZymes compared to necrotrophs and hemibiotrophs. Biotrophs encode smaller sets of CAZymes but larger secretomes than non-pathogens, mainly due to their abundant SSSPs. This reflects the lifestyle of biotrophs, which cause minimal damage to their hosts to maintain a long-term feeding relationship. Not surprisingly, reduced numbers of SSSPs are observed among crop-infecting necrotrophs. However, wood-decaying necrotrophs conspicuously possess a level of SSSPs similar to that of hemibiotrophs. The roles of these SSSPs in wood decay remain unclear. Within saprotrophs, fruit-body forming fungi and fungi displaying antifungal activities encode greater numbers of SSSPs than yeasts and extremophiles. However, the number of SSSPs has no correlation with host range within necrotrophs, types of rot for wood-decaying necrotrophs, or types of association for symbionts. As illustrated by Neurospora crassa, a saprotrophic fungus that has been reported to be associated with pine trees in harsh conditions and even to be pathogenic (Kuo et al., 2014), a fungal lifestyle may not be a fixed attribute but may change depending on environmental conditions.

Another concern is that the number of small proteins could be affected by the minimum protein size used to annotate each genome. However, we strictly used the annotated data associated with published genomes, and thus we can assume that the numbers of annotated proteins were comparable to each other. In addition, the majority of genome data were from JGI and the Broad Institute, which followed the conventional JGI annotation process and Broad Gene Finding Methods, respectively. If the minimum cutoff length for gene prediction was stated, only short peptides without EST support were eliminated. Since the majority of fungal genome studies did not report the cutoff size for protein-coding gene prediction, we analyzed and compared the length distribution of the annotated proteomes for validation (the last column of Supplementary Table S1). As a result, no cutoff appeared to have been used for 50 species, a cutoff of 30 aa appeared to have been used for an additional 72 species, and the remaining 14 species contained only proteins of at least 50 aa. The number of short proteins in the first category was extremely high for a few species such as C. fulvum and P. confluens. However, only the proteins secreted via canonical pathways were considered in this analysis, which means that the minimum protein size is bounded by the length of the signal peptide, which is 15 to 40 aa (Choo et al., 2005). Although many short proteins were annotated using no or a very short cutoff, they were eliminated if the signal peptide was absent. Overall, we believe that the numbers of proteins are generally comparable, as in other secretome studies.
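The length-distribution check described above can be illustrated with a small sketch. It assumes that the apparent annotation cutoff of a proteome is inferred only from the length of its shortest annotated protein; the function name and the three category labels are hypothetical, and the thresholds simply mirror the 30 aa and 50 aa values mentioned in the text.

```python
def apparent_cutoff(protein_lengths):
    """Classify a proteome by its shortest annotated protein (lengths in aa).

    Assumption for illustration: a proteome whose shortest protein is at
    least 50 aa is treated as annotated with a 50 aa cutoff, one whose
    shortest protein is 30-49 aa as annotated with a 30 aa cutoff, and
    anything shorter as annotated without a length cutoff.
    """
    shortest = min(protein_lengths)
    if shortest >= 50:
        return "50 aa"
    if shortest >= 30:
        return "30 aa"
    return "none"


# Illustrative lengths only (not real data).
print(apparent_cutoff([12, 88, 430]))
print(apparent_cutoff([31, 90, 512]))
print(apparent_cutoff([50, 75, 1020]))
```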

Many genes encoding candidate effectors in plant pathogens and virulence-associated genes of animal pathogens have been reported to reside in gene-sparse, AT-rich, and telomere-proximal genomic regions (Schmidt and Panstruga, 2011). However, genomic distribution patterns of SSSP-coding genes did not display similar trends, with the exception of those in the wood-decaying necrotrophs. Most SSSP-coding genes in these fungi were found in genomic regions with low gene numbers. A previous study reported that 23% of the proteins encoded by these wood-decaying fungal species are unique (Riley et al., 2014), indicating that the SSSP-coding genes recently arose from non-coding sequences, as suggested by the Carvunis model (Carvunis et al., 2012). However, there are still many SSSP-coding genes located in AT-rich regions. Therefore, the genomic context could be considered for prioritizing validation of effector functions. Although effector proteins are often thought to be species-specific due to co-evolution with the host, there are cases of effectors conserved among related species. For example, the C. fulvum effector Ecp2 (Stergiopoulos et al., 2012) and the cerato-platanin proteins (Chen et al., 2013) are conserved only within Ascomycota and Basidiomycota. Moreover, the clustering analysis of CSSPs showed that none of them seems to be conserved across the fungal kingdom, indicating that they are typically limited to specific genera. However, some CSSPs of B. graminis, e.g., BEC1040, BEC1019, and BEC1005, were found in fungi with lifestyles other than biotrophy. These observations were further supported by the facts that these proteins are involved in fungal development, and resemble metalloprotease and glucanase, respectively (Pliego et al., 2013). Taken together, these results suggest that the numbers of CSSPs may not be correlated with fungal lifestyles.

In conclusion, different secretome components reflect lifestyle-associated genomic adaptations in fungi. Results from this comparative study provide new insights into the genetic basis and molecular evolution of fungal lifestyles and also establish a solid foundation for future discovery and functional validation of effectors.

### AUTHOR CONTRIBUTIONS

KK and YL designed this project. KK, JJ, HS and GC performed computational analyses. JC and KC provided the secretome and the genome data for analyses. KK, JJ, SK, and YL wrote the manuscript. SK and YL supervised the research. All authors read and approved the manuscript.

### ACKNOWLEDGMENTS

This work was supported by grants from the National Research Foundation of Korea (NRF-2014R1A2A1A10051434), funded by the Ministry of Science, ICT & Future Planning and the Cooperative Research Program for Agriculture Science & Technology Development (Project No. PJ01115401) administered by the Rural Development Administration, Republic of Korea. KK is grateful for a graduate fellowship from the Brain Korea 21 PLUS Program.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00186

## REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Kim, Jeon, Choi, Cheong, Song, Choi, Kang and Lee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Common protein sequence signatures associate with *Sclerotinia borealis* lifestyle and secretion in fungal pathogens of the *Sclerotiniaceae*

#### *Edited by:*

*Delphine Vincent, Department of Environment and Primary Industries, Australia*

#### *Reviewed by:*

*Jana Sperschneider, Commonwealth Scientific and Industrial Research Organisation, Australia Kim Marilyn Plummer, La Trobe University, Australia*

#### *\*Correspondence:*

*Sylvain Raffaele, Laboratoire des Interactions Plante Micro-organismes, 24 Chemin de Borde Rouge – Auzeville, 31326 Castanet Tolosan, France sylvain.raffaele@toulouse.inra.fr*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 23 June 2015 Accepted: 10 September 2015 Published: 24 September 2015*

#### *Citation:*

*Badet T, Peyraud R and Raffaele S (2015) Common protein sequence signatures associate with Sclerotinia borealis lifestyle and secretion in fungal pathogens of the Sclerotiniaceae. Front. Plant Sci. 6:776. doi: 10.3389/fpls.2015.00776*

Thomas Badet<sup>1,2</sup>, Rémi Peyraud<sup>1,2</sup> and Sylvain Raffaele<sup>1,2</sup>\*

*<sup>1</sup> Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441, Castanet-Tolosan, France, <sup>2</sup> Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594, Castanet-Tolosan, France*

Fungal plant pathogens produce secreted proteins adapted to function outside fungal cells to facilitate colonization of their hosts. In many cases, such as for fungi from the *Sclerotiniaceae* family, the repertoire and function of secreted proteins remain elusive. In the *Sclerotiniaceae*, whereas *Sclerotinia sclerotiorum* and *Botrytis cinerea* are cosmopolitan broad host-range plant pathogens, *Sclerotinia borealis* has a psychrophilic lifestyle with a low optimal growth temperature, a narrow host range and a restricted geographic distribution. To spread successfully, *S. borealis* must synthesize proteins adapted to function in its specific environment. The search for signatures of adaptation to *S. borealis* lifestyle may therefore help reveal proteins critical for colonization of the environment by *Sclerotiniaceae* fungi. Here, we analyzed amino acid usage and intrinsic protein disorder in alignments of groups of orthologous proteins from the three *Sclerotiniaceae* species. We found that enrichment in Thr, depletion in Glu and Lys, and low disorder frequency in hot loops are significantly associated with *S. borealis* proteins. We designed an index to report bias in these properties and found that high index proteins were enriched among secreted proteins in the three *Sclerotiniaceae* fungi. High index proteins were also enriched in functions associated with plant colonization in *S. borealis*, and in *in planta*-induced genes in *S. sclerotiorum*. We highlight a novel putative antifreeze protein and a novel putative lytic polysaccharide monooxygenase identified through our pipeline as candidate proteins involved in colonization of the environment. Our findings suggest that similar protein signatures associate with *S. borealis* lifestyle and with secretion in the *Sclerotiniaceae*. These signatures may be useful for identifying proteins of interest as targets for the management of plant diseases.

Keywords: secretome, *Sclerotinia*, psychrophily, effector candidates, amino acid usage, intrinsic disorder, antifreeze protein, lytic polysaccharide monooxygenase

### Introduction

Fungi from the Sclerotiniaceae family include several devastating plant pathogens with a broad host range. Among those are Botrytis cinerea, the causal agent of gray rot, and Sclerotinia sclerotiorum, causal agent of white and stem rot, each able to infect several hundreds of plant genera and causing multibillion dollar losses in agriculture every year (**Figure 1A**) (Bolton et al., 2006; Dean et al., 2012). The geographic distribution of these two fungi is also remarkably broad since they have been reported across five continents (**Figure 1B**). Sequencing of the genome of B. cinerea and S. sclerotiorum (Amselem et al., 2011) opened the way to systematic searches for the molecular bases of pathogenicity in these fungi (Guyon et al., 2014; Heard et al., 2015). However, the repertoire of molecules contributing to the ability of plant pathogenic fungi, such as fungi from the Sclerotiniaceae family, to colonize a wide range of hosts and environments remains elusive.

Fungal pathogens secrete diverse sets of degrading enzymes and toxins to facilitate colonization of their hosts (Möbius and Hertweck, 2009; Kubicek et al., 2014). In addition, fungal pathogens use molecules designated as effectors to manipulate host cells and achieve successful infection. Their activities include the inactivation of plant defenses, interference with plant hormone balance, or dismantling of the plant cell. However, effectors may also trigger specific plant defense responses, leading to plant resistance, when recognized directly or indirectly by the plant immune system (Jones and Dangl, 2006). Typical effectors are small secreted proteins, but secondary metabolites and small RNAs can also play the role of effectors (Schardl et al., 2013; Weiberg et al., 2013). Although a subset of bacterial and oomycete protein effectors can be identified based on conserved N-terminal targeting signals and other sequence signatures (Schornack et al., 2009; McDermott et al., 2011; Meyer et al., 2013), this is not the case in fungi. Effector detection in fungal pathogens relies largely on specific host responses revealing effector recognition, and bioinformatics approaches based on whole genome sequences and deduced protein repertoires remain challenging (Sperschneider et al., 2015). Genes involved in host-parasite interactions such as pathogen effectors are often subject to strong balancing or directional selection. For example, oomycete effectors commonly evolve rapidly, and natural selection can maintain many different alleles in a population (Raffaele et al., 2010; Oliva et al., 2015). Therefore, signatures of positive selection are frequent in effector genes and this property has been used to identify novel effector candidates (Wicker et al., 2013; Rech et al., 2014; Sperschneider et al., 2014). 
However, most of our understanding of the molecular evolution of effector genes and genes involved in colonization of the environment comes from studies of the pairwise coevolution of a given pathogen with a single host plant. By contrast, fungal pathogens in the Sclerotiniaceae interact with a wide range of hosts in multiple environmental conditions and should therefore be considered as evolving under "diffuse" (or "generalized") interactions (Juenger and Bergelson, 1998). In the Ascomycete genus Metarhizium, signatures of positive selection were observed less frequently in the genome of fungal pathogens under diffuse co-evolution compared to Metarhizium acridum evolving under pairwise co-evolution (Hu et al., 2014). It is thus expected that in the Sclerotiniaceae, some genes important for colonization of environment, including fungal effectors involved in diffuse interactions, may escape detection by positive selection analysis, and additional detection criteria would be useful.

Compared to B. cinerea and S. sclerotiorum, the snow mold pathogen Sclerotinia borealis colonizes a reduced range of environments. Indeed, according to the Fungus-Host database of the U.S. Department of Agriculture (Farr and Rossman, 2015), S. borealis has been reported to infect only 14 plant genera, compared to 332 and 469 for S. sclerotiorum and B. cinerea respectively (**Figure 1A**). S. borealis host plants notably include Agropyron, Agrostis, Elymus, and Festuca species that have not been reported as hosts for S. sclerotiorum or B. cinerea to date. S. borealis has an economic impact in countries with cold climates, where it causes snow mold on winter cereals and grasses (Schneider and Seaman, 1987). Its geographic range is restricted to the Arctic Circle, including North of Japan, North America, Scandinavia, and Russia, whereas B. cinerea and S. sclerotiorum are cosmopolitan fungi found in arctic, temperate and tropical climates (**Figure 1B**). Consistently, S. borealis is a psychrophile, with an optimal growth temperature of about 4–10°C, whereas the optimal growth temperature is ∼23°C for B. cinerea and S. sclerotiorum (mesophiles) (Wu et al., 2008; Hoshino et al., 2010; Judet-Correia et al., 2010). To successfully thrive in cold environments, psychrophilic pathogens must synthesize enzymes and effectors that perform effectively at low temperatures. Cold-temperature environments present several challenges, in particular reduced reaction rates, increased viscosity, and phase changes of the surrounding medium. A draft genome sequence of S. borealis strain F-4128 has recently been released (Mardanov et al., 2014a,b), providing an opportunity to better understand its adaptation to its ecological niche and particularly to cold environments. The total size of the assembled genome of S. borealis is 39.3 Mb, with a G+C content of 42%, including 10,171 predicted protein coding sequences (Mardanov et al., 2014a). The genomes of S. sclerotiorum 1980 and B. cinerea B05.10 have similar characteristics, with total sizes of 38.3 Mb and 42.3 Mb respectively, G+C content of 41.8 and 43.1% respectively, and 14,503 and 16,448 predicted protein coding genes respectively (Amselem et al., 2011).

Cellular adaptations to low temperatures and the underlying molecular mechanisms are not fully understood but include membrane fluidity, the production of cold-acclimation and antifreeze proteins and maintenance of enzyme-catalyzed reactions and protein-protein interactions involved in essential cellular processes (Feller, 2003; Casanueva et al., 2010). Attempts to correlate protein thermal adaptation with sequence and structure derived features have accumulated with the multiplication of genomic sequencing programs. For instance, analysis of the complete predicted proteome of the psychrophilic bacterium Colwellia psychrerythraea supported the view that a psychrophilic lifestyle probably involves specific sets of genes in addition to changes in the overall genome content and amino acid composition (Methé et al., 2005). Because microorganisms are at complete thermal equilibrium with their environment, it is indeed conceivable that adaptation to low temperature leads to global alterations of proteomes in psychrophiles. Comparative genomic and metagenomic analyses in prokaryotes demonstrated that the summed frequency of amino acids Ile, Val, Tyr, Trp, Arg, Glu, Leu (IVYWREL) correlates with optimal growth temperature (Zeldovich et al., 2007). In another study on bacteria, Ala, Asp, Ser, and Thr were found to be preferred in the genome of psychrophiles compared to mesophiles, whereas Glu and Leu were less frequent (Metpally and Reddy, 2009). The analysis of amino acid usage in thermophilic fungi showed that these species indeed have a higher total frequency of IVYWREL amino acids than their mesophilic relatives, but also show significant depletion in Gly and enrichment in Arg and Ala (Van Noort et al., 2013). At the structural level, cold environments seem to release selective pressure for stable proteins, and increase selection for highly active heat-labile enzymes, relying on improved intrinsic disorder to maintain optimal conformation dynamics (Feller, 2003, 2007). Besides these seemingly general principles and given the existence of psychrophiles in lineages across the tree of life, multiple mechanisms contributing to cold adaptation may exist.

For a fungal pathogen such as S. borealis, completion of its life cycle requires successful plant colonization, and a subset of secreted virulence factors is likely involved in essential cellular processes. Besides, secreted proteins in both yeasts and mammals were shown to evolve slightly faster than intracellular proteins (Julenius and Pedersen, 2006; Liao et al., 2010), suggesting that the search for signatures of adaptation to S. borealis lifestyle may help reveal proteins essential for host and environment colonization in the Sclerotiniaceae. In this work, we focused our analysis on adaptations to S. borealis environment that lead to alterations in core functions (genes and proteins) conserved in S. sclerotiorum and B. cinerea. We analyzed a set of 5531 groups of core orthologous proteins for amino acid usage and intrinsic protein disorder patterns specifically associated with S. borealis proteins. We highlight a novel putative antifreeze protein and a novel putative lytic polysaccharide monooxygenase identified through our pipeline as candidate proteins involved in colonization of the environment. Our findings suggest that similar protein signatures associate with S. borealis lifestyle and with secretion in the Sclerotiniaceae. These signatures may be useful for identifying proteins of interest as targets for the management of plant diseases and for the bio-conversion of plant biomass.

### Results

### A Pipeline to Reveal *S. borealis* Protein Sequence Signatures in Multiple Ortholog Alignments

Several studies reported specific amino acid usage patterns and intrinsic disorder frequency in proteins from psychrophilic bacteria as compared to related mesophilic bacteria (Methé et al., 2005; Metpally and Reddy, 2009). To test whether S. borealis proteins have a distinctive pattern of amino acid usage and disorder compared to S. sclerotiorum and B. cinerea proteins, we designed a bioinformatics pipeline to process complete proteomes deduced from the whole genome sequences of these three fungal pathogens (**Figure 2**) (Amselem et al., 2011; Mardanov et al., 2014a). To exclude patterns that may be due to factors unrelated to adaptation in S. borealis proteins, we focused our analysis on core groups of orthologous proteins with one member from each species. A total of 6717 core orthologous groups (COGs) were identified using two pairwise InParanoid proteome comparisons (Ostlund et al., 2010) as explained in the Materials and Methods section and presented in **Figure 2**, covering from ∼42% (B. cinerea) to ∼66% (S. borealis) of complete predicted proteomes. We used multiple alignments of the three proteins in each COG to select S. sclerotiorum protein regions conserved in S. borealis and B. cinerea. To retrieve core protein regions conserved in all three members of COGs, we ran another round of InParanoid pairwise comparisons using conserved regions of S. sclerotiorum proteins as input. Short alignments can artificially cause strong variations in amino acid proportions. To reduce this confounding effect, we excluded alignments producing a consensus sequence shorter than 200 amino acids. We obtained a total of 5531 COG alignments matching these criteria that were processed for amino acid frequency and intrinsic protein disorder analysis.
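The length filter applied to COG alignments can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the consensus length is the number of alignment columns in which none of the three sequences carries a gap character (`-`), and the function names and toy alignment are hypothetical.

```python
MIN_CONSENSUS_LENGTH = 200  # aa, the threshold stated in the text


def consensus_length(aligned_seqs):
    """Count alignment columns in which no sequence has a gap ('-')."""
    return sum(1 for column in zip(*aligned_seqs) if "-" not in column)


def keep_alignment(aligned_seqs, min_length=MIN_CONSENSUS_LENGTH):
    """Return True if the alignment is long enough to be analyzed."""
    return consensus_length(aligned_seqs) >= min_length


# Toy three-way alignment (one row per species); illustrative only.
toy_alignment = [
    "MKT-AYLLA",
    "MKTQAYL-A",
    "MRTQAY-LA",
]
print(consensus_length(toy_alignment))
print(keep_alignment(toy_alignment))
```

With real data, the same filter would be applied to each of the COG alignments before computing amino acid and disorder frequencies.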

### *S. borealis* Proteins Show Specific Intrinsic Disorder and Amino Acid Usage Patterns Compared to Their *Sclerotiniaceae* Orthologs

To document intrinsic protein disorder and amino acid usage in Sclerotiniaceae COGs, we calculated the frequencies of each of the 20 amino acids in the aligned protein regions as well as their disorder frequencies. Disorder frequencies were determined by assigning to each amino acid of the aligned protein regions the disorder probability obtained by submitting the full-length protein to disEMBL analysis (Linding et al., 2003). The disEMBL output contained three measures of intrinsic protein disorder, designated as "Coils," "Hot loops," and "Remark465," corresponding to the probability of a residue being part of a disordered region. To test whether any of these 20 amino acid frequencies and 3 disorder metric frequencies showed a significantly different distribution in S. borealis COG aligned regions compared to S. sclerotiorum and B. cinerea, we performed pairwise Wilcoxon rank-sum tests to compare the distributions of each of the 23 properties in S. borealis and S. sclerotiorum, in S. borealis and B. cinerea, and in S. sclerotiorum and B. cinerea (Table S1). We considered that a protein property was significantly associated with S. borealis COG aligned regions when Wilcoxon rank-sum tests were significant (p < 0.05) for the S. borealis vs. S. sclerotiorum and S. borealis vs. B. cinerea comparisons but not (p > 0.05) for the S. sclerotiorum vs. B. cinerea comparison. The "Hot loops" measure of intrinsic protein disorder was found to be significantly associated with S. borealis COG aligned regions, whereas "Coils" and "Remark465" were not (**Figure 3A**). "Hot loops," corresponding to protein regions predicted not to adopt helix or strand secondary structure and having a high degree of flexibility, were found to be significantly depleted in S. borealis COG aligned regions. S. borealis proteins had a median hot loop frequency of 3.43% in COG aligned regions, vs. 3.67% in S. sclerotiorum and 3.71% in B. cinerea proteins. Regarding the frequency of amino acids, three were found to be significantly associated with S. borealis aligned COG regions. Thr frequency was significantly enriched, representing 6.00% of amino acids in S. borealis instead of 5.93% in S. sclerotiorum and 5.91% in B. cinerea proteins. Lys and Glu were significantly depleted in S. borealis. Lys represented 5.26% of amino acids in S. borealis instead of 5.41% in S. sclerotiorum and B. cinerea proteins; Glu represented 6.43% of amino acids in S. borealis instead of 6.54% in S. sclerotiorum and 6.57% in B. cinerea proteins (**Figure 3B**). These findings are consistent with the view that cold adaptation includes the directional adaptation of preexisting protein functions (intrinsic protein structure and amino acid composition) in addition to specific sets of genes conferring a psychrophilic lifestyle, such as previously reported in bacteria (Methé et al., 2005).

[Figure caption fragment] … and *S. sclerotiorum* orthologs (*p*-values shown along the Y-axis in red). Amino acid frequencies and intrinsic disorder probabilities that fell in the shaded areas were considered significantly different between *S. borealis* and the other fungi (*p* < 0.05) but not between *S. sclerotiorum* and *B. cinerea* (*p* > 0.05). These properties were considered as associated with *S. borealis* lifestyle.
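The decision rule used to call a property "associated with S. borealis" combines three pairwise test results. The sketch below encodes only that rule, assuming the three p-values have already been computed by a rank-sum test; the function name and the example p-values are hypothetical and are not values from Table S1.

```python
ALPHA = 0.05  # significance threshold stated in the text


def associated_with_s_borealis(p_sb_vs_ss, p_sb_vs_bc, p_ss_vs_bc, alpha=ALPHA):
    """Apply the three-way criterion described in the text.

    A property is retained when S. borealis differs significantly from both
    S. sclerotiorum and B. cinerea, while the latter two do not differ
    significantly from each other.
    """
    return p_sb_vs_ss < alpha and p_sb_vs_bc < alpha and p_ss_vs_bc > alpha


# Made-up p-values for illustration only.
print(associated_with_s_borealis(1e-6, 3e-5, 0.40))  # differs from both
print(associated_with_s_borealis(1e-6, 0.20, 0.40))  # differs from one only
print(associated_with_s_borealis(1e-6, 3e-5, 0.01))  # all three differ
```

The third condition filters out properties that simply vary between all species, keeping only those in which S. borealis stands apart from two otherwise similar relatives.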

### The Distribution of sTEKhot Index Is Biased in *S. borealis* Orthologous Proteins and Complete Predicted Proteome

Several studies reported biases in amino acid usage in the proteome of extremophiles and proposed indices able to discriminate proteins from extremophilic and related mesophilic organisms (Suhre and Claverie, 2003; Zeldovich et al., 2007; Wang and Lercher, 2010). To analyze the degree to which intrinsic protein disorder and amino acid usage of individual proteins matches with specific patterns identified in S. borealis predicted proteome, we designed the S. borealis T (Thr), E (Glu), K (Lys), hot (hot loops) index as follows:

$$\text{sTEKhot} = \frac{T}{E + K + hot} \tag{1}$$

where "T," "E," and "K" are the normalized frequencies of Thr, Glu and Lys respectively in a given protein sequence, and "hot" is the normalized frequency of hot loops in this sequence. We computed the sTEKhot index for each protein in the predicted proteomes of S. borealis, S. sclerotiorum, and B. cinerea. First, we compared the distribution of sTEKhot values in COGs by plotting all values in a ternary plot (**Figure 4A**). This revealed that sTEKhot values are distributed along an axis pointing toward the S. borealis angle, clearly showing that sTEKhot values of S. borealis orthologs are major contributors to the structure of the dataset. There were 692 COGs in which the S. borealis sTEKhot value accounted for >40% of the total sTEKhot for the COG, but only 388 and 345 in which S. sclerotiorum and B. cinerea sTEKhot values respectively accounted for >40% of the total sTEKhot for the COG (**Figure 4A**). Consistently, S. borealis has the highest sTEKhot value in 42.7% of COGs (2761), whereas S. sclerotiorum and B. cinerea have the highest sTEKhot value in 28.3% (1845) and 28.8% (1865) of the COGs respectively (**Figure 4B**).
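Equation (1) can be illustrated with a short sketch. It assumes that "normalized frequency" means the fraction of residues in the sequence (for T, E, and K) and the fraction of residues flagged as hot loops (for "hot"); the per-residue boolean mask standing in for disEMBL output, the function name, and the toy sequence are hypothetical.

```python
import math


def stekhot(sequence, hot_loop_mask):
    """Compute T / (E + K + hot) for one protein, following Equation (1).

    sequence       one-letter amino acid string
    hot_loop_mask  one boolean per residue, True where the residue is
                   predicted to be in a hot loop (assumed input format)
    """
    n = len(sequence)
    if n == 0 or len(hot_loop_mask) != n:
        raise ValueError("sequence and mask must be non-empty and equal length")
    t = sequence.count("T") / n
    e = sequence.count("E") / n
    k = sequence.count("K") / n
    hot = sum(hot_loop_mask) / n
    denominator = e + k + hot
    if denominator == 0:
        return math.nan  # index undefined when E, K, and hot are all absent
    return t / denominator


# Toy protein: 3 Thr, 1 Glu, 1 Lys, and 1 hot-loop residue out of 10.
toy_sequence = "TTTEKAAAAA"
toy_mask = [False] * 9 + [True]
value = stekhot(toy_sequence, toy_mask)
print(round(value, 3))
```

In this toy case the numerator (0.3) equals the denominator (0.1 + 0.1 + 0.1), so the index is approximately 1, which is the threshold the text later uses to select high-index proteins.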

At the whole proteome level, the sTEKhot median was 0.366 in S. borealis, but only 0.314 in S. sclerotiorum and 0.313 in B. cinerea (**Figure 4C**, Table S2). The overall sTEKhot distributions were significantly different when comparing S. borealis to the two other species (p < 5.1e−104) but not when comparing S. sclerotiorum to B. cinerea (p = 0.84). However, a subset of S. sclerotiorum and B. cinerea proteins appeared to have high sTEKhot values. Indeed, as mentioned previously, S. sclerotiorum and B. cinerea each account for the highest sTEKhot in ∼30% of the COGs. Furthermore, the proportion of proteins with a sTEKhot > 1 was 6.2% in S. borealis, 4.6% in S. sclerotiorum and 5.0% in B. cinerea. This suggests that the general pattern of intrinsic protein disorder and amino acid usage observed in S. borealis proteins may be shared by a subset of the S. sclerotiorum and B. cinerea predicted proteomes.

To verify that the sTEKhot index was an optimized combination of intrinsic protein disorder and amino acid usage measures to discriminate the proteome of S. borealis from that of S. sclerotiorum and B. cinerea, we randomly shuffled the 23 measures for intrinsic protein disorder and amino acid usage in Equation (1) and calculated the proteome median value for shuffled indices in S. borealis, S. sclerotiorum, and B. cinerea (Table S3). In 300 shuffling iterations, the p-value for the difference between the distribution of the shuffled index in S. borealis and S. sclerotiorum or B. cinerea was < 5.1e−104 (highest observed p-value) in only 6 instances. The median shuffled index value for the S. borealis proteome was higher than the observed sTEKhot median in only 2 instances out of 300 (0.6%). Wilcoxon rank tests comparing the distribution of random medians to the real sTEKhot median showed p < 4.72e−33 in the three species. The results of these simulations indicate that sTEKhot clearly departs from random in describing specific intrinsic protein disorder and amino acid usage patterns in S. borealis proteins.
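One way to read the shuffling control is sketched below. It assumes that each iteration draws four distinct properties at random from the 23 available measures, places the first in the numerator and the other three in the denominator of Equation (1), and records the proteome median of the resulting index. The exact procedure is not detailed in the text, so the function names, the property labels, and the random toy proteome are all hypothetical.

```python
import random
import statistics

# 20 amino acids plus 3 disorder measures = 23 properties (labels assumed).
PROPERTY_NAMES = list("ACDEFGHIKLMNPQRSTVWY") + ["coils", "hot", "rem465"]


def index_value(props, numerator, denominators):
    """Generic form of Equation (1): one property over the sum of three."""
    denom = sum(props[name] for name in denominators)
    return props[numerator] / denom if denom > 0 else None


def proteome_median(proteome, numerator, denominators):
    """Median index over all proteins for which the index is defined."""
    values = (index_value(p, numerator, denominators) for p in proteome)
    return statistics.median(v for v in values if v is not None)


def count_shuffles_above(proteome, observed_median, iterations=300, seed=0):
    """Count shuffled indices whose proteome median exceeds the observed one."""
    rng = random.Random(seed)
    names = sorted(proteome[0])
    above = 0
    for _ in range(iterations):
        numerator, *denominators = rng.sample(names, 4)
        if proteome_median(proteome, numerator, denominators) > observed_median:
            above += 1
    return above


# Toy proteome with random property frequencies; illustrative only.
toy_rng = random.Random(1)
toy_proteome = [
    {name: toy_rng.uniform(0.01, 0.10) for name in PROPERTY_NAMES}
    for _ in range(50)
]
observed = proteome_median(toy_proteome, "T", ["E", "K", "hot"])
n_above = count_shuffles_above(toy_proteome, observed, iterations=300, seed=2)
print(n_above, "of 300 shuffled indices exceed the observed median")
```

On random toy data the count carries no biological meaning; with real proteomes, a small count would indicate that few alternative combinations of properties reach a median as high as sTEKhot.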

### Secreted Enzymes are Enriched among *S. borealis* Proteins with High sTEKhot

To identify protein functions important for adaptation to S. borealis environment, we analyzed annotations of proteins with a sTEKhot value higher than 1 in the S. borealis proteome. Overall, 4794 (47%) S. borealis proteins had no Gene Ontology (GO) annotation assigned. There were 635 proteins with sTEKhot > 1, among which 349 (55%) had no GO annotation. We looked for GO term enrichment in the 635 S. borealis proteins with sTEKhot > 1 compared to all annotated proteins. Forty-two GO terms appeared significantly enriched (p < 0.05) among proteins with sTEKhot > 1, including 16 leaves (GO terms with no child term) of the GO network (**Figure 5**). GO terms for "cellular component" enriched in proteins with sTEKhot > 1 included extracellular and cell wall compartments. Consistently, enriched "biological processes" and "molecular functions" related to secreted enzymes involved in cell wall modification (glycosyl hydrolases and carboxylic ester hydrolases, among which are pectinesterases and cutinases) and binding to cellulose. Cellulose is a major component of plant cell walls that fungal pathogens are able to detect and bind. In addition, the aerial parts of plants are protected by a cuticle composed of cutin. Fungal pathogens are able to hydrolyze cutin through cutinases, thus facilitating host colonization. Proteins involved in carbohydrate metabolism were also enriched among proteins with sTEKhot > 1. These functions are associated with colonization of the environment, especially plant-associated environments. Similar enrichments were observed when looking at GO annotations for S. sclerotiorum and B. cinerea proteins harboring a sTEKhot > 1 (**Figures S1**, **S2**). In addition, the copper ion binding GO term was found to be enriched in S. sclerotiorum and B. cinerea.

### Secreted Proteins Have Higher sTEKhot Than Non-secreted Proteins in the Three *Sclerotiniaceae* Species

The enrichment of extracellular proteins among proteins with sTEKhot > 1 prompted us to compare the distribution of sTEKhot for secreted and non-secreted proteins in the Sclerotiniaceae. We considered as predicted secreted proteins those identified as secreted with SignalP 4.0 no-TM network and as extracellular by WoLF PSORT. This produced lists of 667, 661, and 748 predicted secreted proteins (secretome) for S. borealis, S. sclerotiorum, and B. cinerea respectively. In all three fungal species, secreted proteins had significantly higher sTEKhot values than non-secreted proteins, with median sTEKhot values for secreted proteins of 1.13 in S. borealis, 1.06 in S. sclerotiorum and 1.08 in B. cinerea (**Figure 6A**). The distribution of sTEKhot in secreted proteins was significantly higher than its distribution in non-secreted proteins, with p-values of 8.8e−<sup>239</sup> in S. borealis, 9.1e−<sup>265</sup> in S. sclerotiorum and 4.1e−<sup>275</sup> in B. cinerea. To evaluate the likelihood of obtaining such distributions with other intrinsic protein disorder and amino acid usage parameters, we randomly shuffled the 23 measures for intrinsic protein disorder and amino acid usage in Equation (1), and calculated shuffled indices for each protein in the predicted secretome in the three species. In 300 rounds of shuffling, the median secretome index was higher than the observed median secretome sTEKhot in 3, 1 and 1 instances for S. borealis, S. sclerotiorum and B. cinerea respectively (Table S3).

Remarkably, although secreted proteins account for 6.5% of the total proteome in S. borealis, 4.5% in S. sclerotiorum and 4.5% in B. cinerea, the proportion of secreted proteins among those with sTEKhot > 1.5 rose to 76.9% (206 out of 268) in S. borealis, 68.2% (182 out of 267) in S. sclerotiorum and 65.0% (206 out of 317) in B. cinerea, representing a ∼13.6-fold enrichment in secreted proteins (**Figure 6B**). These results suggest that intrinsic protein disorder and amino acid usage patterns associated with S. borealis lifestyle and with secretion are largely overlapping in the Sclerotiniaceae.
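The fold enrichment quoted above is the proportion of secreted proteins in the high-sTEKhot subset divided by their proportion in the whole proteome. The sketch below reproduces that arithmetic from the counts and rounded percentages given in the text; the function name `fold_enrichment` is ours and is only illustrative.

```python
def fold_enrichment(hits_in_subset, subset_size, background_fraction):
    """Proportion of hits in a subset divided by the background proportion."""
    return (hits_in_subset / subset_size) / background_fraction


# (secreted among sTEKhot > 1.5, proteins with sTEKhot > 1.5,
#  secreted fraction of the whole proteome), as quoted in the text.
species = {
    "S. borealis": (206, 268, 0.065),
    "S. sclerotiorum": (182, 267, 0.045),
    "B. cinerea": (206, 317, 0.045),
}

folds = {name: fold_enrichment(*values) for name, values in species.items()}
mean_fold = sum(folds.values()) / len(folds)

for name, value in folds.items():
    print(f"{name}: {value:.1f}-fold")
print(f"mean: {mean_fold:.1f}-fold")
```

Because the whole-proteome percentages used here are rounded, the mean obtained this way may differ slightly from the ∼13.6-fold value reported above.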

#### FIGURE 6 | Continued

*S. borealis*, *S. sclerotiorum* and *B. cinerea*. (B) Proportion of predicted secreted proteins according to sTEKhot cutoff values. In complete proteomes (sTEKhot ≥ 0), the proportion of secreted proteins is ∼5% in all three fungal proteomes, whereas among proteins with sTEKhot ≥ 1.5 (dotted line) it reaches an average ∼70%. (C) Proportion of whole proteomes and proteins with sTEKhot > 1.5 that are secreted, contain GPI-anchors, are N-glycosylated or contain transmembrane (TM) domains. Enrich., enrichment fold among sTEKhot > 1.5 as compared to whole proteomes.

To independently validate this observation, we compared the distribution of all amino acid frequencies and the distribution of the three intrinsic protein disorder measures used previously in secreted and non-secreted proteins from the three fungal species. We considered that a protein property is associated with secretion when the null hypothesis of the Wilcoxon rank-sum test (distribution of the property not different between secreted and non-secreted proteins) could be rejected with p < 0.05 for all three fungal species. Among the 23 measures for protein disorder and amino acid usage, 21 could be significantly associated with fungal secretomes, supporting the view that function outside the cell imposes specific constraints on amino acid usage in secreted proteins, such as evolution toward reduced synthetic cost of proteins (Smith and Chapman, 2010). Similar to patterns associated with S. borealis lifestyle, we found that enrichment in Thr, depletion in Glu and reduced frequency of hot loops disorder are among the properties most significantly associated with secretion (p-values ranging from 7.62e−<sup>3</sup> to 2.67e−<sup>194</sup>) (Table S4).
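The decision rule described above (a property is retained only if the rank-sum null hypothesis is rejected in all three species) can be sketched as follows. This is a minimal illustration, not the code used in the study: it relies on the large-sample normal approximation of the rank-sum statistic, without tie or continuity correction, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(sample_a, sample_b):
    """Two-sided rank-sum p-value (normal approximation, no tie correction)."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def associated_with_secretion(per_species_samples, alpha=0.05):
    """True only if the null hypothesis is rejected in every species.

    `per_species_samples` holds one (secreted, non_secreted) pair of
    property values per species.
    """
    return all(
        rank_sum_p_value(secreted, non_secreted) < alpha
        for secreted, non_secreted in per_species_samples
    )
```

With proteome-sized samples the normal approximation is usually adequate; for small samples an exact test from a statistics package would be preferable.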

We considered several hypotheses to explain the observed common signatures for S. borealis lifestyle and secretion. First, we envisaged that prevalence of secreted proteins in COGs may have biased signatures of S. borealis lifestyle toward properties associated with secretion. However, ratios of secreted proteins in COG sets were similar to those observed for total proteomes (7% in S. borealis, 6.7% in S. sclerotiorum and 6.4% in B. cinerea proteins from the set of 5531 COGs). Furthermore, we excluded COGs that comprised secreted proteins and tested for amino acid usage patterns associated with S. borealis proteins as previously. Amino acids enriched in S. borealis proteins included Thr and amino acids depleted in S. borealis included Glu and Lys (p < 0.05), similar to what we found in our initial analysis taking all COGs into account. In addition, we also found His enriched in S. borealis sequences and Asn depleted (p < 0.05). We conclude that the detection of a bias in the usage of these amino acids in S. borealis proteins was not due to the abundance of secreted proteins in COGs (Table S5). Second, we hypothesized that intrinsic protein disorder and amino acid usage in secreted proteins might be due to signal peptide regions. To test this, we analyzed protein properties associated with mature secreted proteins (signal peptide region removed). We found that mature secreted proteins had significantly higher sTEKhot than the rest of the proteome (p < 2.4e−<sup>232</sup>), similar to what we found with full length secreted proteins (**Figure S3**). Therefore high sTEKhot in secretomes is not due to the signal peptide sequence. Third, we considered that high sTEKhot in secretomes could arise if secretomes were less divergent than the rest of the proteomes, leading to the S. borealis signature being more conserved in secreted proteins of S. sclerotiorum and B. cinerea. To test this, we analyzed the distribution of similarity between S. borealis proteins and their closest homologs in S. sclerotiorum and B. cinerea. Whereas the average BLASTP score was 630.9 for S. borealis non-secreted proteins aligned with their closest homolog in S. sclerotiorum, this average score was 521.6 for S. borealis secreted proteins (**Figure S4**). This indicates that globally, the S. borealis secretome is more divergent from the S. sclerotiorum proteome than S. borealis non-secreted proteins are. A similar tendency was observed when comparing S. borealis and B. cinerea proteomes. The high sTEKhot average observed in Sclerotiniaceae secretomes is therefore not due to higher similarity in secretomes compared to non-secreted proteins.

To test whether proteins with high sTEKhot could be enriched in other types of motifs, we predicted glycosylphosphatidylinositol (GPI) anchors, transmembrane (TM) domains and N-glycosylation sites in the proteome of S. borealis, S. sclerotiorum and B. cinerea. We found an average of 5.0% of proteins with GPI-anchors, 9.9% proteins with TM domains and 3.8% of proteins with >10 predicted N-glycosylation sites in the Sclerotiniaceae species (Table S6, **Figure 6C**). As compared to whole proteomes, the list of proteins with sTEKhot >1.5 showed an average 7.1-fold enrichment in proteins with GPI-anchors, 2.1-fold enrichment in proteins with >10 predicted N-glycosylation sites and no enrichment in proteins with TM domains (**Figure 6C**). Secreted proteins showed the strongest enrichment among proteins with sTEKhot >1.5. Overall these analyses suggest that a significant overlap exists between the constraints imposed on protein sequence by adaptation to S. borealis lifestyle and to secretion in the Sclerotiniaceae.

### *S. sclerotiorum* Genes Encoding Proteins with High sTEKhot are Enriched in Genes Induced *in planta*

To further support the association between high sTEKhot index and colonization of the environment, and particularly host plants, we analyzed the distribution of sTEKhot values in S. sclerotiorum genes differentially regulated in planta. For this, we took advantage of S. sclerotiorum microarray gene expression data generated by Amselem et al. from infected sunflower cotyledons (Amselem et al., 2011). In this dataset, out of 14 503 predicted protein coding genes, 615 were induced at least two-fold during infection of sunflower (4.31%) and 458 were down-regulated at least two-fold (3.21%). The proportion of genes induced in planta reached 27.1% of S. sclerotiorum genes encoding proteins with sTEKhot ≥ 2, representing a ∼6.3-fold enrichment (**Figure 7**). The proportion of genes down-regulated in planta reached 12.1% of S. sclerotiorum genes encoding proteins with sTEKhot ≥ 2, representing a ∼3.8-fold enrichment. S. sclerotiorum proteins with sTEKhot > 1 include six proteins with a CFEM domain, a Cys-rich domain with a proposed role in fungal pathogenesis, two proteins with a cerato-platanin domain, one of which is the ortholog of the B. cinerea pathogen associated molecular pattern BcSpl1 (Frías et al., 2011), 27 proteins with a pectin lyase fold found in Aspergillus virulence factors (Mayans et al., 1997), and 29 out of 78 effector candidates proposed by Guyon et al. (2014). These findings are consistent with an important role in the colonization of the host plant for some proteins with high sTEKhot values.

### High sTEKhot Index and Secretion Signal Reveal Candidate Proteins Associated with Colonization of the Environment

To illustrate the value of the sTEKhot index for the exploration of the proteome of fungi from the Sclerotiniaceae, we analyzed in detail the sequence of two proteins with high sTEKhot but with no assigned function. Over the three proteomes analyzed, S. borealis SBOR\_9046 had the highest sTEKhot (10.01). In S. sclerotiorum, its ortholog is SS1G\_10836, which ranked as the 5th highest sTEKhot in S. sclerotiorum (7.34). In B. cinerea, its ortholog is BC1G\_03854, which ranked as the 23rd highest sTEKhot in B. cinerea (4.29). No InterProScan domain or GO terms were associated with these proteins of 171 amino acids (except SS1G\_10836, which is 173 amino acids long). To get insights into their putative function, we performed protein structure modeling and fold recognition using the I-TASSER server (Zhang, 2008). The closest structural analog was the antifreeze protein Maxi from winter flounder (Pseudopleuronectes americanus) (Sun et al., 2014). Although sequence similarity with Maxi was limited (from 15.2% identity for SBOR\_9046 to 16.2% identity for SS1G\_10836), superimposition of the SS1G\_10836 predicted structure with the Maxi structure showed a Root Mean Square Deviation < 2.3Å and a TM-score of 0.875, indicating structural similarity deviating significantly from random (**Figures 8A,B**). Analysis of SBOR\_9046, SS1G\_10836 and BC1G\_03854 sequences by TargetFreeze (He et al., 2015) supported their prediction as antifreeze proteins. The Sclerotiniaceae proteins contain four Cys residues located in the kink of the predicted structures that may stabilize folding, although these residues were not predicted to form disulfide bonds by Disulfind (Ceroni et al., 2006). Some antifreeze proteins have been reported to rely on disulfide bonds for folding (Basu et al., 2015) whereas others do not (Kondo et al., 2012; Sun et al., 2014). Like other known fungal antifreeze proteins (Kondo et al., 2012), but unlike Maxi, SBOR\_9046 and its orthologs are predicted to be secreted. A unique feature of Maxi among antifreeze proteins is the presence of ice-binding residues buried within the four-helix bundle instead of exposed on its surface (Sun et al., 2014). A prediction of the SS1G\_10836 dimer structure supports the existence of rather hydrophilic pockets buried within the four-helix bundle, suggesting that the mechanism of ice binding of Maxi could be conserved in SS1G\_10836 and its orthologs (**Figure 8C**). To get insights into SS1G\_10836 function, we analyzed the expression of the corresponding gene in mycelium grown in Potato Dextrose Broth (PDB), during the colonization of Arabidopsis plants and in sclerotia by quantitative RT-PCR. This revealed a 3.3-fold induction (log<sub>2</sub> = 1.7) specific to sclerotia (**Figure 8F**). Since sclerotia overwinter in the soil, putative antifreeze proteins may contribute to survival of these structures both in arctic and temperate climates.

The COG including SS1G\_03146, BC1G\_07573, and SBOR\_1255 is remarkable for including three proteins with high (>1) but very variable sTEKhot, ranging from 1.58 (SS1G\_03146) to 7.07 (BC1G\_07573). No InterProScan domain or GO terms were associated with these proteins of 223 amino acids on average, but all three were predicted to include an N-terminal signal peptide for secretion. To get insights into their putative function, we performed protein structure modeling and fold recognition using the I-TASSER server (Zhang, 2008). The closest structural analog was Aspergillus oryzae AA11 (AoAA11) Lytic Polysaccharide Monooxygenase (LPMO) (Hemsworth et al., 2014). Although sequence similarity with AoAA11 was limited (from 9.6% identity for SBOR\_1255 to 10.9% identity for SS1G\_03146), superimposition of the SS1G\_03146 predicted structure with the AoAA11 structure showed a Root Mean Square Deviation < 3.1Å and a TM-score of 0.677, indicating structural similarity deviating significantly from random (**Figures 8D,E**). Similar to the Sclerotiniaceae proteins, full length AoAA11 (accession number XM\_001822611) harbors an N-terminal signal peptide. AoAA11, SBOR\_1255, and BC1G\_07573 feature two conserved predicted disulfide bonds, whereas SS1G\_03146 is predicted to contain only one (**Figure 8D**). The catalytic triad of AoAA11 appears well conserved in the Sclerotiniaceae proteins, with the exception of the catalytic Tyr replaced by a Ser in SS1G\_03146 (**Figure 8D**). LPMOs are enzymes oxidizing recalcitrant polysaccharides such as cellulose, starch and chitin. They present excellent potential for use in biomass conversion and the production of biofuels. Aspergillus oryzae AA11 represents a new class of LPMOs that include a putative chitin-binding domain (Hemsworth et al., 2014). We analyzed the expression of the SS1G\_03146 gene in mycelium grown in PDB, during the colonization of Arabidopsis plants and in sclerotia by quantitative RT-PCR. This revealed up to 9.5-fold induction (log<sub>2</sub> = 3.25) during plant infection (**Figure 8F**). This suggests that SS1G\_03146 may be involved in colonization of the plant, but functional analysis will be required to determine its actual role.

FIGURE 8 | Candidate proteins associated with colonization of the environment identified based on high sTEKhot values. (A) Multiple protein sequence alignment of *B. cinerea* BC1G\_03854 (sTEKhot = 4.29), *S. borealis* SBOR\_9046 (sTEKhot = 10.01), *S. sclerotiorum* SS1G\_10836 (sTEKhot = 7.34) and the hyperactive Type I antifreeze protein "Maxi" from *Pseudopleuronectes americanus* (4KE2\_A). (B) Superimposition of Maxi antifreeze protein structure (tan) and SS1G\_10836 model structure (rainbow). (C) Surface hydrophobicity of SS1G\_10836 model dimer. Dotted line corresponds to the position of the section shown on the right, illustrating the characteristic hydrophilic inner core of the dimer. (D) Multiple protein sequence alignment of *B. cinerea* BC1G\_07573 (sTEKhot = 7.07), *S. borealis* SBOR\_1255 (sTEKhot = 3.79), *S. sclerotiorum* SS1G\_03146 (sTEKhot = 1.58) and the AA11 Lytic Polysaccharide Monooxygenase from *Aspergillus oryzae* (4MAH\_A). (E) Superimposition of *A. oryzae* AA11 structure (tan) and SS1G\_03146 model structure (rainbow). (F) *SS1G\_10836* and *SS1G\_03146* gene expression *in vitro* (PDB, Potato Dextrose Broth), during colonization of *Arabidopsis thaliana* (lesion periphery and lesion center) and in sclerotia. Error bars show standard error of the mean from two independent biological replicates.

Based on these predicted functions, we propose that SS1G\_10836 and SS1G\_03146 have important functions in the colonization of the environment, the identification of which was facilitated by the implementation of the sTEKhot index. Functional studies will be required to test predicted functions of these proteins. Furthermore, these two proteins have predicted properties that may be exploited for biotechnology purposes.

### Discussion

Understanding how fungal plant pathogens colonize their environment, including their host plants, is critical for food security and the sustainable management of ecosystems (Roux et al., 2014). In particular B. cinerea and S. sclerotiorum are threatening hundreds of plant species and important crop species in the majority of regions of the globe. Fungi also represent a remarkable reservoir of enzymes with very diverse catalytic abilities that are employed in industrial processes. We have conducted a comparative analysis of the proteome and secretome of fungal species from the Sclerotiniaceae revealing common principles of sequence optimization for secreted proteins.

In the present study we designed a bioinformatics pipeline aiming at identifying species-specific patterns of amino acid usage and intrinsic protein disorder in the proteome of closely related species. We applied this pipeline to agriculturally important fungal pathogens from the Sclerotiniaceae family to reveal specific signatures associated with S. borealis lifestyle. Compared to S. sclerotiorum and B. cinerea orthologs, we observed in S. borealis proteins a significant increase in Thr usage and a significant decrease in Glu and Lys usage. To minimize the impact of phylogenetic distance on the definition of the S. borealis sequence signature, we restricted our analysis to species from the Sclerotiniaceae family and we discarded any sequence signature differing significantly between S. sclerotiorum and B. cinerea. It is also worth noting that S. borealis, S. sclerotiorum and B. cinerea have a very similar G+C content, so that G+C bias is not expected to have an impact on the differential usage of amino acids. Specific trends in amino acid composition have been reported to associate with protein stability at extreme temperatures. Given the diversity of ecological groups including psychrophiles, it has been challenging to identify universal trends in amino acid usage associated with cold adaptation (Casanueva et al., 2010). Enrichment in Thr has been reported in solvent-accessible areas of proteins from two cold-adapted Archaea (Goodchild et al., 2004) and in proteins from several psychrophilic bacteria (Metpally and Reddy, 2009). This was proposed to reduce surface charge while minimizing risk of aggregation (Goodchild et al., 2004). Frequent substitutions of Glu were observed in exposed sites of selected psychrophilic enzymes (Gianese et al., 2001) and more generally in the proteome of the psychrophilic archaeon Halorubrum lacusprofundi (Dassarma et al., 2013). 
Glu is also part of a set of amino acids shown to correlate significantly with optimal growth temperature of prokaryotes (Zeldovich et al., 2007). Specific signatures of amino acid usage we found in S. borealis are therefore consistent with some previous observations made for psychrophilic proteins. Nevertheless, our approach does not allow dissociating psychrophily and other specific life traits of S. borealis (specific host range, geographic habitat) as drivers of the observed protein signatures. We observed a reduction in the frequency of intrinsic disorder in hot loops in S. borealis proteins. By contrast, cold adapted enzymes were often reported to harbor low conformational stability to maintain high reaction rates at low temperature (Feller, 2007; Casanueva et al., 2010) and intrinsically disordered proteins were shown to be more resistant to cold than globular proteins (Tantos et al., 2009). A global study of intrinsic protein disorder in 332 prokaryotes showed however that psychrophilic bacteria have a lower level of intrinsic disorder than mesophiles, although this was proposed to be due to the loss of cellular functions relying on intrinsically disordered proteins (Burra et al., 2010). This analysis also supports the view that adaptations to S. borealis lifestyle include directional changes in the sequence of conserved proteins, in addition to possible gene gains and losses that have not been analyzed in this work.

Enrichment analyses revealed that signatures associated with S. borealis lifestyle are frequent in plant cell wall degrading enzymes, carbohydrate binding domain containing proteins and ion binding proteins. More generally, secreted proteins showed high sTEKhot values in S. borealis, S. sclerotiorum and B. cinerea. The proportion of predicted secreted proteins reaches over 75% of S. borealis proteins with sTEKhot > 1.5 and the proportion of proteins encoded by in planta-induced genes reaches over 27% of S. sclerotiorum proteins with sTEKhot > 2, suggesting that sTEKhot may be a useful criterion to identify proteins associated with environmental adaptation or potential virulence factors. More specifically, there were 117 proteins predicted to be secreted and harboring a sTEKhot > 1.5 with no annotation in S. sclerotiorum that could include uncharacterized virulence factors. Although some classes of protein effectors from bacteria and oomycete pathogens can be identified relatively easily thanks to conserved N-terminal sequence signals, this strategy has proven limited for fungal pathogens. Alternative bioinformatics approaches have been developed exploiting known effector properties for searching effector candidates in the secretome of fungal pathogens (Saunders et al., 2012; Guyon et al., 2014). Typical effector properties include the presence of an N-terminal secretion signal, small protein size, high Cys content, the absence of characterized protein domains, and a high rate of non-synonymous over synonymous substitutions (Hacquard et al., 2012; Saunders et al., 2012; Persoons et al., 2014; Sperschneider et al., 2014). However, validated virulence factors do not all comply with these properties, such as Verticillium dahliae isochorismatase VdIsc1 harboring an isochorismatase domain but no conventional secretion signal (Liu et al., 2014) or Melampsora lini AvrM that lacks any Cys (Catanzariti et al., 2006).

Amino acid composition is a feature used to predict candidate bacterial effectors. Positive charge, richness in alkaline (H, R, K) amino acids and Glu in the 30 C-terminal amino acids is for instance a property often found in type IV secreted effectors (Meyer et al., 2013; Zou et al., 2013; Wang et al., 2014). In Pseudomonas syringae, amino acid biases and patterns at the N-terminus were used to identify type III effector candidates. Enrichment in Thr and depletion in Leu is a characteristic of bacterial type III proteins secreted into animal and plant cells, although high sequence variability and high tolerance of mutations make these properties difficult to generalize (Arnold et al., 2009; McDermott et al., 2011; Schechter et al., 2012). To identify novel effectors in Fusarium sp., Stagonospora nodorum, and Puccinia graminis f.sp. tritici fungi, Sperschneider et al. performed unsupervised clustering based on 35 sequence-derived features, including amino acid composition (Sperschneider et al., 2013, 2014). Several clusters were characterized by strong biases in amino acid usage, such as the cluster including the three S. nodorum effectors SnToxA, SnTox1 and SnTox3 enriched in small and non-polar amino acids and the cluster including F. oxysporum f. sp. lycopersici SIX3 featuring high average positive protein charge and a significantly higher percentage of Pro, Ser and Thr (Sperschneider et al., 2013). Similarly, secreted effectors of fungi from the Sclerotiniaceae family could be enriched in Thr and depleted in Glu and Lys compared to the rest of the proteome. This suggests that amino acid usage bias is a property that may be shared by sets of secreted proteins with unrelated function and from distant pathogen lineages. Consistent with Glu and Lys being disorder-promoting amino acids, we found that secreted proteins of Sclerotiniaceae species show lower disorder frequency in hot loops than the rest of the proteome. 
Effectors of bacterial pathogens were shown to be highly enriched in long disordered regions, presumably to facilitate effector translocation into the host cell, host function mimicry and evasion of the host immune system (Marín et al., 2013). Intrinsic protein disorder was shown to promote high specificity and low affinity protein-ligand interactions (Zhou, 2012; Chu and Wang, 2014). While these properties could be advantageous for host-specific effectors of biotrophic pathogens, for which avoiding detection by the host is critical, opposite requirements may shape the evolution of effectors from broad-range necrotrophic pathogens. Indeed, a relatively low specificity may allow effectors to function during colonization of diverse host species. It is also believed that detection by the host would not be detrimental, and could even be beneficial, to some necrotrophic plant pathogens (Govrin and Levine, 2000). In that case, effectors with high affinity for their targets would not be counter-selected by the host immune system, and would instead favor Sclerotiniaceae fungi in the competition with other microbes for plant-derived resources.

Cross-species comparative analysis has been successfully applied to the identification of novel and specialized virulence mechanisms on the one hand, and to the identification of optimization principles governing the evolution of proteins under given constraints on the other hand. In nature, S. borealis proteins have undergone optimization under specific environmental constraints, including cold, over a time span that cannot be reproduced at the scale of a human life. Comparative genomics approaches therefore have the potential to reveal protein specialization and optimization principles that are not easily accessible through experimental evolution. Indeed, selecting optimized enzyme variants, especially for thermostability, through random mutagenesis often requires exploring a large library of mutants or experimental setups maintaining an appropriate selection pressure to collect the optimized variants (Kuchner and Arnold, 1997; Lebbink et al., 2000). Comparative genomics can accelerate discoveries usually relying on time-consuming screens (Xiao et al., 2008). The biochemical properties of cold-active proteins make them attractive in biochemical, bioremediation, and industrial processes, notably for food, biofuels and pharmaceutical production (Cavicchioli et al., 2011). Plant pathogenic fungi in particular present a vast reservoir of biopolymer degrading enzymes adapted to a wide range of temperatures and environments. Functional analyses will be required to test whether the candidates highlighted in this work have applied potential. In the long term, the analysis of optimization principles governing the evolution of secreted proteins from important fungal pathogens may prove useful in improving plant health through the design of crops resistant to broad host range pathogens and to cold stress, and in developing novel strategies for the production of renewable energy relying on the bio-conversion of plant biomass.

### Materials and Methods

### Genome Sources

We retrieved three predicted proteomes (Sclerotinia sclerotiorum v1.0, Botrytis cinerea v1.0 and Sclerotinia borealis F-4157) from the Joint Genome Institute (http://jgi.doe.gov/) and NCBI (http://www.ncbi.nlm.nih.gov/) in fasta format. As a cautionary note: the proteome sequences that form the basis of our analyses had originally been predicted by various techniques and may thus be of varying quality and completeness. S. sclerotiorum gene expression data was obtained from http://urgi.versailles.inra.fr/Data/.

### Gene Ontology Annotation and Enrichment Analysis

The Gene Ontology was collected from the Gene Ontology Consortium website (http://geneontology.org/) in obo format. Assignment of Gene Ontology annotations to the three sets of protein sequences was performed using InterProScan (Jones et al., 2014). GO enrichment analysis was performed using the Biological Networks Gene Ontology plug-in (Maere et al., 2005) in Cytoscape 3.2.1 with the following parameters: a hypergeometric test for statistical analysis with a Bonferroni Family-Wise Error Rate correction and a significance level of 0.05.
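For readers unfamiliar with this type of test, the statistical principle can be sketched as below. This is not the plug-in's implementation: it is a minimal illustration of an upper-tail hypergeometric test followed by a Bonferroni correction, and the function names and the example counts in the comments are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, sample, observed):
    """P(X >= observed) for a draw of `sample` proteins without replacement.

    population: number of annotated proteins in the reference set
    annotated:  number of those carrying the GO term of interest
    sample:     number of proteins in the tested subset
    observed:   number of proteins in the subset carrying the GO term
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def enriched_terms(term_counts, population, sample, alpha=0.05):
    """Return GO terms whose corrected p-value is below alpha.

    `term_counts` maps a GO term to (annotated, observed) counts.
    """
    terms = list(term_counts)
    raw = [
        hypergeom_upper_tail(population, term_counts[t][0], sample, term_counts[t][1])
        for t in terms
    ]
    return [t for t, p in zip(terms, bonferroni(raw)) if p < alpha]
```

The Bonferroni correction is conservative: with many GO terms tested, only strong enrichments remain significant at the 0.05 level.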

### Ortholog Prediction

Ortholog prediction was performed with standalone InParanoid 4.0 (Ostlund et al., 2010) using all vs. all Basic Local Alignment Search Tool (BLAST) algorithms and the following parameters: the BLOSUM62 matrix, a score cut-off of 50 bits and a minimal sequence overlap area of 0.5 (Altschul et al., 1990; Remm et al., 2001). Two pairwise InParanoid comparisons (S. borealis vs. S. sclerotiorum and S. borealis vs. B. cinerea) were run first on complete proteomes, leading to the identification of 6717 COGs, then using only conserved regions of S. sclerotiorum proteins ("overlapping regions") as input (**Figure 2**). Finally, alignments producing a consensus sequence shorter than 200 amino acids were excluded, leading to 5531 COGs.

### Pipeline for Collecting Multiple Ortholog Alignments

First, ortholog predictions were performed as described in the previous section between one organism, hereafter called the reference organism (S. sclerotiorum), and each of the other organisms included in the analysis (B. cinerea and S. borealis). Only core groups of orthologous proteins harboring one member from each species were retained. Then, the sequences of the reference organism that overlap with those of all other organisms were selected according to BLAST begin and end alignment positions. The maximal begin and the minimal end were used to define the overlapping sequences. Overlapping sequences shorter than 200 amino acids were excluded. The overlapping sequences obtained in the reference organism were used to run a new round of ortholog prediction with each of the other organisms. The consensus sequences, or core ortholog group alignments, in each organism were selected according to BLAST begin and end alignment positions, using the minimal begin and the maximal end obtained across all predicted orthologs. Consensus sequences shorter than 200 amino acids were excluded.
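The coordinate selection described in this section can be illustrated with a short sketch. It assumes that alignment positions are 1-based and inclusive, as in BLAST tabular output; the function names and the `MIN_LENGTH` constant are ours and not part of the original pipeline.

```python
MIN_LENGTH = 200  # amino acids, the cutoff stated in the text


def overlapping_region(hits):
    """Region of the reference protein shared by all alignments.

    `hits` is a list of (begin, end) positions on the reference protein.
    Uses the maximal begin and the minimal end; returns None when the
    shared region is shorter than MIN_LENGTH.
    """
    begin = max(b for b, _ in hits)
    end = min(e for _, e in hits)
    if end - begin + 1 < MIN_LENGTH:
        return None
    return begin, end


def consensus_region(hits):
    """Region spanned by all alignments in the second round.

    Uses the minimal begin and the maximal end; returns None when the
    spanned region is shorter than MIN_LENGTH.
    """
    begin = min(b for b, _ in hits)
    end = max(e for _, e in hits)
    if end - begin + 1 < MIN_LENGTH:
        return None
    return begin, end
```

For example, alignments covering positions 10 to 400 and 50 to 380 of a reference protein share the region 50 to 380, which is long enough to be retained.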

### Amino Acid and Disorder Analysis

Protein amino acid usage was assessed by calculating the frequency of each of the 20 amino acids in protein sequences. The disorder probability of each amino acid was predicted with the DisEMBL v. 1.4 computational tool (Linding et al., 2003) on the full-length proteins. When a subset of a protein sequence was analyzed, as for the core ortholog group alignments (see previous section), the disorder probability of each amino acid in the subset was taken from the disorder probability of this amino acid in the full-length protein. This was done to avoid misattribution of disorder probability in a subset of a sequence, since the residues surrounding an amino acid influence the calculation of its disorder probability.
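A minimal sketch of these two calculations is given below. It assumes that per-residue disorder scores have already been computed on the full-length protein and are available as a plain list; the function names are hypothetical and no DisEMBL output format is implied.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_frequencies(sequence):
    """Frequency of each of the 20 standard amino acids in a sequence.

    Non-standard characters are ignored in the denominator.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def region_disorder(full_length_scores, begin, end):
    """Disorder scores for a sub-region, taken from the full-length run.

    full_length_scores: one score per residue of the full-length protein.
    begin, end: region boundaries (assumed 1-based, inclusive).
    Slicing the full-length scores, rather than re-predicting on the
    fragment, keeps the sequence context used for each residue intact.
    """
    return full_length_scores[begin - 1:end]
```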

### Secretome Prediction and Protein Motif Annotation

Analysis by SignalP 4.1 was performed at http://www.cbs.dtu.dk using default parameters. Protein localization was predicted with PSORT II software using the WoLF PSORT extension (Horton et al., 2007) for organism type "fungi." Proteins were defined as part of the secretome when they contained both a signal peptide and a predicted extracellular localization, and were excluded if they possessed a trans-membrane region predicted by TMHMM (Sonnhammer et al., 1998). Glycosylphosphatidylinositol-anchored proteins were identified using FragAnchor (Poisson et al., 2007); N-glycosylation sites were predicted using GlycoEP (Chauhan et al., 2013).
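The secretome definition above amounts to a simple filter over three predictions. The sketch below assumes the outputs of the prediction tools have already been parsed into one record per protein; the record fields and protein identifiers are hypothetical and do not reflect any tool's actual output format.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    protein_id: str
    has_signal_peptide: bool  # e.g. parsed from a SignalP result
    localization: str         # e.g. parsed from a WoLF PSORT result
    tm_regions: int           # e.g. parsed from a TMHMM result


def is_secreted(pred):
    """Signal peptide AND extracellular AND no trans-membrane region."""
    return (
        pred.has_signal_peptide
        and pred.localization == "extracellular"
        and pred.tm_regions == 0
    )


def predict_secretome(predictions):
    """Return the identifiers of proteins passing the three criteria."""
    return [p.protein_id for p in predictions if is_secreted(p)]


# Toy records with made-up identifiers.
toy_predictions = [
    Prediction("prot_1", True, "extracellular", 0),
    Prediction("prot_2", True, "extracellular", 1),
    Prediction("prot_3", False, "extracellular", 0),
    Prediction("prot_4", True, "nuclear", 0),
]
toy_secretome = predict_secretome(toy_predictions)
```

Only the first toy record satisfies all three conditions.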

### Statistical Analysis and sTEKhot Index Determination

All statistical tests were computed with RStudio software. The Wilcoxon test was used for significance analysis. Differences were considered significant for p-values below 0.05. Amino acid and disorder frequencies that were significantly enriched or depleted in the S. borealis common set of core ortholog group alignments compared to the S. sclerotiorum and B. cinerea core ortholog group alignments, but not significantly different between S. sclerotiorum and B. cinerea, were further used for computing the environmental condition adaptation index (sTEKhot). The Thr frequency (T<sub>f</sub>), found to be over-represented in S. borealis, was added to the numerator of the index, whereas the Lys (K<sub>f</sub>), Glu (E<sub>f</sub>) and hot loop (HotLOOP<sub>f</sub>) frequencies, found to be under-represented, were added to the denominator. Each metric was normalized by its own median (X<sub>mf</sub>, where X is the considered metric) across the whole set of proteomes used in the analysis (S. borealis plus S. sclerotiorum plus B. cinerea). This normalization ensures a similar contribution of each metric to the index.

$$\text{sTEKhot} = \frac{\frac{T_f}{T_{mf}}}{\frac{K_f}{K_{mf}} + \frac{E_f}{E_{mf}} + \frac{HotLOOP_f}{HotLOOP_{mf}}} \tag{2}$$

The sTEKhot value was calculated for every protein of the three proteomes according to Equation (2). The list of proteins with the top 635 sTEKhot values (>1) corresponded exactly to the proteins with the top T<sub>f</sub> − (E<sub>f</sub> + K<sub>f</sub> + HotLOOP<sub>f</sub>) values, supporting the robustness of the arithmetic design of the sTEKhot index in this dataset.
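Equation (2) can be computed as in the following sketch. It assumes that the Thr, Lys, Glu and hot loop frequencies of each protein are already available and that all medians are non-zero; the dictionary layout, the function names and the toy values are hypothetical, and returning infinity for a zero denominator is a choice made for this illustration only.

```python
from statistics import median

METRICS = ("T", "K", "E", "HotLOOP")


def metric_medians(proteins):
    """Median of each metric over all proteins of the pooled proteomes."""
    return {m: median(p[m] for p in proteins.values()) for m in METRICS}


def stekhot(freqs, medians):
    """sTEKhot index of one protein, following Equation (2)."""
    numerator = freqs["T"] / medians["T"]
    denominator = sum(freqs[m] / medians[m] for m in ("K", "E", "HotLOOP"))
    if denominator == 0:
        return float("inf")  # illustrative guard, not from the source
    return numerator / denominator


# Toy frequencies for three made-up proteins.
toy = {
    "protA": {"T": 0.09, "K": 0.02, "E": 0.02, "HotLOOP": 0.03},
    "protB": {"T": 0.06, "K": 0.05, "E": 0.06, "HotLOOP": 0.10},
    "protC": {"T": 0.04, "K": 0.08, "E": 0.09, "HotLOOP": 0.15},
}
medians = metric_medians(toy)
scores = {name: stekhot(freqs, medians) for name, freqs in toy.items()}
```

A protein whose four frequencies all equal the medians has an index of 1/3, so values above 1 require a marked excess of Thr relative to Lys, Glu and hot loops.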

### Random Shuffling of sTEKhot

Random sTEKhot indexes were calculated by shuffling amino acid and hotloop frequencies in Equation (2) with any of the observed amino acid and hotloop frequencies for a given organism. The random index is therefore defined by Equation (3) in which W, X, Y, and Z are randomly selected observed frequencies.

$$\text{RANDOMindex} = \frac{\frac{X_f}{X_{mf}}}{\frac{Y_f}{Y_{mf}} + \frac{Z_f}{Z_{mf}} + \frac{W_f}{W_{mf}}} \tag{3}$$

Indexes were calculated separately for the three proteomes and secretomes. Random sTEKhot medians and Wilcoxon ranking test p-values were extracted from 300 independent runs.
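The shuffling procedure can be sketched as below. The sketch assumes that the four metrics W, X, Y and Z are drawn without replacement from the set of observed frequencies (the text does not state whether repeats were allowed), that all medians are non-zero, and that a fixed seed is acceptable for reproducibility. The function names are hypothetical, and the Wilcoxon test applied to the resulting distributions is not shown.

```python
import random
from statistics import median


def random_index_median(proteins, medians, rng):
    """Median random index over proteins for one random metric draw.

    proteins: list of {metric: frequency} dictionaries for one organism.
    medians:  {metric: median frequency}, assumed non-zero.
    """
    x, y, z, w = rng.sample(sorted(medians), 4)  # assumption: no repeats
    values = []
    for freqs in proteins:
        denominator = (
            freqs[y] / medians[y]
            + freqs[z] / medians[z]
            + freqs[w] / medians[w]
        )
        if denominator > 0:  # skip proteins with an undefined index
            values.append((freqs[x] / medians[x]) / denominator)
    return median(values)


def run_shuffles(proteins, medians, n_runs=300, seed=0):
    """Medians of the random index over n_runs independent draws."""
    rng = random.Random(seed)
    return [random_index_median(proteins, medians, rng) for _ in range(n_runs)]
```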

### Protein Structure Modeling and Analysis

Protein structure modeling was performed with the I-TASSER server (Zhang, 2008) using SS1G\_10836 and SS1G\_03146 full length sequences as queries. SS1G\_10836 best model C-score was -3.22; best TM score was 0.875 (RMSD 2.27Å) with model 4KE2. SS1G\_03146 best model C-score was -2.28; best TM score was 0.677 (RMSD 3.07Å) with model 4MAH.

### Gene Expression Analysis

One-centimeter long leaves were collected and ground twice for 30 s at maximum frequency in a Retsch MM40 mixer. Total RNA extraction was performed with the Macherey-Nagel Nucleospin RNA extraction kit following the manufacturer's instructions. One µg of total RNA was used for cDNA synthesis in a 20-µL reaction according to the Roche Transcriptor Reverse Transcriptase protocol, using 0.5 µL of SuperScript II reverse transcriptase (Invitrogen), 1 µg of oligo(dT), and 10 nmol of dNTP. cDNAs (diluted 1:10) were used as templates in the quantitative RT-PCR analysis. Quantitative RT-PCR was performed using gene-specific primers (Table S6) with a LightCycler 480 apparatus (Roche Diagnostics). The quantitative PCR reaction was performed using the SYBR GREEN I protocol (5 pmol of each primer and 5 µL of RT reaction product in a 7 µL final reaction volume). The PCR conditions were 9 min at 95°C, followed by 45 cycles of 5 s at 95°C, 10 s at 65°C, and 20 s at 72°C. Expression values of SS1G\_10836 and SS1G\_03146 were normalized based on expression of the SS1G\_04652 and SS1G\_12196 housekeeping genes. Values from two biological replicates are shown; error bars show the standard error of the mean.

### Author Contributions

TB, RP, and SR designed and performed analyses. SR conceived the study. TB, RP, and SR wrote the manuscript.

### Acknowledgments

This work was supported by a Starting Grant of the European Research Council (ERC-StG 336808 project VariWhim) and a Marie Curie grant (MC-CIG 334036 project SEPAraTE) to SR and the French Laboratory of Excellence project TULIP (ANR-10-LABX-41; ANR-11-IDEX-0002-02). We thank the BBRIC computational facilities for providing bioinformatics tools.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00776

Figure S1 | Network representation of gene ontologies (GOs) of proteins with sTEKhot >1 in the *S. sclerotiorum* proteome. Nodes correspond to GOs and are sized according to the number of proteins with sTEKhot >1. They are colored from yellow to orange according to the *p*-value of a hypergeometric test for enrichment in proteins with sTEKhot >1 compared to whole proteomes.

Figure S2 | Network representation of gene ontologies (GOs) of proteins with sTEKhot >1 in the *B. cinerea* proteome. Nodes correspond to GOs and are sized according to the number of proteins with sTEKhot >1. They are colored from yellow to orange according to the *p*-value of a hypergeometric test for enrichment in proteins with sTEKhot >1 compared to whole proteomes.

Figure S3 | Distribution of sTEKhot values for non-secreted proteins and mature secreted proteins (signal peptide removed) in *S. borealis*, *S. sclerotiorum* and *B. cinerea*.

Figure S4 | Distribution of best BlastP bit scores (log-scaled scores) using *S. borealis* non-secreted proteins and secreted proteins as queries against *S. sclerotiorum* or *B. cinerea* proteomes. Lower scores for searches using *S. borealis* secretome as query indicate that *S. borealis* secreted proteins are less conserved than non-secreted proteins. *P*-values of a Student *t*-test for differences between non-secreted and secreted proteins are indicated.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Badet, Peyraud and Raffaele. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Extracellular vesicles including exosomes in cross kingdom regulation: a viewpoint from plant-fungal interactions

Monisha Samuel<sup>1</sup>, Mark Bleackley<sup>2</sup>\*, Marilyn Anderson<sup>2</sup>\* and Suresh Mathivanan<sup>2</sup>\*

*<sup>1</sup> Department of Physiology, Anatomy and Microbiology, School of Life Sciences, La Trobe University, Melbourne, VIC, Australia, <sup>2</sup> Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia*

Keywords: exosomes, plant-fungal interaction, extracellular vesicles, cross kingdom regulation, secretome

#### Edited by:

*Dominique Job, Centre National de la Recherche Scientifique, France*

#### Reviewed by:

*Marcio L. Rodrigues, Oswaldo Cruz Foundation, Brazil; David John Studholme, University of Exeter, UK*

#### \*Correspondence:

*Mark Bleackley, Marilyn Anderson and Suresh Mathivanan m.bleackley@latrobe.edu.au; m.anderson@latrobe.edu.au; s.mathivanan@latrobe.edu.au*

#### Specialty section:

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

Received: *29 July 2015* Accepted: *07 September 2015* Published: *23 September 2015*

#### Citation:

*Samuel M, Bleackley M, Anderson M and Mathivanan S (2015) Extracellular vesicles including exosomes in cross kingdom regulation: a viewpoint from plant-fungal interactions. Front. Plant Sci. 6:766. doi: 10.3389/fpls.2015.00766*

Throughout evolution, plants and pathogenic fungi have been in a constant battle where fungi have developed new mechanisms to infect plants while plants have co-evolved to combat the infection. The early stages of plant-pathogen interactions occur in the intercellular spaces of the plant tissue and thus involve a myriad of secreted factors. Traditionally, all proteins released into the extracellular space were thought to be transported via the ER-Golgi dependent classical secretory pathway. However, non-classical secretion of proteins/RNA through extracellular vesicles (EVs) has recently been reported to contribute to the milieu of extracellular molecules that mediate plant-fungal interactions (Rodrigues et al., 2007; Meyer et al., 2009). EVs can be broadly classified into exosomes and ectosomes (Keerthikumar et al., 2015). Exosomes are secreted microvesicles (30–150 nm in diameter) of endocytic origin that are released by multiple cell types and are conserved across various species (Lotvall et al., 2014; Gangoda et al., 2015). In contrast, ectosomes or shedding microvesicles are larger (100–1000 nm in diameter) and bud off directly from the plasma membrane (Keerthikumar et al., 2015). For clarity, we will collectively refer to both types of membranous vesicles as EVs in this article.

Recent studies on mammalian systems have highlighted the role of EVs in cell-cell communication and the intercellular transport of cargo (proteins, nucleic acids, and carbohydrates) (Batista et al., 2011; Cossetti et al., 2014). Whilst the role of EVs in plant-fungal interactions is still poorly defined, this non-canonical secretory pathway has been proposed as an alternative route for the secretion of virulence and defense molecules by fungi and plants, respectively (Robatzek, 2007; Rodrigues et al., 2011). The basic requirement for successful host colonization is the establishment of a parasitic relationship between the fungal pathogen and the host. This requires the induction of specific defense mechanisms in the fungus for protection against the plant innate immune system (Hayes et al., 2013). Evasion or suppression of the plant defense response is thought to be regulated by virulence factors that are secreted from the fungus and act at the plasma membrane or in the cytoplasm of the plant cell (Rodrigues et al., 2008a). Interestingly, recent studies allude to the EV-mediated transport of virulence factors from the fungus into the host cell as a more efficacious delivery mechanism than simple diffusion (Rodrigues et al., 2008a; Silverman and Reiner, 2011). Similarly, in plants, when the integrity of the cell wall is threatened by a fungal pathogen, a response is mediated, at least in part, by multivesicular bodies (MVBs) (An et al., 2006b). In mammalian cells, it is well documented that fusion of MVBs with the plasma membrane results in the secretion of exosomes (Boukouris and Mathivanan, 2015; Gangoda et al., 2015). Though the production of MVBs may not always result in the secretion of EVs, the observation that plants produce MVBs in response to a fungal infection leads to the speculation that EVs may play a critical role in plant-fungal interactions. Here, we will discuss the current knowledge on EVs in the context of human-fungal interactions and their potential roles in plant-fungal interactions.

### Role of EVs in Human-fungal Interactions

Fungal EVs were first isolated from the human fungal pathogen Cryptococcus neoformans (Rodrigues et al., 2007). These EVs contained well known virulence factors such as the capsular polysaccharide glucuronoxylomannan (GXM) and the virulence regulator, glucosylceramide (Rodrigues et al., 2008a). Rodrigues and colleagues also reported the presence of several other pathogenicity-associated components that are delivered into the host via EVs. Furthermore, the isolated EVs were biologically active as they could invigorate phagocytes in the host and enhance their antimicrobial activity (Oliveira et al., 2010a). Other mammalian fungal pathogens including Histoplasma capsulatum, Candida parapsilosis, Sporothrix schenckii, and Candida albicans also deliver a variety of effector molecules in a similar manner (Albuquerque et al., 2008; Vargas et al., 2015; Gil-Bona et al., 2015b). Interestingly, the serum from patients with H. capsulatum infections contains antibodies to proteins that are present in the EVs produced by the pathogen indicating involvement of EVs in the host-pathogen interaction. Moreover, characterization of EVs from the human pathogens C. neoformans, H. capsulatum and Malassezia sympodialis has implicated them in the modulation of the host immune system and regulation of the host-pathogen interaction in favor of the fungus (Rodrigues et al., 2008b; Gehrmann et al., 2011).

### Role of EVs in Plant-fungal Interactions

A major component of plant-fungal interactions is the secretion of small proteins by both organisms. Plants produce pathogenesis related (PR) proteins, many of which inhibit fungal growth or directly kill fungal cells (Sels et al., 2008). Fungi secrete virulence factors encoded by the avirulence (AVR) genes (Stergiopoulos and De Wit, 2009; Rodrigues et al., 2014; Gil-Bona et al., 2015a). However, in spite of decades of research, it is still unclear how these proteins cross the plasma membranes and cell walls of both species. The AVR genes AVRa10 and AVRk1 of the fungal pathogen Blumeria graminis f. sp. hordei encode proteins that lack signal peptides. Despite the lack of a classical secretion signal, they still enter the cells of a susceptible host plant and are required for the pathogenicity of the fungus (Ridout et al., 2006). In mammalian systems, it is well established that certain proteins with and without signal peptides are transported via EVs (Kalra et al., 2012; Simpson et al., 2012). Hence, EVs could potentially mediate the transfer of these fungal virulence factors into plant hosts (Rodrigues et al., 2008b). However, further studies are required to understand this highly complex phenomenon.

The ability of a plant to mount a rapid defense response against potential pathogens is vital to its survival. Intercellular organelle rearrangements and structural modulation of the cytoskeleton with increased focal secretion of compounds lead to the formation of a physical barrier at the attack site that might prevent successful infection (Frey and Robatzek, 2009). These modifications may involve rapid and targeted delivery of molecules via EVs. Fungal infection enhances the formation of both intracellular MVBs and paramural vesicles between the plasma membrane and cell wall in the plant cells, indicating the critical role of this secretory pathway in the plant innate immune response (An et al., 2006a; Wang et al., 2014). For example, vesicular structures were associated with the accumulation of phenolic compounds and H<sub>2</sub>O<sub>2</sub> that prevented pathogenic establishment of the powdery mildew fungus B. graminis in barley (Hordeum vulgare) leaves. Though the detection of plant MVBs at the site of infection provides indirect evidence of their role in plant defense (An et al., 2006a), more work is needed to define their molecular composition and whether they do indeed transport innate immunity proteins to the site of infection and/or into the fungal cell.

MVBs were first reported in the appressoria and haustoria of the powdery mildew fungus B. graminis (Hippe, 1985; Hippe-Sanwald et al., 1992). More recently, microscopic examination confirmed the presence of membrane bound vesicles at the biotrophic interface between B. graminis and the host plant (Micali et al., 2011). Furthermore, the haustorial complexes produced by Golovinomyces orontii in infected Arabidopsis leaves have vesicles and MVBs in the haustorial body (cytoplasm), paramural space, and extrahaustorial matrix. In addition, vesicle budding and fusion of MVB-like structures with the fungal plasma membrane have been observed in this interaction, although the microscopic evidence did not reveal whether the vesicles were derived from the plant or the fungus (Micali et al., 2011). In other studies, EVs from the fungal pathogens Paracoccidioides brasiliensis and H. capsulatum were reported to transport antioxidants (superoxide dismutase and catalase B) and heat shock proteins (Hsp60 and Hsp70), which may have an essential role in the fungal defense mechanism (Albuquerque et al., 2008; Vallejo et al., 2012). Only recently, Hsp60 was also reported in the proteome of EVs from the fungus Alternaria infectoria. Several species of the genus Alternaria are also considered major plant pathogens (Silva et al., 2014).

Biochemical analyses of EVs from various human fungal pathogens have revealed the presence of a variety of lipids, proteins and RNA (Peres Da Silva et al., 2015). Although observed by microscopy, EVs have not been isolated and characterized from a plant pathogenic fungus. Nevertheless, upon uptake by the plant cell, it is possible that the contents could modulate the plant response to the invading fungal pathogen by attenuating the immune response. Similarly, plant exosomes are poorly characterized and their composition is largely unknown. However, it is plausible that some plant exosomes, particularly those produced in response to fungal threat, might contain small molecules and proteins that are toxic to the fungus.

Finally, cell wall remodeling is a key process on both sides of the plant-fungus interaction (Bellincampi et al., 2014). Proteomics studies highlighted the presence of various enzymes (Endochitinase 1 precursor, Beta-glucosidase 4, Beta-1,3-glucanosyltransferase 3, and Chitin synthase B) in EVs secreted by H. capsulatum. Similarly, S. cerevisiae secreted more than 20 proteins implicated in cell wall assembly including glucanases and glucanosyl transferases (Oliveira et al., 2010b).

Figure 1 | […] are formed from the early endosomes. Within the MVBs, invagination of the limiting membrane results in the formation of intraluminal vesicles which are packaged with protein and RNA cargo from the cell. The MVBs either fuse with the plasma membrane or with the lysosome for degradation. When the MVBs fuse with the plasma membrane, the intraluminal vesicles are released as exosomes. The exosomes are considered to contain various molecules including effectors that are required for the establishment of the pathogen and/or infection. Exosome biogenesis and secretion in the plant side: Similarly, on the plant side, vesicles from the MVBs may contain innate immunity proteins and defense molecules that can impede fungal growth or lead to alterations in the fungal cell wall. Thus, the plant and its fungal counterpart could utilize the exosomes as one of the many strategies in their mutual struggle for survival.

These enzymes have the capacity to regulate synthesis and hydrolysis of cell wall components, highlighting the potential role of EVs in cell wall remodeling (Albuquerque et al., 2008). Fungal cell wall synthesis is also known to be mediated by chitosomes, small vesicles containing chitin microfibrils, in Neurospora crassa (Riquelme et al., 2007). Chitosomes follow an unconventional secretory pathway to transport various components of the chitin synthase family required for fungal cell wall synthesis. Whilst the differences between chitosomes and EVs are not clearly understood, it can be speculated that EVs also play a pivotal role in fungal cell wall remodeling. In plants, reinforcement of the cell wall is one of the major strategies of the host to restrain further invasion by the pathogen (Lionetti and Métraux, 2014). The delivery of the cell wall carbohydrates to the extending chains of insoluble polysaccharides that make up the majority of the cell wall is relatively poorly understood. The role of EVs in cell wall remodeling in both the fungus and plant is understudied and needs further research.

### Conclusion

Recent findings pertaining to the role of EVs in the interaction between fungal pathogens and humans have led us to ask whether EVs also have a major role in plant-pathogen interactions. It is still unknown how effectors and defense molecules are packaged and transported across the plasma membranes and cell walls of the plant and fungal cells. We propose that proteins lacking secretion signals could be packaged into EVs for passage through the plasma membrane and the cell wall (**Figure 1**). Alternatively, proteins containing a secretion signal could be secreted into the matrix of the cell wall and then bind to EVs via a lipid binding motif. The protein then transits the cell wall as a passenger on the outer leaflet of the vesicle. Based on the discovery that EVs aid disease progression (Boukouris and Mathivanan, 2015; Gangoda et al., 2015), we propose that EVs can mediate/aid in fungal infection. This could be achieved via the transfer of effectors via EVs and/or by modulating the host cell's response in favor of the fungal pathogen. Similarly, we propose that plant EVs can aid in the protection against pathogenic infections. Upon infection, defense molecules can be packaged and delivered to the site of infection to protect against the invading pathogen. Further to this, we propose that the molecular cargo present in EVs is specific to the type of insult or infection. For instance, molecular cargo present in EVs of plants during stress can be significantly different to that produced during fungal infection. Thus the molecular cargo contained within EVs of plant or fungus can serve as indicators of health, stress, and disease. Investigation of the role of EVs in plant-fungal interactions is likely to uncover a new mechanism for delivery and identification of molecules required for a productive infection and/or defense response. This knowledge will enhance our ability to protect agricultural crops against the damaging effects of fungal pathogens and secure our food sources for generations to come.

### Acknowledgments

SM is supported by the Australian Research Council Discovery project grant (DP130100535) and Australian Research Council DECRA (DE150101777). MA is supported by the Australian Research Council Discovery project grant (DP150104386). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Samuel, Bleackley, Anderson and Mathivanan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Candidate effector proteins of the necrotrophic apple canker pathogen *Valsa mali* can suppress BAX-induced PCD

### *Zhengpeng Li†, Zhiyuan Yin†, Yanyun Fan, Ming Xu, Zhensheng Kang and Lili Huang\**

*State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, China*

#### *Edited by:*

*Maryam Rafiqi, Computomics GmbH & Co KG, Germany*

#### *Reviewed by:*

*Weixing Shan, Northwest A&F University, China; Kim Marilyn Plummer, La Trobe University, Australia*

#### *\*Correspondence:*

*Lili Huang, State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China huanglili@nwsuaf.edu.cn*

> *†These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 13 December 2014 Accepted: 13 July 2015 Published: 27 July 2015*

#### *Citation:*

*Li Z, Yin Z, Fan Y, Xu M, Kang Z and Huang L (2015) Candidate effector proteins of the necrotrophic apple canker pathogen Valsa mali can suppress BAX-induced PCD. Front. Plant Sci. 6:579. doi: 10.3389/fpls.2015.00579*

Canker caused by the Ascomycete *Valsa mali* is the most destructive disease of apple in Eastern Asia, resulting in yield losses of up to 100%. This necrotrophic fungus induces severe necrosis on apple, eventually leading to the death of the whole tree. Identification of necrosis inducing factors may help to unravel the molecular bases for colonization of apple trees by *V. mali.* As a first step toward this goal, we identified and characterized the *V. mali* repertoire of candidate effector proteins (CEPs). In total, 193 secreted proteins with no known function were predicted from genomic data, of which 101 were *V. mali*-specific. Compared to non-CEPs predicted for the *V. mali* secretome, CEPs have shorter sequence length and a higher content of cysteine residues. Based on transient over-expression in *Nicotiana benthamiana* performed for 70 randomly selected CEPs, seven *V. mali* Effector Proteins (VmEPs) were shown to significantly suppress BAX-induced PCD. Furthermore, targeted deletion of *VmEP1* resulted in a significant reduction of virulence. These results suggest that *V. mali* expresses secreted proteins that can suppress PCD usually associated with effector-triggered immunity (ETI). ETI in turn may play an important role in the *V. mali*–apple interaction. The ability of *V. mali* to suppress plant ETI sheds a new light onto the interaction of a necrotrophic fungus with its host plant.

Keywords: apple *Valsa* canker, secreted protein, cell death, virulence factor, plant–fungus interaction

### Introduction

The Apple *Valsa* canker fungus *Valsa mali* is a necrotrophic pathogen inducing severe necrosis on apple. It is the most devastating pathogen of apple in Eastern Asia, causing severe yield losses each year (Lee et al., 2006; Li et al., 2013). This pathogen preferentially infects apple although it is also aggressive to other Rosaceae woody plants such as pear, crabapple, apricot and peach (Wang et al., 2011b; Zhou et al., 2014). *V. mali* is considered a necrotroph that completes its life cycle on dead plant cells killed prior to or during colonization (Ke et al., 2013; Yin et al., 2015). Particularly, genes involved in plant cell wall degradation and toxin synthesis are remarkably expanded in the *V. mali* genome and are commonly up-regulated during infection (Ke et al., 2014; Yin et al., 2015). However, it is becoming more and more evident that interactions between necrotrophs and their hosts are considerably more complex and subtle than previously thought. Functional analysis of the *V. mali* genome showed that 193 secreted proteins have no known annotation, of which about 101 are *V. mali*-specific. Given that phytopathogens often secrete a series of proteins into the host–pathogen interface to manipulate host cell physiology and ultimately promote infection, functional verification of these secreted proteins is of particular interest for identifying potential virulence factors.

Effectors play an important role in the host–pathogen interface during infection (Giraldo and Valent, 2013; Vleeshouwers and Oliver, 2014). It is generally accepted that biotrophs actively suppress programmed cell death (PCD) of the host, whereas necrotrophs are believed to promote cell death to enhance colonization (Donofrio and Raman, 2012; Mengiste, 2012). Typically, most of the known effectors secreted by necrotrophic fungi (e.g., *Parastagonospora nodorum*, *Pyrenophora tritici-repentis*, *Alternaria alternata,* or *Cochliobolus heterostrophus*) lead to effector-triggered susceptibility (ETS) resulting in host cell death (reviewed in Wang et al., 2014). Intriguingly, the effector protein SSITL of *Sclerotinia sclerotiorum* can suppress resistance mediated by the jasmonic acid/ethylene (JA/ET) signaling pathway at an early stage of infection (Zhu et al., 2013). In addition, oxalic acid secreted by *S*. *sclerotiorum* inhibits host autophagy, which constitutes an effective defense response in this necrotrophic fungus–plant interaction (Kabbage et al., 2013). This evidence suggests that effectors of necrotrophic fungi may also suppress host defense responses, rather than only induce cell death.

Programmed cell death triggered in plants by the proapoptotic mouse protein BAX physiologically resembles PCD associated with the defense-related hypersensitive response (HR) (Lacomme and Santa Cruz, 1999). As a result, the ability to suppress BAX-triggered PCD has proven a valuable initial screen for pathogen effectors capable of suppressing defense-associated PCD (Abramovitch et al., 2003; Dou et al., 2008; Wang et al., 2011a).

As a first step toward elucidating the molecular basis for colonization of apple by *V. mali*, we identified and characterized the *V. mali* repertoire of candidate effectors. Out of 70 randomly selected candidate effector proteins (CEPs), seven were able to suppress BAX-induced PCD in *Nicotiana benthamiana.* Furthermore, functional characterization of VmEP1 revealed that this candidate effector is a true virulence factor of *V. mali*. Taken together, the candidate effectors identified here provide valuable information for the study of the *V. mali*–apple interaction. Suppression of effector-triggered immunity (ETI) by this necrotrophic pathogen provides new insights into the interaction of a necrotrophic fungus and its host plant.

### Materials and Methods

### Strains and Culture Conditions

*Valsa mali* strain 03–8 is a stock culture of the Laboratory of Integrated Management of Plant Diseases at the College of Plant Protection, Northwest A&F University, Yangling, PRC (available on request). Cultures were maintained on potato dextrose agar (PDA) medium with a layer of cellophane at 25°C in the dark. *Agrobacterium tumefaciens* strain GV3101 used for molecular cloning and agro-infiltration experiments was cultured on Luria-Bertani medium at 28°C. *N. benthamiana* plants were maintained at 25°C with 16 h illumination per day.

### Construction of *V. mali* cDNA Libraries

Total RNA was extracted using the RNeasy Micro kit (Qiagen, Shenzhen, PRC) according to the manufacturer's protocol from (a) *V. mali* mycelium grown on PDA medium for 3 days, and (b) apple twigs of *Malus domestica* Borkh. cv. 'Fuji' inoculated with *V. mali* mycelium [3 days post inoculation (dpi)]. First strand cDNA was synthesized with the RevertAid™ First Strand cDNA Synthesis Kit (Fermentas, Shenzhen, PRC) according to the manufacturer's protocol.

### Sequence Analyses

The secretome of *V. mali* was obtained from the sequenced genome in our previous study (Yin et al., 2015). The Whole-Genome Shotgun project for *V. mali* has been deposited at DDBJ/EMBL/GenBank under the accession JUIY01000000. CEPs were defined as extracellular proteins with no known function (*e*-value > 1e–5). Cysteine content was calculated using the pepstats program from the EMBOSS package (http://emboss.bioinformatics.nl/). Markov clustering analysis was performed using tribe-MCL (Enright et al., 2002). The known effector motifs RXLR (Kale et al., 2010), [L/I]xAR (Godfrey et al., 2010), YxSL[R/K] (Saunders et al., 2012), [R/K]VY[L/I]R (Ridout et al., 2006), and [Y/F/W]xC (Dodds et al., 2009) were searched for using the fuzzpro program from the EMBOSS package. Nuclear localization signals (NLSs) were predicted with NLStradamus (Ba et al., 2009). *De novo* motif search was performed using MEME (Bailey et al., 2009).
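The motif search and cysteine-content calculation can be illustrated with a minimal Python sketch. This is not the EMBOSS fuzzpro or pepstats implementation used in this study; it assumes that each motif can be written as a plain regular expression in which "x" stands for any residue, that overlapping matches are counted, and that no positional restriction is applied. The function names and the toy sequence are hypothetical.

```python
import re

# Regular-expression translations of the motifs named in the text
# ("x" = any residue). These translations are an assumption of this sketch.
MOTIFS = {
    "RxLR": r"R.LR",
    "[L/I]xAR": r"[LI].AR",
    "YxSL[R/K]": r"Y.SL[RK]",
    "[R/K]VY[L/I]R": r"[RK]VY[LI]R",
    "[Y/F/W]xC": r"[YFW].C",
}


def count_motifs(seq):
    """Count occurrences of each motif; a lookahead allows overlapping hits."""
    seq = seq.upper()
    return {
        name: len(re.findall(f"(?={pattern})", seq))
        for name, pattern in MOTIFS.items()
    }


def cysteine_percent(seq):
    """Return the percentage of residues that are cysteine."""
    return 100.0 * seq.upper().count("C") / len(seq) if seq else 0.0


# Hypothetical toy sequence for illustration only (not a V. mali protein).
toy = "MKRSLRAAYACLTARWGCC"
print(count_motifs(toy))
print(round(cysteine_percent(toy), 1))
```

Counting overlapping matches matters mainly for short motifs such as [Y/F/W]xC, which can occur several times within a few residues in cysteine-rich proteins.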

### Plasmid Constructs

Targeted genes were amplified from the cDNA library using high-fidelity TransStart® FastPfu DNA polymerase (TransStart, Beijing, PRC). For *A. tumefaciens* infiltration assays in *N. benthamiana*, PCR products were digested with the appropriate restriction enzymes and ligated into the PVX vector pGR106 (Giraldo and Valent, 2013). Primers used for PCR are listed in Supplementary Table S1. All plasmids were confirmed by sequencing.

### *Agrobacterium tumefaciens* Infiltration Assays

Agro-infiltration assays were carried out following a previously described procedure (Dou et al., 2008). For transient expression, *A. tumefaciens* strain GV3101 carrying an expression plasmid (pGR106:effector) was grown in LB medium containing kanamycin (50 µg/ml). Cells were resuspended in 10 mM MgCl2 (pH 5.6) and the cell density was adjusted to OD600 = 0.4. Using a syringe, bacteria were infiltrated through a small nick into the upper leaves of 4-week-old *N. benthamiana* plants. *A. tumefaciens* cells carrying the *Bax* gene (pGR106:*Bax*) were infiltrated into the same site either immediately afterward or 16 h later. As a control, plants were infiltrated with *A. tumefaciens* carrying an empty PVX vector. Cell death symptoms were evaluated and photographed 3–4 days post infiltration. Each assay was performed in triplicate.

### RNA Extraction and Transcript Level Analysis

To measure transcript levels of candidate effector genes by qRT-PCR, apple tissue infected with *V. mali* strain 03–8 was sampled at 0, 6, 12, 24, 36, and 48 hours post inoculation (hpi). Total RNA was extracted using the RNeasy® Plant Mini Kit (Qiagen, Shenzhen, PRC) following the recommended protocol. First-strand cDNA was synthesized using an RT-PCR system (Promega, Madison, WI, USA) following the manufacturer's instructions. SYBR green qRT-PCR assays were performed to analyze transcript levels. The *V. mali* housekeeping gene *G6PDH* was used as the endogenous control (Yin et al., 2013). Data from three biological replicates were used to calculate the mean and standard deviation. Primers used for qRT-PCR are listed in Supplementary Table S3.
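The text does not state how relative expression was computed from the qRT-PCR data. One common approach, shown here only as a sketch, is the 2^-ΔΔCt calculation, in which the target gene is normalized to the reference gene (*G6PDH*) and then to a calibrator sample (for example 0 hpi). The sketch assumes equal, near-100% amplification efficiency for both primer pairs; the Ct values are hypothetical.

```python
from statistics import mean, stdev


def fold_change(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Relative expression by the 2^-ddCt calculation.

    ct_target / ct_ref: Ct of target and reference gene in the sample.
    ct_target_cal / ct_ref_cal: the same two values in the calibrator.
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = ct_target_cal - ct_ref_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: calibrator (0 hpi) and three biological
# replicates at a later time point, each as (target Ct, reference Ct).
ct_cal = (28.0, 20.0)
replicates = [(25.0, 20.0), (26.0, 20.0), (27.0, 20.0)]

folds = [fold_change(t, r, *ct_cal) for t, r in replicates]
avg, sd = mean(folds), stdev(folds)
print(folds, avg, sd)
```

With this calculation, a fold change above 2 corresponds to the up-regulation threshold applied to the VmEP genes in the Results.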

### Generation of *VmEP1* Mutants

The gene-replacement construct was assembled from three components, using the hygromycin B phosphotransferase gene (*hph*) as the selectable marker. The *hph* gene was amplified from pBIG2RHPH2-GFP-GUS with primers HPH-F (5′-GGCTTGGCTGGAGCTAGTGGAGGTCAA-3′) and HPH-R (5′-AACCCGCGGTCGGCATCTACTCTATTC-3′). The upstream (∼1,100 bp) and downstream (∼1,400 bp) flanking sequences of *VmEP1* were amplified using primer pairs VmEP1-1F/2R and VmEP1-3F/4R, respectively. The deletion cassette was then generated by double-joint PCR as described (Yu et al., 2004), using the primer pair VmEP1-CF/CR. To generate deletion mutants, the *VmEP1* gene-replacement construct was transformed into protoplasts of *V. mali* as previously described (Gao et al., 2011). Putative deletion mutants were verified by PCR using four primer pairs (VmEP1-5F/6R, VmEP1-7F/H855R, VmEP1-H856F/8R, and H850/H852) to detect the target gene (∼850 bp), the upstream (∼1,100 bp) and downstream (∼1,400 bp) regions, and the *hph* gene (∼750 bp), respectively. Subsequently, deletion mutants were confirmed by Southern blot hybridization using the DIG DNA Labeling and Detection Kit II (Roche, Mannheim, Germany) according to the manufacturer's instructions. Primers used for gene deletion are listed in Supplementary Table S2.
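The logic of the four-primer-pair PCR check can be sketched as a simple decision rule. The interpretation below is an assumption based on how gene-replacement screens are usually read, not a rule stated in the text: a correct replacement is expected to lose the target-gene amplicon while gaining the upstream junction, downstream junction, and *hph* amplicons. The function name and category labels are hypothetical.

```python
def classify_transformant(target, upstream, downstream, hph):
    """Interpret presence (True) or absence (False) of the four amplicons.

    target:     target gene amplicon (about 850 bp)
    upstream:   upstream region amplicon (about 1,100 bp)
    downstream: downstream region amplicon (about 1,400 bp)
    hph:        hph marker amplicon (about 750 bp)
    """
    if not target and upstream and downstream and hph:
        return "deletion mutant"
    if target and not (upstream or downstream or hph):
        return "wild type"
    if target and hph:
        # Marker present but the target gene was not replaced.
        return "ectopic integration"
    return "ambiguous"


print(classify_transformant(False, True, True, True))
print(classify_transformant(True, False, False, False))
```

Any pattern classified as ambiguous would call for repeating the PCR or relying on the Southern blot confirmation described above.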

#### Complementation of the Δ*VmEp1* Mutant

A fragment containing the entire *VmEp1* gene without the termination codon and its promoter (∼1.5 kb) was amplified with primers VmEp1-FL2-F/R (5′-cgactcactatagggcgaattgggtactcaaattggTTTATCTCAATCGCCTCGTT-3′ and 5′-caccaccccggtgaacagctcctcgcccttgctcacGTCTACCGAACATGTCTGTGG-3′), and cloned into plasmid pFL2 by the yeast gap repair approach (Bruno et al., 2004; Zhou et al., 2012). The resulting construct, pFL2-VmEp1, was transformed into protoplasts of the Δ*VmEp1* mutants 74# and 85#. Δ*VmEp1*/*VmEp1* transformants were confirmed by PCR using the primer pair VmEP1-5F/6R. The Δ*VmEp1*/*VmEp1* complemented transformants 74#-1 and 85#-1 were selected for phenotypic analysis.

### Vegetative Growth, Conidiation, and Pathogenicity of Mutants

Vegetative growth and conidiation were examined at 3 and 40 days, respectively. For the pathogenicity assay, detached apple twigs of *Malus domestica* Borkh. cv. 'Fuji' were inoculated with *VmEP1* deletion mutants, Δ*VmEp1*/*VmEp1* complemented mutants, and the wild type as described (Oliva et al., 2010). All treatments were performed with at least ten replicates, and all experiments were repeated three times. Data were analyzed by Student's *t*-test using the SAS software package (SAS Institute, Cary, NC, USA).

### Epifluorescence Microscopy

To measure the level of colonization by the *VmEP1* deletion mutants, boundary zones (∼0.5 cm²) of inoculated leaves were sampled at 60 hpi and treated with 1 M KOH solution for 15 min at 121°C. After cooling to room temperature, the samples were washed twice in distilled water and then stained in a staining solution (0.067 M K2HPO4 + 0.05% aniline blue) according to Hood and Shew (1996). Leaf samples were examined under a Zeiss epifluorescence microscope (excitation 485 nm, dichroic mirror 510 nm, barrier 520 nm). Wild-type strain 03–8 was used as a control. All treatments were performed with at least five replicates, and all experiments were repeated three times.

### Results

### Characterization of Candidate Effector Proteins of *V. mali*

Considering that pathogen effector repertoires are typically lineage-specific, we defined CEPs as predicted extracellular proteins with no known function (*e*-value threshold >1e–5). Among the 779 secreted proteins predicted from the *V. mali* genome (Yin et al., 2015), we identified 193 CEPs, of which 101 are *V. mali*-specific. Analysis of amino acid sequences revealed that *V. mali* CEPs have several commonly known properties of effector proteins. They have shorter sequence length relative to non-CEPs, with an average of 233 amino acids, and are also cysteine rich (**Figure 1**). Markov clustering suggests that there is no obvious expansion of *V. mali* CEP families, and most families contain only one member.
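The CEP definition amounts to a two-step filter, sketched below under stated assumptions: each predicted protein carries a secretion flag and the *e*-value of its best hit against functionally annotated proteins (or no hit at all), and "no known function" means that this best hit is absent or weaker than the 1e–5 threshold. The record layout and identifiers are hypothetical; the counts in the final lines are those reported in the text.

```python
E_VALUE_CUTOFF = 1e-5  # threshold quoted in the text


def is_cep(is_secreted, best_hit_evalue):
    """Secreted protein with no hit, or only a hit weaker than the cutoff."""
    if not is_secreted:
        return False
    return best_hit_evalue is None or best_hit_evalue > E_VALUE_CUTOFF


# Hypothetical records: (identifier, predicted secreted?, best-hit e-value).
records = [
    ("protA", True, None),    # no hit at all
    ("protB", True, 3e-2),    # only a weak hit
    ("protC", True, 1e-40),   # strong hit to an annotated protein
    ("protD", False, None),   # not secreted
]
ceps = [pid for pid, secreted, evalue in records if is_cep(secreted, evalue)]
print(ceps)

# Proportions implied by the counts reported in the text.
print(f"CEPs among secreted proteins: {193 / 779:.1%}")
print(f"V. mali-specific CEPs: {101 / 193:.1%}")
```

On the reported counts, roughly a quarter of the predicted secretome falls into the CEP class, and about half of those CEPs are specific to *V. mali*.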

Search for conserved motifs in these 193 CEPs showed that 71 of them contain a total of 97 Y/F/WxC motifs, 22 contain 26 L/IxAR motifs, and seven contain a single RxLR motif each.

No additional conserved motifs were identified by *de novo* prediction. In addition, nine CEPs are predicted to contain NLSs.

### CEPs of *V. mali* Suppress BAX-induced PCD in *N. benthamiana*

Phytopathogen effectors often induce a phenotype upon overexpression *in planta*, reflecting their virulence activity. To investigate the function of the CEPs of *V. mali*, we tested the ability of CEPs to induce cell death, or to suppress BAX-induced PCD, in *N. benthamiana*. Based on transient over-expression in *N. benthamiana* performed for 70 randomly selected CEPs, seven *V. mali* Effector Proteins (VmEPs) significantly suppressed BAX-induced PCD. The others had little or no effect on PCD (**Figure 2**). In addition, none of the 70 CEPs tested induced cell death in *N. benthamiana*. Of these seven VmEPs, VM1G\_05336 is *V. mali*-specific and the others are hypothetical proteins that have homologs in the GenBank NR database. Intriguingly, VmEP1 (VM1G\_02400) contains a HeLo domain (**Table 1**) that is found exclusively in the fungal kingdom and is similar to other cell death and apoptosis-inducing domains (Fedorova et al., 2005).

### Transcription Level of *V. mali* Candidate Effector Genes

Fungal effector proteins are generally characterized by specific expression during invasion of plant cells. The expression levels of the seven VmEP genes identified above were assayed by qRT-PCR with the housekeeping gene *G6PDH* as control (**Figure 3**). The results showed that five of the seven VmEP genes were up-regulated (fold change >2) during infection. In particular, the *V. mali*-specific VM1G\_05336 was strongly induced at 6 and 24 hpi.

### Candidate Effector VmEP1 is a Virulence Factor of *V. mali*

To investigate the function of VmEPs, the putative necrosis-inducing protein VmEP1 (VM1G\_02400) and the *V. mali*-specific VM1G\_05336 were chosen for further study. Gene deletion was performed using a gene replacement strategy (**Figure 4A**). However, only deletion mutants of *VmEP1* were obtained. Among the resulting 125 hygromycin-resistant transformants of *VmEP1*, two were identified by PCR analysis (**Figure 4B**). These two *VmEp1* deletion mutants (74# and 85#)

*benthamiana* leaves after agro-infiltration. (A–G) Seven VmEPs suppress BAX-induced cell death. (H) Most CEPs, like VM1G\_04728, could not suppress BAX-induced PCD. Leaves were infiltrated with *A. tumefaciens* containing a PVX within the regions indicated by the dashed lines. Photos were taken 4 to 5 days after the last infiltration. Numbers, for instance 20/27, indicate that infiltrated leaves showed the same symptoms in 20 out of 27 infiltrations.



FIGURE 3 | Relative expression level of *VmEps* at 0, 6, 12, 24, 36, and 48 hours post inoculation (hpi) using reference gene *G6PDH* for normalization. Results are presented as a mean fold change in relative expression compared to the 0 hpi sampling stage. All experiments were repeated three times. Error bars indicate SEM.

Primer binding sites are indicated by black arrows (see Supplementary Table S2 for primer sequences). (B) Identification of *VmEp1* knockout

(D) Identification of *VmEp1* complementation mutants by PCR analysis using primer pair VmEP1-5F/6R.

were further confirmed by Southern hybridization (**Figure 4C**). Complementation of both mutants using a *VmEp1*-expressing plasmid was again confirmed by PCR using the primer pair VmEP1-5F/6R (**Figure 4D**). Deletion of *VmEp1* had no effect on vegetative growth or conidiation on PDA medium (**Figure 5**). Intriguingly, pathogenicity assays showed that both deletion mutants had significantly reduced virulence on apple twigs and leaves (**Figure 6**). The Δ*VmEp1*/*VmEp1* complemented mutants 74#-1 and 85#-1 exhibited virulence similar to that of the wild-type isolate 03–8 (**Figure 6**). Results from epifluorescence microscopy

show that deletion mutants have reduced mycelial growth within the leaf compared to the wild type, a defect that can be rescued by introducing the complementing plasmid (**Figure 7**). These results indicate that *VmEp1* is involved in virulence of *V. mali*.

### Discussion

In the genome of *V. mali*, we previously identified 193 genes encoding secreted proteins with no known function (Yin et al., 2015). In this study, we identified seven *V. mali* Effector Protein

(VmEP) genes by a functional screen of these proteins, using a virus-based *in planta* over-expression system. We showed that transient expression of each of these seven VmEPs significantly suppressed BAX-induced PCD in *N. benthamiana*, and that five of the corresponding genes were up-regulated during infection. Because BAX-triggered PCD physiologically resembles defense-related HR (Lacomme and Santa Cruz, 1999), the ability to suppress it is a valuable initial screen for pathogen effectors (Wang et al., 2011a). This result therefore strongly suggests that *V. mali* expresses secreted proteins during plant infection that can suppress effector-triggered immunity (ETI) responses in the host.

treatments were performed with at least five replicates, and all experiments were repeated three times. Bar = 10 µm.

Effectors of necrotrophic pathogens interact with their host in a gene-for-gene relationship to initiate disease, which leads to effector-triggered susceptibility (ETS) (Jones and Dangl, 2006; Oliver and Solomon, 2010; Wang et al., 2014). The host-specific toxins secreted by *Cochliobolus victoriae* are translocated into plant cells to interact with specific corresponding host proteins to promote host cell death (Donofrio and Wilson, 2014; Kuhn and Panstruga, 2014). Likewise, the proteinaceous host-specific toxin *Ptr*ToxA produced by *P. tritici-repentis* targets a host chloroplastic protein, which disrupts the photosynthetic capacity and triggers PCD (Stergiopoulos et al., 2013; Petre and Kamoun, 2014). Indeed, most of the identified effectors of necrotrophic pathogens are found to promote host cell death (Wang et al., 2014). However, we screened these effector candidates on the non-host *N. benthamiana*, where seven of the 70 examined candidate effectors of *V. mali* suppressed BAX-induced PCD and not a single one induced PCD. Considering that many effectors of necrotrophic pathogens are host-specific toxins, it is necessary to examine the ability of these CEPs to induce necrosis on hosts to determine whether they could be proteinaceous toxins.

Recently, the classic necrotrophic fungus *S. sclerotiorum* was found to have a biotrophic phase at the very early stage of infection (Kabbage et al., 2013). Oxalic acid secreted by this fungus can suppress host autophagy, which is a defense response in its interaction with its host. Indeed, *S. sclerotiorum* suppresses the defense-related autophagy/PCD at an early stage of infection, and promotes disease-related apoptosis/PCD after infection is established (Kabbage et al., 2013). This means that not all forms of cell death are equivalent. One type of cell death may be suppressed by a necrotrophic pathogen to inhibit plant defense responses, while another may be promoted to facilitate disease progress. The effectors identified in this study in *V. mali* probably participate in the suppression of defense-related PCD. Nevertheless, the interaction targets of these effector proteins in apple need to be determined to verify their exact roles in the *V. mali*–apple interaction. This is especially true for VmEP1, because targeted deletion of the *VmEP1* gene results in a significant reduction in virulence.

As a necrotrophic fungus, *V. mali* is thought to contain virulence factors that induce cell death. Particularly, the impressive arsenal of plant cell wall degrading enzymes and secondary metabolites may account for the severe necrosis observed on apple bark (Yin et al., 2015). Deletion of six pectinase genes significantly reduced virulence of *V. mali* (Yin et al., 2015). Phytotoxic small polypeptides secreted by the closely related peach canker pathogens *Leucostoma persoonii* and *L. cincta* can only induce stem necrosis on host plants, reflecting a host specific characteristic (Svircev et al., 1991). In addition, *V. mali* also possesses homologs of necrosis-inducing factors including NPP1, Ecp2, and Epl (Yin et al., 2015). Likewise, these potential virulence factors also need to be functionally verified.

### Author Contributions

ZL and ZY contributed equally to this work as first authors. LH designed and managed the project. ZL, YF, and MX performed the experimental work. ZY performed all the computational analysis. ZL, ZY, ZK, and LH wrote the paper.

### Acknowledgments

This work was financially supported by the National Natural Science Foundation of China (No. 31471732, 31171796), the Program for Agriculture (nyhyzx201203034-03) and the 111 Project (B07049). The authors wish to thank Fengming Song (Zhejiang University, China) for providing the pBIG2RHPH2-GFP-GUS plasmid and Dr. Ralf T. Voegele at Universität Hohenheim for proofreading this manuscript.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00579

### References

*sclerotiorum*. *PLoS Pathog.* 9:e1003287. doi: 10.1371/journal.ppat.1003287


**Conflict of Interest Statement:** The review editor Weixing Shan declares that, despite being affiliated with the same institution as authors Zhiyuan Yin, Xu Ming, and ZhenPeng Li, the review process was carried out objectively. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Li, Yin, Fan, Xu, Kang and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Repeat-containing protein effectors of plant-associated organisms

Carl H. Mesarich<sup>1,2</sup>\*, Joanna K. Bowen<sup>2</sup>, Cyril Hamiaux<sup>3</sup> and Matthew D. Templeton<sup>1,2</sup>

*<sup>1</sup> School of Biological Sciences, The University of Auckland, Auckland, New Zealand, <sup>2</sup> Host–Microbe Interactions, Bioprotection, The New Zealand Institute for Plant & Food Research Ltd, Auckland, New Zealand, <sup>3</sup> Human Responses, The New Zealand Institute for Plant & Food Research Limited, Auckland, New Zealand*

Many plant-associated organisms, including microbes, nematodes, and insects, deliver effector proteins into the apoplast, vascular tissue, or cell cytoplasm of their prospective hosts. These effectors function to promote colonization, typically by altering host physiology or by modulating host immune responses. The same effectors however, can also trigger host immunity in the presence of cognate host immune receptor proteins, and thus prevent colonization. To circumvent effector-triggered immunity, or to further enhance host colonization, plant-associated organisms often rely on adaptive effector evolution. In recent years, it has become increasingly apparent that several effectors of plant-associated organisms are repeat-containing proteins (RCPs) that carry tandem or non-tandem arrays of an amino acid sequence or structural motif. In this review, we highlight the diverse roles that these repeat domains play in RCP effector function. We also draw attention to the potential role of these repeat domains in adaptive evolution with regards to RCP effector function and the evasion of effector-triggered immunity. The aim of this review is to increase the profile of RCP effectors from plant-associated organisms.

#### Edited by:

*Maryam Rafiqi, Computomics GmbH, Germany*

#### Reviewed by:

*Ralph Panstruga, RWTH Aachen University, Germany Francine Govers, Wageningen University, Netherlands*

> \*Correspondence: *Carl H. Mesarich carl.mesarich@gmail.com*

#### Specialty section:

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

Received: *03 August 2015* Accepted: *01 October 2015* Published: *21 October 2015*

#### Citation:

*Mesarich CH, Bowen JK, Hamiaux C and Templeton MD (2015) Repeat-containing protein effectors of plant-associated organisms. Front. Plant Sci. 6:872. doi: 10.3389/fpls.2015.00872*

Keywords: repeat-containing protein effectors, plant-associated organisms, microbes, nematodes, insects

## EFFECTORS OF PLANT-ASSOCIATED ORGANISMS

Diverse plant-associated organisms, including bacteria, fungi, oomycetes, nematodes, and insects, secrete or inject a suite of proteins, termed effectors, into the tissues of their prospective hosts (Bozkurt et al., 2012; Deslandes and Rivas, 2012; Mitchum et al., 2013; Jaouannet et al., 2014; Lo Presti et al., 2015). These effectors, which localize to the host apoplast, or are targeted to various plant cell compartments, function to promote colonization, typically by altering host physiology or by modulating host immune responses (Hogenhout et al., 2009; Win et al., 2012a). Certain host plants however, have evolved immune receptor proteins that are capable of directly or indirectly recognizing one or more of these effectors or their modulated host targets respectively, to trigger immune responses that prevent colonization (Böhm et al., 2014; Cui et al., 2015). To circumvent these recognition events, or to provide novel, altered, or extended effector functionalities that further enhance the colonization of susceptible hosts, plant-associated organisms often rely on effector modification through adaptive evolution, as driven by host-imposed selection pressure (e.g., Stergiopoulos et al., 2007; Win et al., 2007; Dong et al., 2014).

### SEVERAL EFFECTORS OF PLANT-ASSOCIATED ORGANISMS ARE REPEAT-CONTAINING PROTEINS

Proteins that make up the effector repertoires of plant-associated organisms possess a range of different features. For example, most carry a signal peptide for targeted secretion or delivery to the host environment. In addition, many effectors, particularly those of fungi, are small and/or cysteine-rich, while others may possess a nuclear localization signal (NLS) or, as shown for several effectors of filamentous plant-associated organisms, a conserved effector motif (Dou and Zhou, 2012). The secretomes, and thus effector repertoires, of plant-associated organisms also differ in their proportion of repeat-containing proteins (RCPs). This is best illustrated by the predicted secretomes of *Melampsora larici-populina* and *Puccinia graminis* f. sp. *tritici*, the fungal pathogens responsible for poplar leaf rust and wheat stem rust, respectively. In a study by Saunders et al. (2012), it was revealed that of the 1549 secreted proteins predicted from the proteome of *M. larici-populina*, 493 (∼32%) were RCPs. In contrast, no RCPs could be identified among the 1852 secreted proteins predicted from the proteome of *P. graminis* f. sp. *tritici* (Saunders et al., 2012). As such, RCP effectors are expected to play an important role in promoting host colonization by some, but not all, plant-associated organisms. This is supported by the fact that several known effectors of plant-associated organisms are RCPs (**Tables 1**–**3**). For the purpose of this review, we define RCPs as those proteins that carry two or more copies of a tandemly or non-tandemly duplicated sequence or structural motif that is at least five amino acid residues in length.
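The sequence-based part of this RCP definition can be illustrated with a deliberately simple Python sketch. It assumes perfect (identical) repeats only and reports any stretch of five residues that occurs at two or more non-overlapping positions, whether tandem or not. It does not detect imperfect, degenerate, or purely structural repeats, and it is not one of the published repeat-detection tools; the function names and toy sequences are hypothetical.

```python
from collections import defaultdict

MIN_REPEAT_LEN = 5  # minimum repeat length in the review's RCP definition


def repeated_kmers(seq, k=MIN_REPEAT_LEN):
    """Map each k-mer found at >= 2 non-overlapping positions to its starts."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        starts = positions[seq[i:i + k]]
        # Record a copy only if it does not overlap the previous copy.
        if not starts or i - starts[-1] >= k:
            starts.append(i)
    return {kmer: starts for kmer, starts in positions.items()
            if len(starts) >= 2}


def is_rcp(seq, k=MIN_REPEAT_LEN):
    """True if the sequence carries at least one exact duplicated k-mer."""
    return bool(repeated_kmers(seq, k))


# Hypothetical toy sequences (0-based start positions are reported).
print(repeated_kmers("MAPGTSAPGTSKLQ"))    # tandem copies of APGTS
print(repeated_kmers("MAPGTSKKKKAPGTSQ"))  # non-tandem copies of APGTS
print(is_rcp("MACDEFGHIK"))                # no duplicated 5-mer
```

Because real effector repeats are often imperfect, a sketch of this kind would miss many of the RCPs discussed in this review; it is intended only to make the length and copy-number criteria concrete.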

Various bioinformatic tools, databases, and servers are available for the detection of repeat domains in protein sequences (reviewed in Kajava, 2012; Luo and Nijveen, 2014). Typically, perfect (identical) or imperfect (near-identical) sequence repeats are easily detected, as are those repeats with homology to known functional domains. However, the detection of highly degenerate (divergent) sequence repeats, which carry amino acid substitutions, insertions, or deletions that have accumulated during evolution, is often more difficult. In some instances, degenerate sequence repeats may only be identified following an analysis of protein tertiary structure, for which servers are again available (see Kajava, 2012). Indeed, this has been the case for several effectors of plant-associated organisms. As an example of this, structural characterization of both the AvrM-A effector from *Melampsora lini*, a fungal rust pathogen of flax, as well as AvrPtoB, a type III effector from *Pseudomonas syringae* pv. *tomato* (*Pst*), the bacterial speck pathogen of tomato, revealed the presence of two four-helix bundle repeats (**Figures 1A,B**, **2B**) (Dong et al., 2009; Cheng et al., 2011; Ve et al., 2013). Bioinformatic tools though, have been shown to play a key role in the identification of certain highly degenerate repeat domains. For example, Jiang et al. (2008) used the MEME algorithm (Bailey et al., 2015), together with hidden Markov model (HMM) searches, to identify RXLR effectors from two plant-associated oomycete species (*Phytophthora sojae* and *Phytophthora ramorum*) that carry conserved, but highly degenerate, C-terminal WYL motifs, or WY motifs, which often form tandem repeats. In oomycete plant pathogens, RXLR effectors represent one of the largest and most diverse effector families (Jiang et al., 2008). Jiang et al. (2008) demonstrated that approximately half of the abovementioned RXLR effectors possess WYL motifs, with 30% possessing between two and eight repeated WYL modules. A comparison of RXLR effector tertiary structures has since revealed that a three-helix bundle fold, termed the WY domain, is the basic structural unit adopted by the WY motifs (Boutemy et al., 2011; Win et al., 2012b). One of these structurally characterized RXLR effectors, ATR1, which is produced by *Hyaloperonospora arabidopsidis*, the oomycete downy mildew pathogen of *Arabidopsis thaliana*, carries two five-helix bundle WY domain repeats (**Figure 2A**) (Chou et al., 2011). Notably though, this tandem repeat was only identified upon structural characterization of ATR1, with a prior HMM-based bioinformatic screen identifying only one of the two WY domains present in this effector (Boutemy et al., 2011). This example therefore highlights the difficulties associated with identifying highly degenerate repeat domains. More recently though, Ye et al. (2015) have demonstrated that WYL motifs have highly conserved α-helical secondary structures. Furthermore, the few amino acid residues that are conserved between such WYL or WY motifs have been shown to be hydrophobic, occupying buried positions within these α-helices (Boutemy et al., 2011; Chou et al., 2011; Win et al., 2012b; Ye et al., 2015). Thus, an integrated approach, combining HMM screens, together with secondary structure predictions and surface accessibility profiles, can be employed to identify the degenerate, and often repeated, WYL or WY motifs present in oomycete RXLR effectors.

### REPEAT DOMAINS PLAY DIVERSE ROLES IN RCP EFFECTOR FUNCTION

Collectively, repeat domains play diverse roles in the biological function of RCP effectors from plant-associated organisms (**Tables 1**–**3**). In brief, these roles can range from directing effector localization, to mediating interaction with one or more specific RNA, DNA, protein, or carbohydrate targets, to providing effector stability. It is becoming increasingly clear that these roles are intimately linked to the composition or architecture of the repeat domains that perform them. For example, as shown in **Figures 1**, **2**, the repeat domain of an RCP effector, like that of many other RCPs (Grove et al., 2008), frequently exhibits an extended modular, non-globular architecture. This in turn provides the effector with a larger surface area-to-volume ratio than that of a typical globular protein of equivalent amino acid length, a feature that is particularly well-suited to certain functional roles. This is elegantly illustrated by the transcription activator-like (TAL) effectors of the bacterial plant pathogens, *Xanthomonas* spp., which interact with host DNA in the plant cell nucleus to hijack host genes (by transcriptional activation) whose expression


TABLE 1 | Continued

*<sup>c</sup>PPR and ankyrin repeats were predicted using TPRpred (http://toolkit.tuebingen.mpg.de/tprpred) and InterProScan 5 (http://www.ebi.ac.uk/Tools/pfa/iprscan5/), respectively.*



TABLE 2 | Continued


promotes bacterial growth and/or disease symptom formation (Boch and Bonas, 2010). TAL effectors carry a central repeat domain that possesses up to 33.5 near-identical tandem repeats of 30–42 amino acids in length, followed by a carboxyl (C) terminal region that contains both NLSs and a eukaryotic acidic activation domain (Boch and Bonas, 2010). As shown for PthXo1, a TAL effector from the rice blight pathogen, *Xanthomonas oryzae* pv. *oryzae*, the central repeat domain forms an extended surface area of interaction with host DNA, in which the repeat domain adopts an α-solenoid structure that physically wraps around the DNA molecule (**Figure 1C**) (Deng et al., 2012; Mak et al., 2012). More specifically, the individual repeat units mediate the direct binding of single consecutive nucleotide bases within the promoter sequence (i.e., the effector-binding element; EBE) of a host gene. This specificity is governed by amino acid residues 12 and 13 of each repeat unit, termed the repeat-variable di-residues (RVDs), which make specific contact with the host DNA and play a stabilizing role, respectively (Boch et al., 2009; Moscou and Bogdanove, 2009). The functional relevance of this repeat structure was reinforced by artificial TAL effectors carrying a variable number of repeat units. Boch et al. (2009) were able to show that a minimum of 6.5 repeat units are necessary for EBE recognition and subsequent transcriptional activation, while 10.5 or more repeat units are required for strong target gene expression.
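The one-repeat-to-one-base relationship described above can be made concrete with a short Python sketch. It reads residues 12 and 13 of each repeat unit and translates them with a small lookup table. The table lists only four commonly reported RVD associations (HD with C, NG with T, NI with A, NN with G or A) and is a simplification supplied for this sketch rather than something given in the text; the 34-residue repeat template, the function names, and the activity labels are hypothetical.

```python
# Simplified RVD-to-base lookup (assumption of this sketch).
# "R" is the IUPAC code for G or A; unknown RVDs are reported as "N".
RVD_TO_BASE = {"HD": "C", "NG": "T", "NI": "A", "NN": "R"}

# Hypothetical 34-residue repeat template; "{}" marks residues 12 and 13.
TEMPLATE = "LTPEQVVAIAS{}GGKQALETVQRLLPVLCQAHG"


def rvd_of(repeat):
    """Return residues 12 and 13 (1-based) of one repeat unit."""
    return repeat[11:13]


def predict_target(repeats):
    """Predict the bound sequence, one base per consecutive repeat unit."""
    return "".join(RVD_TO_BASE.get(rvd_of(r), "N") for r in repeats)


def activity_class(n_repeats):
    """Apply the repeat-number thresholds quoted from Boch et al. (2009)."""
    if n_repeats < 6.5:
        return "below minimum for EBE recognition"
    if n_repeats < 10.5:
        return "activation"
    return "strong activation"


repeats = [TEMPLATE.format(rvd) for rvd in ("HD", "NG", "NI", "NN")]
print(predict_target(repeats))
print(activity_class(4), "|", activity_class(8.5), "|", activity_class(17.5))
```

The half-integer repeat counts reflect the truncated final repeat of the TAL effector repeat domain, which is why the thresholds are written as 6.5 and 10.5 rather than whole numbers.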

An extended modular, non-globular architecture, as adopted by the repeat domains of many RCPs, is also particularly well-suited to mediating various protein–protein interactions (Grove et al., 2008). Indeed, many classes of repeat domains serve as scaffolds or adaptors. When performing this role, different repeat units, or regions of a repeat unit, may organize multiple proteins into functional complexes. Alternatively, interactions between different proteins, or between proteins and other functional domains present in the RCP, may be facilitated (Grove et al., 2008). Importantly, these roles are supported by the inherent conformational flexibility of the repeat domain, as mediated through, for instance, a flexible hydrophobic core (Kappel et al., 2010), or flexible inter-repeat hinges, loops, or linkers, similar to those found in Cin1, a candidate effector of unknown function from the apple scab fungus, *Venturia inaequalis* (**Figure 2C**) (Mesarich et al., 2012). Domains that may perform such a role include, for example, those comprising ankyrin or HEAT/armadillo repeats, which, like the repeat domains present in TAL effectors, adopt an α-solenoid-type architecture, as well as leucine-rich repeats (LRRs), which adopt an α/β-solenoid-like or horseshoe-type fold (Kajava, 2012). Notably, several effectors from plant-associated organisms carry such repeat domains. For example, effectors of the bacterial wilt pathogen, *Ralstonia solanacearum*, including RipAP, RipBB, RipBC, and RipY, carry ankyrin repeats (Peeters et al., 2013), while other effectors of *R. solanacearum* and *Xanthomonas* spp., including RipS1–RipS8, XopAD, and XopN, carry HEAT/armadillo repeats (White et al., 2009; Peeters et al., 2013). In addition, several effectors from *R. solanacearum* (RipG1–RipG7), *Xanthomonas* spp. (XopAC, XopAE, and XopL), and the gall-forming pest of cereals, *Mayetiola destructor* (SSGP-71 family), carry LRRs

*<sup>a</sup>Protein length in amino acids (aa).*

*<sup>b</sup>Repeat hydropathy profiles were determined using the Expasy ProtScale server (http://web.expasy.org/protscale/), with default server settings.*

*<sup>c</sup>Cin1 is a candidate effector of V. inaequalis (Kucheryava et al., 2008).*

*<sup>d</sup>The length of SP7 remains unclear due to differential transcript splicing, with five versions of the mRNA transcript found at different developmental stages (Kloppholz et al., 2011).*

TABLE 2 | Continued


*cThe protein length varies between members of the RCP effector family.*


FIGURE 1 | Primary and tertiary structures of repeat domains from RCP effectors of plant-associated bacteria. (A) Crystal structure of repeat unit one from the AvrPtoB effector of the tomato bacterial speck pathogen, *Pseudomonas syringae* pv. *tomato* (*Pst*), in complex with the tomato Pto kinase (Protein Data Bank [PDB] code 3HGK; Dong et al., 2009). (B) Nuclear magnetic resonance (NMR) structure of repeat unit two from AvrPtoB of *Pst* in complex with the BAK1 kinase domain from *Arabidopsis thaliana* (3TL8; Cheng et al., 2011). Note that in (A), AvrPtoB repeat unit one interacts with the Pto kinase in a different orientation to that of AvrPtoB repeat unit two with the BAK1 kinase domain in (B). (C) Crystal structure of the repeat domain from the PthXo1 transcription activator-like (TAL) effector of the bacterial rice pathogen, *Xanthomonas oryzae* pv. *oryzae*, bound to its natural DNA target (36 bp). The repeats pack together to form a left-handed superhelix (α-solenoid) that wraps around the DNA molecule (3UGM; Mak et al., 2012). (D) Crystal structure of the N-terminal leucine-rich repeat (LRR) domain from the XopL effector of the bacterial leaf spot pathogen of pepper and tomato, *Xanthomonas euvesicatoria* (4FCG; Singer et al., 2013). Structural coordinate files were downloaded from the Research Collaboratory for Structural Bioinformatics (RCSB) PDB (http://www.rcsb.org/pdb/home/home.do). Alternating repeat units are colored blue, slate, and cyan, respectively. Non-repetitive sequence is colored gray. The molecular surface of Pto kinase in (A) and BAK1 kinase domain in (B) are shown in gray, while the DNA molecule in (C) is colored red. An amino acid sequence alignment detailing the primary structure of each RCP effector repeat domain is shown to the right of each tertiary structure (as based on that presented in each tertiary structure). Repeat (R) units are numbered according to their position in the RCP effector. The start and end position of each repeat unit in the full-length RCP effector is shown. Conserved (\*) and strongly similar (:) amino acid residues shared between repeat units are shown below the sequence alignment (based on full-length repeat units only). The figure was prepared using PyMol (https://www.pymol.org/) and Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/).

sequence is colored gray. The chitin tetramer in (D) is colored red. An amino acid sequence alignment detailing the primary structure of each RCP effector repeat domain is shown to the right of each tertiary structure (as based on that presented in each tertiary structure). Repeat (R) units are numbered according to their position in the RCP effector. The start and end position of each repeat unit in the full-length RCP effector is shown. Conserved (\*) and strongly similar (:) amino acid residues shared between repeat units are shown below the sequence alignment. The figure was prepared using PyMol (https://www.pymol.org/) and Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/). Structure-based sequence alignments of repeat units from ATR1 and AvrM-A are adapted from Chou et al. (2011) and Ve et al. (2013), respectively.

(**Figure 1D**) (Xu et al., 2008; White et al., 2009; Peeters et al., 2013; Zhao et al., 2015).

Of the effectors mentioned above, one of the best characterized to date is XopN, a type III effector widely conserved across Xanthomonas spp. that suppresses host immune responses (Roden et al., 2004; Kim et al., 2009; Taylor et al., 2012). XopN from the leaf spot pathogen of pepper and tomato, Xanthomonas euvesicatoria, carries seven tandem HEAT/armadillo-like repeats (Roden et al., 2004). This effector interacts with the atypical LRR-receptor-like kinase (RLK), TARK1 (via the XopN non-repetitive N-terminal region), and the 14-3-3 isoform, TFT1 (via the XopN C-terminal HEAT/armadillo-like repeats), two positive regulators of host immunity in tomato, near and at the plant cytoplasmic–plasma membrane (PM) interface, respectively (Kim et al., 2009; Taylor et al., 2012). In addition to these binary interactions, XopN also engages in tertiary interactions with TARK1 and TFT1 at the plant cytoplasmic–PM interface (Kim et al., 2009; Taylor et al., 2012). Here, XopN is expected to promote and/or stabilize TARK1/TFT1 complex formation by functioning as a protein bridge or molecular scaffold (Taylor et al., 2012). Currently, however, it remains unclear how these interactions suppress host immune responses. One possibility is that XopN interferes with TARK1 protein–protein interactions, stability and/or signal transduction, and in the case of TFT1, client interactions (Kim et al., 2009; Taylor et al., 2012). Another possibility, given that TARK1 and TFT1 do not interact in the absence of XopN, is that the binding of this effector to these proteins in either binary or tertiary complexes leads to the sequestration of inactive immune complexes at or near the plant cytoplasmic–PM interface, thereby preventing downstream immune signaling (Taylor et al., 2012).

Other repeat domain architectures and compositions have been shown to play an important role in the function of RCP effectors from plant-associated organisms. One such example is provided by Ecp6, an effector of the tomato leaf mold fungus, Cladosporium fulvum, which carries three lysin motif (LysM) domains that each adopt a βααβ-fold as part of an overall globular structure (**Figure 2D**) (Bolton et al., 2008; Sánchez-Vallet et al., 2013). Ecp6 molecules sequester chitin oligosaccharides released from the cell wall of C. fulvum during infection. In doing so, Ecp6 prevents the recognition of these oligosaccharides by host chitin immune receptors, thereby perturbing chitin-triggered immunity (de Jonge et al., 2010). More specifically, two of the three LysM domains, LysM1 and LysM3, undergo chitin-induced dimerization, in which the domains cooperate to produce a deeply buried chitin-binding groove (**Figure 2D**). This groove binds a single chitin oligosaccharide with ultrahigh affinity, and is sufficient to out-compete host chitin immune receptors for chitin binding (Sánchez-Vallet et al., 2013). Another example is provided by GrCLE1, an effector of the potato cyst nematode, Globodera rostochiensis (Lu et al., 2009). GrCLE1 possesses a variable domain, followed by a C-terminal region with four 12-amino acid repeats that have similarity to plant CLAVATA3 (CLV3)/endosperm surrounding region (ESR) related (CLE) peptides (Lu et al., 2009). In plants, endogenous CLE protein precursors are post-translationally modified and proteolytically processed to give bioactive CLE peptides. These peptides then function as hormones that interact with various extracellular plant receptors to regulate many aspects of plant growth and development (Kucukoglu and Nilsson, 2015). Like plant CLE protein precursors, GrCLE1 is post-translationally modified and proteolytically processed by plant machinery to produce bioactive CLE-like peptides (Guo et al., 2011; Chen et al., 2015). These peptides then function as endogenous plant CLE peptide mimics, directly binding plant RLKs, including CLV2, BAM1, and BAM2, to alter plant root growth and development for the promotion of plant parasitism (Lu et al., 2009; Guo et al., 2011; Chen et al., 2015).

### SEVERAL RCPS OF PLANT-ASSOCIATED ORGANISMS ARE SURFACE-ASSOCIATED

An important point to stress is that several RCPs of plant-associated organisms are surface-associated. That is, they are attached to, or are integrated into, the cell wall and/or PM through various covalent/non-covalent linkages or transmembrane domains, and are at least partially exposed to the extracellular environment. Although not classified as typical secreted effectors, a number of these surface-associated RCPs, and more specifically their repeat domains, have been shown or are hypothesized to play a role in interactions between plant-associated organisms and their hosts (e.g., Görnhardt et al., 2000; Robold and Hardham, 2005; Lanver et al., 2010; Pradhan et al., 2012). An example is provided by CBEL, a cell wall glycoprotein from Phytophthora parasitica var. nicotianae (Ppn), the oomycete root pathogen responsible for black shank disease of tobacco (Nicotiana tabacum) (Séjalon-Delmas et al., 1997; Villalba Mateos et al., 1997). CBEL possesses two repeats, each comprising a carbohydrate-binding module family 1 (CBM1)/fungal-type cellulose-binding domain (CBD) attached to a PAN/APPLE domain (Séjalon-Delmas et al., 1997; Villalba Mateos et al., 1997). Functional analyses have determined that these CBDs play a role in the adhesion of Ppn mycelia to cellulosic substrates, including plant cell walls, and in the organized deposition of the Ppn cell wall polysaccharide, β-glucan (Villalba Mateos et al., 1997; Gaulin et al., 2002, 2006). Interestingly, CBEL also elicits strong host immune responses when infiltrated into tobacco (Villalba Mateos et al., 1997), as well as various non-host plants, including A. thaliana (Khatib et al., 2004; Gaulin et al., 2006). These responses are dependent upon the binding of CBEL to the plant cell wall, as mediated through the CBDs (Gaulin et al., 2006). A second example is provided by Rep1 of the corn smut fungus, Ustilago maydis, which carries 12 mostly tandem repeats of 34–55 amino acids in length (Wösten et al., 1996). These repeats, which carry Kex2 recognition sites, are processed in the secretory pathway to 11 repellent peptides that form rigid surface-active amyloid-like fibrils at the hyphal surface, and play a role in cellular attachment to hydrophobic surfaces (e.g., the plant surface) and in the formation of aerial hyphae (Wösten et al., 1996; Teertstra et al., 2006, 2009; Müller et al., 2008; Lanver et al., 2014).

### REPEAT DOMAINS MAY CONTRIBUTE TO THE ADAPTIVE EVOLUTION OF RCP EFFECTORS

Repeat domains can evolve in several different ways, including through changes in repeat unit number or order, as well as through amino acid substitutions or insertions/deletions (indels) in repeat units and/or associated interconnecting loop/linker regions. Changes in number or order, particularly for those repeat units encoded by long nucleotide sequences (≥10 nucleotides in length), likely evolve through intra- and intergenic recombination events (Richard and Pâques, 2000). As shown in other systems, the mutation rates associated with these changes can be orders of magnitude greater than those associated with point mutations, accelerating the evolution of the coding sequence to which they belong (reviewed in Gemayel et al., 2010). Indeed, repeat unit number and/or order has commonly been shown to vary between RCP effectors and RCP effector candidates of individuals, strains, or isolates of the same species or pathovar of plant-associated organism (e.g., Allen et al., 2004; Heuer et al., 2007; Jelenska et al., 2007; Kucheryava et al., 2008; Aggarwal et al., 2014). Changes in repeat unit number have also been shown to accompany the evolutionary paths of certain effector families from plant-associated organisms (e.g., Goss et al., 2013). Furthermore, chimeric RCP effectors, resulting from a recombination event between homologous repeat domains, have been reported (e.g., Yang et al., 2005), a finding that is not surprising, given the high number of RCP effectors that belong to multi-protein families (**Tables 1**–**3**). Although generally not as quick to accumulate, amino acid substitutions and indels also play an important role in generating sequence diversity within a repeat domain. However, these types of modification only occur following a duplication event. 
Again, such sequence variation has commonly been found to occur between the repeat units of RCP effectors or RCP effector candidates (see imperfect or degenerate repeat units listed in **Tables 1**–**3**), as well as between the repeat domains of RCP effectors and RCP effector candidates from individuals, strains, or isolates of the same species or pathovar of plant-associated organism (e.g., Kucheryava et al., 2008; Chou et al., 2011; Ve et al., 2013).
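The kind of between-unit sequence variation described above can be quantified in a simple way. The following is a minimal sketch, assuming the repeat units have already been aligned (for example with a tool such as Clustal Omega) and are supplied as equal-length strings with `-` for gaps. The function names and the toy sequences are hypothetical and are not taken from any effector discussed here.

```python
def conserved_columns(aligned_units):
    """Return a marker line with '*' under columns identical in all repeat units.

    Assumes every aligned unit has the same length; gap-containing
    columns are never marked as conserved.
    """
    length = len(aligned_units[0])
    if any(len(unit) != length for unit in aligned_units):
        raise ValueError("aligned repeat units must have equal length")
    marks = []
    for column in zip(*aligned_units):
        residues = set(column)
        is_conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if is_conserved else " ")
    return "".join(marks)


def pairwise_identity(unit_a, unit_b):
    """Fraction of identical residues over columns where neither unit has a gap."""
    pairs = [(a, b) for a, b in zip(unit_a, unit_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


# Toy aligned repeat units (hypothetical sequences, for illustration only).
units = ["LTPEQ", "LTPDQ", "LSPEQ"]
print(conserved_columns(units))
print(pairwise_identity(units[0], units[1]))
```

Low pairwise identity combined with a small set of conserved columns would be consistent with imperfect or degenerate repeat units, although a real analysis would also consider physicochemically similar substitutions.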

Of what relevance could this repeat domain variability be to plant-associated organisms? In industrial and animal-pathogenic yeasts, alterations to the repeat unit number and/or order of surface-associated RCPs, termed adhesins, have been shown to impart changes in adhesion phenotype, which may permit the rapid adaptation of these organisms to different substrates and host tissues, respectively (reviewed in Verstrepen and Fink, 2009). Furthermore, variability in the repeat domains of RCPs has been linked to the evasion of host immune responses in animal systems (e.g., Madoff et al., 1996; Mendes et al., 2013). In plant-associated organisms, the first indication that repeat domain variability could confer RCP effectors with an adaptive advantage, by providing a source of functional diversity, flexibility, and/or a means of evading host recognition, was provided by the experimental manipulation of AvrBs3, a TAL effector from X. euvesicatoria (Herbers et al., 1992). Typically, in a compatible interaction with pepper plants, AvrBs3 transcriptionally activates UPA20, a host gene that encodes a basic helix-loop-helix transcription factor, to trigger plant cell hypertrophy (Marois et al., 2002; Kay et al., 2007). However, in an incompatible interaction, AvrBs3 transcriptionally activates Bs3, a pepper gene that encodes an executor resistance protein with homology to flavin monooxygenases, to trigger host immunity (Römer et al., 2007, 2009). To dissect the molecular basis of Bs3-dependent immunity, Herbers et al. (1992) generated random deletion derivatives of AvrBs3 that differed in their repeat unit number. While most AvrBs3 deletion derivatives lost their ability to trigger Bs3-dependent immunity, others gained a new host specificity, triggering immunity in pepper plants carrying Bs3-E, an allele of Bs3 (Herbers et al., 1992). 
This research, which was subsequently confirmed by repeat domain swaps between other TAL effectors (e.g., Yang et al., 2005), demonstrated that it is the order, and thus the sequence, of TAL repeat units that determines host specificity. In addition, this research raised the possibility that recombination within or between the repeat domains of TAL effectors could produce novel effectors capable of activating different host genes (and thus promoting different host interaction phenotypes) as a consequence of their altered DNA recognition specificities. Indeed, evidence for inter- and intra-genic recombination events between TAL effectors has since been provided (Yang and Gabriel, 1995; Yang et al., 2005).

Aside from those present in TAL effectors, other repeat domains have been implicated in the adaptive evolution of RCP effectors from plant-associated organisms. An example is provided by the hypervariable (Gp-HYP) effectors of the potato cyst nematode, Globodera pallida, which are targeted to the host apoplast throughout biotrophy, and are required for successful root colonization (Eves-van den Akker et al., 2014). Gp-HYP effectors, which possess several conserved regions and a central repeat domain, are encoded by a large and incredibly complex gene family. Based on repeat domain amino acid sequence, these effectors can be assigned to one of three subfamilies (Gp-HYP-1, -2, and -3), with members of Gp-HYP-1 and -3 demonstrating high variability in the number, sequence, and order of their tandem repeats (Eves-van den Akker et al., 2014). Notably, Gp-HYP genes exhibit unparalleled diversity between individuals of the same population, with no two nematodes possessing the same genetic complement of Gp-HYP-1 and -3 genes. While it remains unclear what functional role the Gp-HYP repeat domains play in the context of plant parasitism by G. pallida, it has been suggested that their variability may reflect functional diversity, possibly in specificity of ligand binding. It has also been suggested that this variability may reflect the need to evade host recognition, possibly providing an explanation as to why breeding broad-spectrum resistance against this nematode has been so difficult (Eves-van den Akker et al., 2014). In another example, it has been suggested that the duplication and subsequent sequence diversification of CLE-like repeats present in the GrCLE effectors of G. rostochiensis may represent an important mechanism for generating functional diversity required for host parasitism. This is based on the finding that the ectopic over-expression of different GrCLE RCP effectors in A. thaliana leads to a wide range of plant phenotypes (Lu et al., 2009).

For several RCP effectors, including ATR1 of H. arabidopsidis (and other RXLR effectors from plant-pathogenic oomycetes), as well as AvrM-A of M. lini, and AvrPtoB of Pst, sequence diversification has been shown to play a particularly important role in driving repeat domain evolution, with the repeat units present in these effectors lacking significant amino acid sequence homology (Jiang et al., 2008; Dong et al., 2009; Chou et al., 2011; Ve et al., 2013). Instead, typically only those amino acid residues required for maintenance or stabilization of the overall tertiary fold or structural core have remained conserved or physicochemically similar between repeat units (Cheng et al., 2011; Chou et al., 2011; Ve et al., 2013). This in turn has provided these effectors with a conserved structural framework for rapid diversification, a feature that may promote functional diversity, flexibility, and/or a means of evading host recognition. Certainly, the repeat units of AvrPtoB provide an excellent example of functional flexibility. As mentioned previously, the N terminus and central region of this effector each carry a single repeat unit that adopts a four-helix bundle fold (repeat units one and two, respectively; **Figures 1A,B**), while the C terminus carries a U-box-type E3 ubiquitin ligase domain (Abramovitch et al., 2006; Janjusevic et al., 2006; Dong et al., 2009; Cheng et al., 2011). Remarkably, both repeat units play distinct and multiple roles in modulating host immune responses. For example, repeat units one and two bind and inhibit the kinase domain of the PM-localized host LysM-RLK and LRR-RLK immune receptors, Bti9 and BAK1, respectively, to suppress immunity-related signaling (Göhre et al., 2008; Shan et al., 2008; Cheng et al., 2011; Zeng et al., 2012). Repeat units one and two also bind the kinase domain of the LysM-RLK CERK1 and LRR-RLK FLS2 immune receptors, respectively, which may promote their ubiquitination and subsequent proteasome-dependent degradation via the AvrPtoB E3 ligase domain (Göhre et al., 2008; Gimenez-Ibanez et al., 2009). In addition, repeat unit one interacts with the host receptor-like cytoplasmic kinase (RLCK) Pto, while repeat unit two interacts with Pto and a related host RLCK, Fen (Rosebrock et al., 2007; Dong et al., 2009; Mathieu et al., 2014). Of note, in line with the observed sequence diversity, structural analyses have determined that repeat unit one interacts with the Pto kinase in a different orientation to that of repeat unit two with the BAK1 kinase domain (**Figures 1A,B**) (Dong et al., 2009; Cheng et al., 2011). Interestingly, in conjunction with Prf, an immune receptor of tomato, Pto is able to activate host immunity following its interaction with AvrPtoB (Kim et al., 2002; Mucyn et al., 2006; Dong et al., 2009). Fen, however, can only activate host immunity in the absence of the E3 ubiquitin ligase domain (Rosebrock et al., 2007). It has now been shown that interaction of either Pto or Fen with repeat unit two results in the proteasome-dependent degradation of these RLCKs as above (Rosebrock et al., 2007; Mathieu et al., 2014). Pto, however, is able to resist AvrPtoB-mediated degradation and activate Prf-dependent immunity following its interaction with repeat unit one, as this repeat unit is further away from the E3 ubiquitin ligase domain (Mathieu et al., 2014). It has been suggested that Pto and Fen evolved as decoys of the aforementioned non-cytoplasmic kinases to provide immunity against Pst (Block and Alfano, 2011).

### CONCLUSION AND PERSPECTIVE

Analyses of protein sequence and tertiary structure have revealed that several effectors of plant-associated organisms are RCPs. As reviewed here, repeat domains play diverse roles in RCP effector function. Furthermore, repeat domains may contribute to the rapid adaptive evolution of RCP effectors, providing a source of functional diversity, flexibility, and/or a means of evading host recognition. With these points in mind, it is perhaps not surprising that increased attention has been given to the identification of RCP effectors from plant-associated organisms (e.g., Mueller et al., 2008; Raffaele et al., 2010; Rudd et al., 2010; Saunders et al., 2012; Rafiqi et al., 2013). Undoubtedly, as (1) more genomes of plant-associated organisms are sequenced; (2) the tools of repeat identification become more powerful; and (3) additional effectors are structurally characterized, many more RCP effectors will be identified. The ongoing challenge will be to understand the precise roles that repeat domains play in the function and adaptive evolution of these effectors. Curiously, many of the repeat domain classes discussed in this review are also co-opted by plants to mediate ligand recognition and/or signaling associated with symbiosis, immunity, as well as physiology and development (Palma et al., 2005; Wang et al., 2006; Laluk et al., 2011; Gust et al., 2012; Böhm et al., 2014; Cui et al., 2015; Kucukoglu and Nilsson, 2015). Thus, as shown for the CLE-like repeats of GrCLE1 from G. rostochiensis (Lu et al., 2009; Guo et al., 2011), it is likely that many RCP effector repeat domains mimic host components associated with these processes to facilitate colonization.

Although not discussed in this review, we acknowledge that repeat domains can be intrinsically disordered (ID), a feature characterized by conformational flexibility and a lack of secondary or tertiary structure under physiological conditions (Dyson and Wright, 2005). In fact, repetitive sequence, along with a preponderance of charged and hydrophilic amino acid residues, is often a hallmark of ID (Dyson and Wright, 2005). Like the ordered (structured) repeat domains described above, ID regions carry out diverse roles in protein function, ranging from providing a flexible linker between structured domains, to mediating protein–protein interactions (Dyson and Wright, 2005). To date, examples of RCP effectors with such a repeat domain architecture remain limited, although ID has been predicted for the P/Q-rich repeats of HopI1, a type III effector from the Brassicaceae leaf spot pathogen, P. syringae pv. maculicola (**Table 1**; Jelenska et al., 2010; Marín and Ott, 2014). Of relevance, many ID regions are known to undergo induced folding upon interaction with their physiological targets, a process that gives rise to the unusual combination of low affinity and high specificity, which may allow these interactions to be readily reversible or may confer flexibility and promiscuity to target binding (Dyson and Wright, 2005). Furthermore, likely owing to a lack of structural constraints, ID protein sequences often evolve at a faster rate than ordered protein sequences, acquiring a greater number of single amino acid substitutions, insertions, deletions, and repeat unit expansions (Brown et al., 2011; Nilsson et al., 2011). Consequently, ID repeat domains are also of great interest to understanding how RCP effectors circumvent host recognition, or acquire novel, altered, and extended effector functionalities that further enhance the colonization of susceptible hosts (Marín et al., 2013; Marín and Ott, 2014).

### AUTHOR CONTRIBUTIONS

CM, JB, and MT conceived the review. CM wrote the manuscript. CM and CH prepared **Figures 1**, **2**. CM and JB constructed **Tables 1**–**3**. CM, JB, CH, and MT critically revised the manuscript. All authors approved the final version of the manuscript.

### ACKNOWLEDGMENTS

We thank Erik Rikkerink and Xiaolin Sun (The New Zealand Institute for Plant & Food Research) for critically reviewing the manuscript. CM acknowledges financial support provided by The New Zealand Bio-Protection Research Centre (BPRC), Lincoln University.

### REFERENCES


of the plant pathogen Pseudomonas syringae. Science 295, 1722–1726. doi: 10.1126/science.295.5560.1722


Kucheryava, N., Bowen, J. K., Sutherland, P. W., Conolly, J. J., Mesarich, C. H., Rikkerink, E. H., et al. (2008). Two novel Venturia inaequalis genes induced upon morphogenetic differentiation during infection and in vitro growth on cellophane. Fungal Genet. Biol. 45, 1329–1339. doi: 10.1016/j.fgb.2008.07.010


NLR receptor to an oomycete effector protein. PLoS Pathog. 11:e1004665. doi: 10.1371/journal.ppat.1004665


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Mesarich, Bowen, Hamiaux and Templeton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Effector-Mining in the Poplar Rust Fungus** *Melampsora larici-populina* **Secretome**

*Cécile Lorrain 1,2, Arnaud Hecker 1,2 and Sébastien Duplessis 1,2 \**

*1 INRA, UMR 1136 Interactions Arbres/Microorganismes INRA/Université de Lorraine, Centre INRA Nancy Lorraine, Champenoux, France, <sup>2</sup> Université de Lorraine, UMR 1136 Interactions Arbres/Microorganismes Université de Lorraine/INRA, Faculté des Sciences et Technologies, Vandoeuvre-lès-Nancy, France*

The poplar leaf rust fungus, *Melampsora larici-populina*, has been established as a tree-microbe interaction model. Understanding the molecular mechanisms controlling infection by pathogens appears essential for durable management of tree plantations. In biotrophic plant-parasites, effectors are known to condition host cell colonization. Thus, investigation of candidate secreted effector proteins (CSEPs) is a major goal in the poplar–poplar rust interaction. Unlike oomycete effectors, fungal effectors do not share conserved motifs, and candidate prediction relies on a set of *a priori* criteria established from reported *bona fide* effectors. Secretome prediction, genome-wide analysis of gene families and transcriptomics of *M. larici-populina* have led to catalogs of more than a thousand secreted proteins. Automated effector-mining pipelines hold great promise for rapid and systematic identification and prioritization of CSEPs for functional characterization. In this review, we report on and discuss the current status of the poplar rust fungus secretome and prediction of candidate effectors from this species.

### *Edited by:*

*Delphine Vincent, Department of Environment and Primary Industries, Australia*

#### *Reviewed by:*

*Liliana M. Cano, North Carolina State University, USA Lei Zhang, North Carolina State University, USA Peter Solomon, The Australian National University, Australia*

#### *\*Correspondence:*

*Sébastien Duplessis duplessi@nancy.inra.fr*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 28 August 2015 Accepted: 11 November 2015 Published: 15 December 2015*

#### *Citation:*

*Lorrain C, Hecker A and Duplessis S (2015) Effector-Mining in the Poplar Rust Fungus Melampsora larici-populina Secretome. Front. Plant Sci. 6:1051. doi: 10.3389/fpls.2015.01051*

**Keywords: effector protein, poplar rust, prediction pipeline, expert annotation, multigene families analysis**

## **INTRODUCTION**

Filamentous plant pathogens use secreted molecules for manipulating immunity and physiology of their hosts (Jones and Dangl, 2006; Kamoun, 2009; Win et al., 2012; Okmen and Doehlemann, 2014). Among these, secreted proteins and secondary metabolites can be defined as key players in the outcome and stability of host-parasite interactions with very diverse functions (MacLean et al., 2014; Rovenich et al., 2014; Lo Presti et al., 2015; Pusztahelyi et al., 2015). Chemical effectors (i.e., secondary metabolites) are secreted mainly by necrotrophs and hemibiotrophs during their necrotrophic phase (Kemen et al., 2015). Obligate biotrophs are organisms that grow, feed and reproduce on living host tissues. They exhibit small or very reduced sets of genes encoding secondary metabolites and cell-wall degrading enzymes while they possess large repertoires of effector proteins (Duplessis et al., 2014a; Kemen et al., 2015). In the case of obligate biotrophs such as rust fungi, investigations have largely focused on secreted proteins (SPs) of plant-associated organisms (i.e., the secretome) with potential for being candidate secreted effector proteins (CSEPs).

Rust fungi (Pucciniales, Basidiomycetes) are among the most studied fungal obligate biotrophs due to the degree to which they cause damage to many cultivated plants (Dean et al., 2012). Rust fungi are physically associated with their host cells through the formation of specialized infection structures called haustoria, which are known as secretion sites for effector proteins (Rafiqi et al., 2012; Petre and Kamoun, 2014). The biotrophic life style of rust fungi prohibits virtually all growth on synthetic media and makes genetic transformation very difficult to achieve. Therefore very little is known about the molecular mechanisms underlying the colonization of host tissues by rust fungi and to date, only six rust fungi effector proteins have been reported (see Petre et al., 2014 for review). Next generation sequencing technologies have provided access to genomes or transcriptomes for several rust fungi (Duplessis et al., 2014b). So far, genomes of five rust fungi have been published: the poplar rust *Melampsora larici-populina*, the wheat stem rust *Puccinia graminis* f. sp. *tritici*, the wheat stripe rust *Puccinia striiformis* f. sp. *tritici*, the flax rust *Melampsora lini* and the coffee rust *Hemileia vastatrix* (Cantu et al., 2011, 2013; Duplessis et al., 2011a; Zheng et al., 2013; Cristancho et al., 2014; Nemri et al., 2014). Secretomes of rust fungi have been determined based on the presence of predicted N-terminal signal peptides in proteins (Cantu et al., 2011, 2013; Duplessis et al., 2011a; Fernandez et al., 2012; Hacquard et al., 2012; Saunders et al., 2012; Bruce et al., 2013; Garnica et al., 2013; Zheng et al., 2013; Link et al., 2014; Nemri et al., 2014). Signal peptides can be defined using predictors available online (Emanuelsson et al., 2007). These predictions have revealed the presence of a plethora of SPs in rust fungal species. 
A recent comparison of genomic features in 84 plant-associated fungi has shown that the proteomes of obligate biotrophs are enriched in SPs, most of which are of unknown function (Lo Presti et al., 2015). This illustrates the importance of studying rust secretomes for identifying potential CSEPs.

The poplar leaf rust fungus *M. larici-populina* causes annual epidemics and severe damage to Northern European poplar plantations. Investigations of the poplar–poplar rust pathosystem using "-omic" approaches have led to significant progress in describing this interaction (Hacquard et al., 2011). The genome of *M. larici-populina* was one of the first rust genomes to be sequenced by an international research consortium (Duplessis et al., 2011a). *In silico* genome annotation and secretome prediction have been instrumental in unraveling *M. larici-populina* SPs. Among the 16,399 predicted protein-coding genes reported in the poplar rust genome, 13.3% are predicted SPs (2168 SPs) of which 89.3% have unknown functions (**Figure 1A**). Other secreted proteins correspond to carbohydrate active enzymes (5.8%), lipases (2.3%), proteases (0.8%), and other functions (1.8%) (**Figure 1A**). Extensive genomic and transcriptomic studies have also positioned *M. larici-populina* as a model tree pathogen for molecular investigations (Hacquard et al., 2010, 2012, 2013; Joly et al., 2010; Duplessis et al., 2011a,b; Petre et al., 2012; Pernaci et al., 2014; Persoons et al., 2014).

### **From** *M. larici-populina* **SPs to CSEPs: Post-genomic Strategies**

Two independent studies have defined different pipelines to pinpoint priority poplar rust CSEPs from catalogs of predicted SPs (**Figure 2**; Hacquard et al., 2012; Saunders et al., 2012). In both studies, the *M. larici-populina* secretome was predicted using the same prediction tools. SignalP2.1 was used to sort SPs from the proteome, TargetP1.1 to identify proteins likely retained inside fungal cells (e.g., in mitochondria; Emanuelsson et al., 2000) and TMHMM to exclude proteins carrying transmembrane α-helix domains (Moller et al., 2001; **Figure 2**). Considering that rust fungal genomes exhibit expanded lineage-specific multigene families compared to other Basidiomycetes, one study used the similarity-based Markov clustering TribeMCL program to group SPs in tribes to further investigate multigene families in *M. larici-populina* and *P. graminis* f. sp. *tritici* (Enright et al., 2002; Saunders et al., 2012). The second study also utilized TribeMCL clustering but added a second level of annotation with expert curation of *M. larici-populina* SP genes. This led to the definition of SP gene families (Duplessis et al., 2011a; Hacquard et al., 2012). It is worth noting that these studies used different parameters and fungal species to perform the TribeMCL clustering. As a result, the initial repertoire of *M. larici-populina* SPs differed between the two studies (**Figure 2**).
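The filtering logic shared by both pipelines can be sketched as follows. This is a minimal illustration, assuming the three predictors have already been run and their outputs parsed. The `ProteinPrediction` record, its field names, and the location label `"M"` are hypothetical and do not reproduce the actual output formats of SignalP, TargetP, or TMHMM.

```python
from dataclasses import dataclass


@dataclass
class ProteinPrediction:
    """Parsed predictor results for one protein (hypothetical layout)."""
    protein_id: str
    has_signal_peptide: bool  # signal peptide predictor call
    predicted_location: str   # assumed label: "M" means mitochondrial
    tm_helices: int           # number of predicted transmembrane helices


def predict_secretome(predictions):
    """Keep proteins with a signal peptide, not predicted to stay in the
    mitochondria, and without predicted transmembrane helices."""
    return [
        p.protein_id
        for p in predictions
        if p.has_signal_peptide
        and p.predicted_location != "M"
        and p.tm_helices == 0
    ]


# Illustrative records only; these are not real gene models.
example = [
    ProteinPrediction("prot_1", True, "S", 0),   # kept
    ProteinPrediction("prot_2", True, "M", 0),   # mitochondrial, dropped
    ProteinPrediction("prot_3", True, "S", 2),   # membrane protein, dropped
    ProteinPrediction("prot_4", False, "_", 0),  # no signal peptide, dropped
]
secretome = predict_secretome(example)
```

Because each filter is a simple boolean test, differences between published SP repertoires are more likely to come from tool versions, thresholds, and the underlying gene models than from the filtering step itself.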

To handle the wide range of predicted SPs in the poplar rust fungus, and considering the high divergence and absence of conserved motifs in rust *bona fide* effectors, both effector-mining pipelines focused on *a priori* features of plant pathogen effectors (Hacquard et al., 2012; Saunders et al., 2012). Criteria such as expression during infection or in purified haustoria were applied to prioritize candidates. Moreover, the authors utilized other features such as the size of proteins, the content in cysteine residues, the presence of selection signatures (i.e., genes evolving under the pressure of host resistances), homology to known rust fungi effectors and/or previously reported haustorially expressed secreted proteins (HESPs), as well as organization in gene families taking into account specificity at a given taxonomical level (species, genus, family, order; for review, see Petre et al., 2014).

Effector proteins are often described as small proteins (Stergiopoulos and de Wit, 2009; Tyler and Rouxel, 2012). Based on this observation, an arbitrary cut-off can be applied in CSEP-mining studies to focus only on small secreted proteins (SSPs), although large effector proteins have also been reported (Petre et al., 2014; Lo Presti et al., 2015). The *Melampsora Genome Consortium* performed a manual curation of *M. larici-populina* SSP gene families (i.e., *<*300 amino acids) taking advantage of expressed sequence tags (ESTs) from haustoria and rust-infected poplar leaves (Joly et al., 2010; Duplessis et al., 2011a; **Figure 2**). Dedicated expert annotation led to the elimination and the addition of several genes encoding SSPs, and notably generated 170 SSPs that had not previously been predicted by automatic annotation (Duplessis et al., 2011a). Manually annotated SSPs were specifically enriched in cysteine residues (on average 2.8% cysteines per protein compared to 1.6% in the whole proteome; **Figure 1B**). SSPs with unknown functions were also clearly enriched in cysteine residues compared with annotated SPs (**Figure 1B**). The proportion of cysteine residues in effectors can indicate the presence of intra-molecular disulfide bridges that could contribute to stabilizing protein structure in inhospitable apoplastic environments (Stergiopoulos et al., 2013). For instance, the *Cladosporium fulvum* Avr2 fungal effector is a cysteine-rich protein, playing a key role in apoplastic protease inhibition during the interaction with tomato leaf cells (Rooney et al., 2005). However, we cannot dismiss the possibility that cysteine residues also play an important role in the structure of the proteins at their final destination in the host cell. Detailed analysis of SSP gene families taking into account intron/exon organization and cysteine codon positions has revealed certain conserved cysteine patterns (Hacquard et al., 2012).
For instance, the largest SSP family composed of 111 members shares the conserved YxC//CxxY//YxC cysteine pattern (Duplessis et al., 2011a; Hacquard et al., 2012). This pattern is reminiscent of the Y/F/WxC motif reported in powdery mildew and wheat rust fungi (Godfrey et al., 2010). Interestingly, other obligate biotrophs also exhibit large repertoires of SPs, such as the white rust oomycete *Albugo laibachii* in which CHXC and CXHC cysteine-rich motifs are found (Kemen et al., 2011). Both types of patterns were speculated to be potentially involved in delivery of effector into host cells.
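Cysteine content and the Y/F/WxC motif are simple to compute from a protein sequence. The sketch below is illustrative only: the function names are our own, the toy sequences are invented, and the regular expression encodes just the three-residue Y/F/WxC pattern, not the longer family-specific patterns described above.

```python
import re

# Zero-width lookahead so that overlapping occurrences are all reported.
YFW_X_C = re.compile(r"(?=[YFW].C)")


def cysteine_percent(sequence):
    """Percentage of cysteine residues in a protein sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return 100.0 * sequence.count("C") / len(sequence)


def find_yfw_x_c(sequence):
    """0-based start positions of every Y/F/WxC occurrence."""
    return [m.start() for m in YFW_X_C.finditer(sequence.upper())]


# Toy sequences, not real effectors.
percent = cysteine_percent("MCCA")
positions = find_yfw_x_c("MYACAAFGC")
overlapping = find_yfw_x_c("YYCC")
```

A short motif such as this one will also match many unrelated proteins by chance, so positional constraints within a family matter more than the mere presence of a match.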

In the study conducted by Hacquard et al. (2012), numerous effector features were considered to facilitate selection of the most promising CSEPs: specific expression in infected host tissues, unknown function, homology to known rust effectors and HESPs, specificity to the Pucciniales order and signatures of positive selection (**Figure 2**). Effector-mining studies often use evidence of expression during host interaction as a filter to identify critical CSEPs (Lo Presti et al., 2015). Time-course analysis of poplar leaf infection by *M. larici-populina* has revealed dynamic patterns of SSP expression during the early stages of infection, biotrophic growth and sporulation (Duplessis et al., 2011b; Hacquard et al., 2012). Expression in resting and germinating spores can be used to differentiate SSP genes specifically expressed *in planta*. The extensive knowledge of *M. larici-populina* SSP expression at different stages of the life cycle is also critical to pinpoint CSEPs (for detailed reviews, see Duplessis et al., 2012 and Duplessis et al., 2014b). The study performed by Hacquard et al. (2012) did not result in a defined list of CSEPs that might be prioritized for future investigation, but it did provide a comprehensive depiction of the complete repertoire of *M. larici-populina* SSP genes.

Manual curation of large fungal genomes such as those of rust fungi remains a time-consuming process, and automated pipelines could help to foster CSEP detection. The pipeline built by Saunders and collaborators was initially designed to scrutinize the secretomes of *M. larici-populina* and *P. graminis* f. sp. *tritici* (Saunders et al., 2012). It has subsequently been applied to mine the genomes of different fungi interacting with plants, including the wheat rust *P. striiformis* f. sp. *tritici* and the flax rust *M. lini* (Cantu et al., 2013; Lin et al., 2014; Nemri et al., 2014). This *in silico* pipeline computes Markov clustering generated tribes taken from available genome annotations (Saunders et al., 2012). This pipeline considers three levels of information for SPs: functional annotation, detection of novel effector motifs and annotation of effector features (**Figure 2**). Most effectors do not have PFAM domains (Kamoun, 2007; Dodds et al., 2009; Stergiopoulos and de Wit, 2009). The functional annotation step allows the selection of SPs with no conserved protein domain families (PFAM), with the exception of avirulence proteins that may have such domains. For instance, the Chitin Binding Module-like of Avr4 and the LysM domain of Ecp6 in *C. fulvum* both have PFAM annotations. In total, five PFAM domains were found in rust fungi and were considered for their obvious connection with pathogenicity (Saunders et al., 2012). In a second step, the MEME tool was applied to detect *de novo* conserved patterns in rust SPs (Bailey et al., 2009). Among identified motifs, five motifs containing one or two conserved cysteine residues with high positional constraints in SP tribes were highlighted (**Figure 2**; Saunders et al., 2012). Interestingly, some motifs, such as motif 06 (YxCxYxxCxW), were also identified by the manual annotation of *M. larici-populina* SSP families (Duplessis et al., 2011a; Hacquard et al., 2012).
In a final step, common effector features were examined in detail, which included induction of expression during host infection, gauging similarity to haustorial ESTs, and determining small protein size (*<*150 amino acids), content in cysteine residues and known effector motifs or repeats in protein. It has been reported that some effectors contain nuclear localization signals (NLS), suggestive of a potential nuclear localization in host cells (Kanneganti et al., 2007; Shen et al., 2007; Schornack et al., 2010; Liu et al., 2011). The presence of such NLS was also added to the features tested. It was shown for different filamentous plant pathogens such as the fungus *Leptosphaeria maculans* and the oomycete *Phytophthora infestans* that effector genes reside in gene-scarce regions marked by the presence of repeat elements such as transposable elements (Haas et al., 2009; Rouxel et al., 2011; for a review see Raffaele and Kamoun, 2012). This criterion was also taken into account in the pipeline by looking at the presence of long intergenic regions around SP genes. However, no significant link could be established between SSP genes and repeat-rich regions in the genomes of *M. larici-populina* and *P. graminis* f. sp. *tritici* (Duplessis et al., 2011a). Nonetheless, all these filters are informative and can be computed in a complex matrix comparing rust tribes in order to rank CSEPs (**Figure 2**; Saunders et al., 2012).

## **A Priori Criteria to Prioritize CSEPs**

While it may be possible to infer typical features from effector proteins, a given effector will rarely exhibit a combination of all features at the same time. Conversely, each feature will display a diverse distribution among SP families. Therefore, hierarchical clustering can be performed for ranking tribes with the highest probability of containing CSEPs (Saunders et al., 2012). When using this type of clustering approach, the weight of each criterion requires adjustment. By doing this, Saunders and collaborators were able to derive four clusters with the most promising SP tribes that could potentially correspond to CSEPs for further investigation. The largest tribe, with 92 members in one of these clusters, is specific to *M. larici-populina* and contains a large proportion of secreted proteins (73% with predicted signal peptide; Saunders et al., 2012). This tribe corresponds to the largest poplar rust SSP family (with 111 members) as reported by Duplessis et al. (2011a). This SSP family is marked by the presence of highly conserved cysteine patterns, which both studies highlighted. The difference in numbers of SSP members likely corresponds to the different levels of gene annotation considered for the *M. larici-populina* genome. The two studies identified tribes composed of SPs/SSPs and proteins without predicted signal peptides. In some tribes, SPs exhibited homology to HESPs, as well as known rust effector proteins (e.g., *M. lini* AvrM). It could be speculated that such proteins are involved in haustoria functions (Saunders et al., 2012). It could also reflect the evolution of these families with a gain or a loss of signal peptide toward the generation of new putative effector functions.
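To illustrate why the weight given to each criterion matters, the sketch below ranks tribes by a weighted sum of per-tribe feature scores. This is a deliberately simplified stand-in and not the hierarchical clustering procedure of Saunders et al. (2012). The tribe names, feature names, scores, and weights are all invented for illustration.

```python
def rank_tribes(tribe_features, weights):
    """Return (tribe, score) pairs sorted from highest to lowest score.

    tribe_features maps a tribe name to {feature: value in [0, 1]},
    e.g. the fraction of members showing that feature.
    weights maps a feature name to its relative importance.
    Features without a weight are ignored.
    """
    scores = {
        tribe: sum(weights[name] * value
                   for name, value in features.items()
                   if name in weights)
        for tribe, features in tribe_features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical input values.
tribes = {
    "tribe_A": {"secreted": 0.9, "induced_in_planta": 0.5},
    "tribe_B": {"secreted": 0.2, "induced_in_planta": 0.9},
}
equal_weights = {"secreted": 1.0, "induced_in_planta": 1.0}
expression_first = {"secreted": 1.0, "induced_in_planta": 2.0}

ranking_equal = rank_tribes(tribes, equal_weights)
ranking_expression = rank_tribes(tribes, expression_first)
```

With equal weights the first tribe ranks highest, whereas doubling the weight on expression reverses the order, which is the kind of sensitivity to subjective choices discussed in this section.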

In a recent study, an effectoromic pipeline identified priority *M. larici-populina* CSEPs for expression in *Nicotiana benthamiana* as a heterologous system to study their localization in plant subcellular compartments and to identify potential plant interactors (Petre et al., 2015a). The pipeline was applied to the catalog of SSPs previously reported by Hacquard et al. (2012). Priority CSEPs were selected by giving a stronger weight to some of the typical criteria used in the two studies reported above. For instance, expression in haustoria, specific induction of expression during poplar infection, specificity to the Pucciniales and proteins of unknown functions were the most important features considered. These criteria were similar to those systematically applied for tribe ranking in Saunders et al. (2012). Redundant family members were also removed in order to focus on orphan and lineage-specific CSEPs, considering that pathogenicity mechanisms imply highly specific functions. Such stringent criteria led to a subset of 24 priority CSEPs from 1184 initial *M. larici-populina* SSPs (Petre et al., 2015a). Among these, only three belong to priority clusters identified in the study by Saunders et al. (2012), in which an initial postulate was to focus on tribes and not on orphan genes. One supposed orphan SSP (CSEP 107772) finally proved to be a member of a *M. larici-populina* multigene family (Petre et al., 2015b). This particular example illustrates how automatic and dedicated pipelines strongly depend on the accuracy of genome annotation and parameters applied to gene family analysis tools, above any further selection criteria. Among the 24 selected CSEPs, 20 could be expressed in fusion with GFP in *N. benthamiana*, which further allows identification of specific localization in plant cell compartments as well as potential plant interactors through coimmunoprecipitation and mass spectrometry. This study identified six *M. larici-populina* CSEPs with a specific localization pattern (i.e., nucleus, nucleolus, chloroplast, mitochondria, and cytosolic bodies) and five were specifically associated with plant proteins representing potential interactors (Petre et al., 2015a). The development of *in planta* assays in heterologous systems has provided the first step toward effector characterization for various pathogens (Bozkurt et al., 2011; Caillaud et al., 2012, 2013; Petre et al., 2015a,b). Another alternative already proven successful for different obligate biotrophic fungi, including rust fungi, is host-induced gene silencing (Panwar et al., 2013; Spanu, 2015). Knowledge of effector structure is useful to understand effector/plant protein interactions and to find structure homology within effectors (Maqbool et al., 2015). *In vitro* structural biology of effectors is also an option of choice to further determine the role of effectors in rust fungi, pending fruitful production of recombinant CSEPs.

Pucciniales genomes are marked by the presence of very large catalogs of SPs. Among these only a fraction may be *bona fide* effectors. Effector-mining pipelines, while still imperfect, are crucial for highlighting the most promising CSEPs in these fungal species. No robust ubiquitous effector motifs have been found in rust fungal effectors, contrary to the RXLR motif of oomycete effectors (Petre and Kamoun, 2014; Sperschneider et al., 2015). As illustrated here, effector mining in the poplar rust fungus relies both on the quality of input data (i.e., gene annotation and gene family analysis) and on several qualitative and subjective criteria. Indeed, there is no absolute rule for determining rank and weight among the many effector features that may be considered. As illustrated here, knowledge about SP gene expression is a key criterion for selecting priority CSEPs. Considering the complex life cycle of heteroecious rust fungi like *M. larici-populina* (i.e., alternation on two different hosts and production of five different spores), insight into gene expression patterns throughout the entire life cycle can help to drastically reduce the overall catalog of CSEPs. Pucciniales consist of more than 8000 species (Aime et al., 2014). To date, fewer than 10 genomes have been published or sequenced with partial data available (see Duplessis et al., 2014b for review). All rust fungal genomics and transcriptomics reports have shown that these species contain a high content of Pucciniales- or species-specific genes. There is a strong probability that rust fungi possess highly specific effectors and increasing the amount of genomic data will surely help to focus efforts toward CSEP identification. The future development of dedicated bioinformatics tools for predicting fungal effectors is still highly anticipated (Sperschneider et al., 2015).

### **AUTHOR CONTRIBUTIONS**

All the authors wrote and revised the manuscript.

### **ACKNOWLEDGMENTS**

The authors warmly thank Emmanuelle Morin (INRA Nancy) for her continuous support with *M. larici-populina* genomics and Aimee Orsini for English language editing. Cécile Lorrain is supported by INRA, in the framework of a Contrat Jeune Scientifique and by the Labex ARBRE. SD acknowledges the support of the French ANR for the young scientist grant POPRUST (ANR-2010-JCJC-1709-01) and of the Région Lorraine. All authors benefit from the support of the ANR in the frame of a grant part of the "Investissements d'Avenir" program (ANR-11-LABX-0002-01, Lab of Excellence ARBRE).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Lorrain, Hecker and Duplessis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Functional redundancy of necrotrophic effectors – consequences for exploitation for breeding

*Kar-Chun Tan<sup>1</sup>\*†, Huyen T. T. Phan<sup>1</sup>†, Kasia Rybak<sup>1</sup>, Evan John<sup>1</sup>, Yit H. Chooi<sup>2,3</sup>, Peter S. Solomon<sup>2</sup> and Richard P. Oliver<sup>1</sup>\**

*<sup>1</sup> Centre for Crop Disease Management, Department of Environment and Agriculture, Curtin University, Bentley, WA, Australia, <sup>2</sup> Plant Sciences Division, Research School of Biology, Australian National University, Canberra, ACT, Australia, <sup>3</sup> School of Chemistry and Biochemistry, University of Western Australia, Perth, WA, Australia*

*Edited by: Pietro Daniele Spanu, Imperial College London, UK*

#### *Reviewed by:*

*Guus Bakkeren, Agriculture and Agri-Food Canada, Canada Mahmut Tör, University of Worcester, UK*

#### *\*Correspondence:*

*Richard P. Oliver and Kar-Chun Tan, Centre for Crop Disease Management, Department of Environment and Agriculture, Curtin University, Bentley, WA 6102, Australia richard.oliver@curtin.edu.au; kar-chun.tan@curtin.edu.au*

*†These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 22 April 2015 Accepted: 22 June 2015 Published: 08 July 2015*

#### *Citation:*

*Tan K-C, Phan HTT, Rybak K, John E, Chooi YH, Solomon PS and Oliver RP (2015) Functional redundancy of necrotrophic effectors – consequences for exploitation for breeding. Front. Plant Sci. 6:501. doi: 10.3389/fpls.2015.00501*

Necrotrophic diseases of wheat cause major losses in most wheat growing areas of the world. Tan spot (caused by *Pyrenophora tritici-repentis*) and septoria nodorum blotch (SNB; *Parastagonospora nodorum*) have been shown to reduce yields by 10–20% across entire agri-ecological zones despite the application of fungicides and a heavy focus over the last 30 years on resistance breeding. Efforts by breeders to improve the resistance of cultivars have been compromised by the universal finding that resistance was quantitative and governed by multiple quantitative trait loci (QTL). Most QTL had a limited effect that was hard to measure precisely and varied significantly from site to site and season to season. The discovery of necrotrophic effectors has given breeding for disease resistance new methods and tools. In the case of tan spot in West Australia, a single effector, PtrToxA, and its recogniser gene *Tsn1* have a dominating impact on disease resistance. The delivery of ToxA to breeders has had a major impact on cultivar choice and breeding strategies. For *P. nodorum,* three effectors – SnToxA, SnTox1, and SnTox3 – have been well characterized. Unlike tan spot, no one effector has a dominating role. Genetic analysis of various mapping populations and pathogen isolates has shown that different effectors have varying impact and that epistatic interactions also occur. As a result of these factors the deployment of these effectors for SNB resistance breeding is more complex. We have deleted the three effectors in a strain of *P. nodorum* and measured effector activity and disease potential of the triple knockout mutant.
The culture filtrate causes necrosis in several cultivars and the strain causes disease, albeit the overall levels are less than in the wild type. Modeling of the field disease resistance scores of cultivars from their reactions to the microbially expressed effectors SnToxA, SnTox1, and SnTox3 is significantly improved by including the response to the triple knockout mutant culture filtrate. This indicates that one or more further effectors are secreted into the culture filtrate. We conclude that the *in vitro*-secreted necrotrophic effectors explain a very large part of the disease response of wheat germplasm and that this method of resistance breeding promises to further reduce the impact of these globally significant diseases.

Keywords: septoria, nodorum, stagonospora, phaeosphaeria, necrotrophic fungi, effectors

### Introduction

Effectors are defined as molecules that are produced by microbial pathogens that interact with specific "recognition" gene products in the plant host so as to affect the outcome of the contact – disease susceptibility or resistance (Tyler and Rouxel, 2012; Rovenich et al., 2014; Vleeshouwers and Oliver, 2014). Pathogen species are believed to harbor up to a few hundred such effectors and plant host species contain a much larger number of recognition genes to cope with the plethora of pathogens with which they will come into contact.

Plant pathogens are conventionally described as either biotrophic, in which the infected host tissue remains alive or necrotrophic, where host tissues are killed. In typical biotrophic interactions, recognition of an effector by a specific plant recognition gene (normally termed an *R*-gene) leads to a defense response and elimination of the pathogen. Recognition of just one such effector is sufficient to render the others redundant (Stotz et al., 2014). Such functional dominance is characteristic of interactions of biotrophic pathogens and accounts for the marked differential between resistance and susceptibility in these diseases.

Necrotrophic pathogens also produce effectors that induce a defense-like response upon recognition by an R-gene-like recogniser protein (Lorang et al., 2007; Faris et al., 2010) but are distinguished by their ability to survive in the affected plant tissue and go on to proliferate and sporulate. Indeed, necrotrophic disease is clearly promoted by the recognition of effectors (Oliver and Solomon, 2010).

Like biotrophic pathogens, necrotrophs also produce many effectors. The question we address here is how the multiple effector/recogniser interactions cooperate to cause disease. Necrotrophic diseases are typically quantitative in nature; the null hypothesis is that each effector/recogniser interaction operating in a given situation acts additively to produce the degree of necrosis and that this directly translates into a corresponding level of disease. Depending on the context, the level of disease can be defined in terms of either loss of yield or quantity of pathogen sporulation.

This question has practical importance as effector recognition has been adopted by breeders as a partial proxy for field resistance testing (Vleeshouwers and Oliver, 2014). Can the disease resistance of new cultivars be accurately predicted from the response to effectors of the input germplasm?

*Parastagonospora* (syn. ana, *Stagonospora*; teleo, *Phaeosphaeria*) *nodorum* (Berk.; Quaedvlieg, Verkley, and Crous) is the causal agent of Septoria nodorum blotch (SNB) on wheat (Solomon et al., 2006a; Quaedvlieg et al., 2013) and is responsible for significant yield losses in some areas of the world (Murray and Brennan, 2009; Oliver et al., 2012). Losses in the West Australian wheat belt amount to greater than AUD\$100 m pa. Breeding for disease resistance has been a priority but has been hampered by the quantitative nature of the interaction. Wheat genetic analysis using infection assays as the phenotype has revealed a multitude of quantitative trait loci (QTL), and efforts to define molecular markers acceptable to breeders have proved frustrating (Oliver et al., 2012).

The discovery of multiple necrotrophic effectors has provided a clear framework to dissect the disease and has provided breeders with much needed tools (Friesen et al., 2006, 2008, 2009, 2012; Chu et al., 2010; Faris et al., 2011; Waters et al., 2011; Abeysekara et al., 2012; Crook et al., 2012; Liu et al., 2012; Oliver et al., 2012; Tan et al., 2012; McDonald et al., 2013). Our working hypothesis is that the disease can be explained by the interaction of effectors (which all appear to be small proteins secreted into culture media) and their corresponding recognition genes. Genetic analysis of the response to purified effectors has identified several wheat genetic loci that correspond to regions conferring susceptibility to the disease.

Thus far, three necrotrophic effector genes have been cloned from *Parastagonospora nodorum*. These are *SnToxA* (for which the recognition gene *Tsn1* has been cloned; Liu et al., 2006; Faris et al., 2010), *SnTox3* (Liu et al., 2009), and *SnTox1* (Liu et al., 2012). The corresponding recognition genes *Snn3* and *Snn1* have been mapped but not yet cloned. Furthermore, it is clear that several other effectors operate in this pathosystem (Friesen et al., 2012; Tan et al., 2014; Gao et al., 2015).

We have previously examined the degree of correlation between the effector sensitivities of current West Australian cultivars and their reported field resistance (Tan et al., 2014). Unlike the tan spot system for which there is a clear dominance of one effector/recogniser interaction (ToxA/*Tsn1*; Moffat et al., 2014), no single effector had a similarly dominating role in SNB. One consideration was that all the current cultivars were sensitive to at least one of the three effectors SnToxA, SnTox1, or SnTox3.

Our overall goal is to model the disease susceptibility of wheat cultivars from knowledge of their effector sensitivities. We seek to understand the relative importance of each known effector, formulate strategies to identify novel effectors and their corresponding host recogniser genes/QTLs, and determine how they interact in the SNB interaction.

In this study, we have constructed a *P. nodorum* strain deleted in *SnToxA*, *SnTox1,* and *SnTox3*. This approach allows us to detect further secreted effectors and determine how recognition corresponds to disease expression without the interference of *SnToxA*, *SnTox1,* and *SnTox3*. We demonstrate that the removal of the three effectors reduced but did not entirely eliminate pathogenicity. We conclude that the secreted necrotrophic effector-recogniser model remains sufficient to explain the disease and provides a useful framework for cultivar resistance breeding.

### Materials and Methods

### Wheat Cultivars

All wheat cultivars used in this study were obtained from the Australian Winter Cereal Collection (Tamworth, NSW, Australia) and grown in vermiculite in a growth chamber at 21°C with a 12 h light/dark cycle for 2 weeks prior to infection or infiltration. Current SNB disease resistance ratings (DRR) of commercial cultivars were obtained from the Department of Agriculture and Food Western Australia (DAFWA; Shackley et al., 2013). For statistical purposes, a numerical scoring system was assigned to all DRR categories: (1) very susceptible; (2) susceptible–very susceptible; (3) susceptible; (4) moderately susceptible–susceptible; (5) moderately susceptible; (6) moderately resistant–moderately susceptible. Note that no cultivars are scored in categories 7 to 10.

### *SnToxA, SnTox1,* and *SnTox3* Triple Gene Deletions in *P. nodorum*

*Parastagonospora nodorum* SN15 strains deleted in *ToxA*, *SnTox1,* and *SnTox3* (*toxa13*) were created through sequential transformations using homologous-gene knockout vectors that were generated from fusion PCR (Solomon et al., 2006b) and Gibson assembly (Gibson et al., 2009; **Table 1**). All PCR amplifications were performed with Phusion *Taq* DNA polymerase (New England Biolabs, Ipswich, MA, USA). The *SnTox3* deletion construct harboring a phleomycin resistance cassette described in Tan et al. (2014) was transformed into SN15 *tox18* carrying a *SnToxA* deletion to facilitate gene knockout. PCR was used to identify the appropriate mutants deleted in *SnTox3*. A robust quantitative PCR was used to determine the integration copy number of *SnTox3* deletion constructs in all transformants (Solomon et al., 2008). The mutant *toxa3*-10 carrying a single copy *SnTox3* deletion vector insertion was subsequently selected for *SnTox1* deletion (**Supplementary Figure S1**). The *SnTox1* deletion vector was constructed using the Gibson assembly mix (New England Biolabs, Ipswich, MA, USA). The 5′ and 3′ UTR regions of *SnTox1* were PCR amplified from genomic DNA using 5\_Tox1Fbar, 5\_Tox1Rbar, 3\_Tox1Fbar, and 3\_Tox1Rbar. Both flanking regions were simultaneously fused to the *Bar* gene (phosphinothricin acetyl transferase) derived from pBARKS1 (obtained from Fungal Genetics Stock Center) and pGEMT-Easy for propagation (**Table 1**; **Figure 1**). The resulting gene knockout vector was PCR-amplified for transformation to facilitate gene knockout according to Solomon et al. (2004) with modifications. Glufosinate was extracted from commercial herbicide Basta containing 200 g.l<sup>−1</sup> glufosinate ammonium (Bayer Cropscience, Monheim, Germany) using chloroform as described previously (Nayak et al., 2006). An equal volume of chloroform was mixed vigorously with the Basta herbicide and centrifuged at 6,000 *g* for 30 min. The upper aqueous layer containing glufosinate was retained. *Bar* transformants were selected on minimal medium containing 13 mM NH4Cl as the sole nitrogen source and 8 µl.ml<sup>−1</sup> of extracted glufosinate. PCR was used to identify the appropriate mutants deleted in *SnTox1*. Quantitative PCR was used to determine the integration copy number of *SnTox1* deletion constructs in all transformants according to Solomon et al. (2008) with modifications. 5\_Tox1qPCRF (5′-CGTAAAGAGCCGAAGATATGCC-3′) and 5\_Tox1qPCRR (5′-ATAGCCCAACAGATAGGCCC-3′) were used to amplify 123 bp of the 5′ UTR homologous region immediately adjacent to the *Bar* marker of the KO cassette. Wildtype *SnTox1* was used as a standard control for copy number determination of the *SnTox1* knockout cassette in the mutants. All fungal strains used in this study are described in **Table 2**.

*The bold text refers to overlapping sequences required to facilitate appropriate recombination of DNA fragments during Gibson assembly.*

### Production of Necrotrophic Effectors and Infiltration

Necrotrophic effectors were produced from growth in Fries 3 medium broth as described in Liu et al. (2004). Culture filtrate containing effectors was sequentially filtered through gauze, miracloth, Whatman paper, and 0.22 µm sterilizers prior to infiltration with a needleless 1 cc syringe. The necrosis reaction was scored at 10 days according to a visual score of 0 to 3 as previously described (Waters et al., 2011). A score of 0 indicates insensitivity (no reaction); 1, slight chlorosis; 2, extensive chlorosis; and 3, necrosis. Varieties that scored 1 were considered weakly sensitive, whereas those that scored 2 or 3 were considered highly sensitive to the effector preparation. Infiltration assays were performed on the first leaf of 2-week old seedlings.
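The scoring scheme above maps directly onto a small lookup, sketched here for readers who wish to tabulate infiltration results. The function name and the returned labels are our own shorthand for the categories defined in the text.

```python
def classify_sensitivity(score):
    """Translate a visual infiltration score (0 to 3) into a category."""
    if score not in (0, 1, 2, 3):
        raise ValueError("score must be 0, 1, 2, or 3")
    if score == 0:
        return "insensitive"
    if score == 1:
        return "weakly sensitive"
    return "highly sensitive"  # scores 2 (extensive chlorosis) and 3 (necrosis)


categories = [classify_sensitivity(s) for s in (0, 1, 2, 3)]
```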

### Whole Plant Infection Assay

Whole plant infection assays were performed on 2-week-old wheat seedlings as described (Solomon et al., 2005). Briefly, 2-week-old wheat seedlings were sprayed with 1 × 10<sup>6</sup> pycnidiospores suspended in 0.5% w/v gelatin using an air brush system. To facilitate the infection process, all seedlings were covered for 2 days to increase humidity. After this, plants were uncovered and the infection process was allowed to continue for an additional 5 days prior to the assessment of disease symptoms. A score of 1 indicates that no disease symptoms were observed and a score of 9 indicates a fully necrotised plant.

### Statistical Analyses

Statistical analysis was performed using JMP 10.0.0 (SAS Institute, Cary, NC, USA). A 2 × 2 Pearson's chi-square test was used to test effector sensitivity datasets and SNB DRR for evidence of correlation (Tan et al., 2014). Chi-square analyses on individual effector sensitivity scores versus SNB DRR classes resulted in expected values of less than one. As such, classes were combined to overcome this problem in a 2 × 2 Pearson's chi-square test (Tan et al., 2014). This approach enables a

TABLE 2 | *Parastagonospora nodorum* strains used in this study.


significant association to be demonstrated between SNB DRR and effector insensitivity. Crude culture filtrate sensitivity scores of 2 and 3 were pooled and scores of 0 and 1 were similarly pooled. As previously described in Tan et al. (2014), wheat cultivars that carry SNB DRR of 5 and 6 were pooled separately from scores 1 to 4. Cultivars with mixed effector sensitivity were treated as missing values by the statistical software.
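The pooling and test described above can be sketched as follows. This is a minimal illustration in plain Python, not the JMP procedure used in the study. The cultivar data are made up, the function names are hypothetical, and no continuity correction is applied. For one degree of freedom, the chi-square tail probability equals `erfc(sqrt(chi2 / 2))`.

```python
import math


def build_pooled_table(records):
    """Build a 2 x 2 table from (sensitivity_score, drr) pairs.

    Rows: scores 0-1 pooled, scores 2-3 pooled.
    Columns: DRR 1-4 pooled, DRR 5-6 pooled.
    Records with a missing value (None) are skipped.
    """
    table = [[0, 0], [0, 0]]
    for score, drr in records:
        if score is None or drr is None:
            continue  # e.g. cultivars with mixed sensitivity
        row = 1 if score >= 2 else 0
        col = 1 if drr >= 5 else 0
        table[row][col] += 1
    return table


def pearson_chi_square_2x2(table):
    """Return (chi2, p) for a 2 x 2 table, without continuity correction."""
    row_totals = [sum(r) for r in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("an expected count is zero; test undefined")
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 d.f.
    return chi2, p_value


# Made-up example data: (culture filtrate score, SNB DRR) per cultivar.
example = [(3, 5), (2, 6), (0, 2), (1, 3), (None, 4), (2, 3), (0, 5)]
table = build_pooled_table(example)
print(table, pearson_chi_square_2x2(table))
```

Pooling reduces each variable to two classes, which raises the expected counts per cell relative to the full score-by-rating table.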

### Results

### Deletion of SnToxA, SnTox1, and SnTox3 in *P. nodorum* SN15

SN15 is an aggressive *P. nodorum* wildtype isolate that carries *SnToxA*, *SnTox3,* and *SnTox1* (Hane et al., 2007; Syme et al., 2013). To develop a strain deleted for all three genes, we sequentially removed *SnTox3* and *SnTox1* from *P. nodorum tox18,* a SN15 transformant deleted in *SnToxA* (Friesen et al., 2006). Using the previously described *SnTox3* knockout vector (Tan et al., 2014), we were able to generate 14 phleomycin resistant transformants, of which two carried the desired *SnTox3* deletion as determined by PCR. This represents 14.3% homologous recombination efficiency. We then selected *toxa3*-10 for *SnTox1* deletion. The transformation yielded 27 glufosinate-resistant transformants, of which five contained the desired *SnTox1* deletion. This represents 18.5% homologous recombination efficiency. We then selected four *SnTox1* knockout strains from the *toxa3-10* background to analyse for insert copy number (**Supplementary Figure S2**). All strains possess a single integration of the *SnTox1* knockout*-Bar* cassette at the *SnTox1* locus. From here, *toxa13-6* was selected for subsequent analyses.
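The two efficiencies quoted above follow directly from the transformant counts. As a quick arithmetic check (the function name is hypothetical):

```python
def recombination_efficiency(targeted: int, total: int) -> float:
    """Percentage of transformants carrying the desired targeted deletion."""
    if total <= 0:
        raise ValueError("total number of transformants must be positive")
    return 100.0 * targeted / total


# Counts reported in the text for the SnTox3 and SnTox1 deletions.
print(round(recombination_efficiency(2, 14), 1))
print(round(recombination_efficiency(5, 27), 1))
```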

### Characterization of *P. nodorum toxa13-6*

*Parastagonospora nodorum toxa13-6* was tested for its ability to produce SnToxA, SnTox1, and SnTox3 *in vitro*. Culture filtrate of the mutant was infiltrated into wheat cultivars BG261 (*Tsn1*, *snn1,* and *snn3*), Chinese Spring (*tsn1*, *Snn1,* and *snn3*), and BG220 (*tsn1*, *snn1,* and *Snn3*). Necrotic/chlorotic symptoms that are associated with compatible effector responses were not observed, confirming the deletion of the genes (data not shown).

The activity of the *toxa13-6* culture filtrate was assessed on 46 Australian commercial wheat cultivars (**Figure 2**). Ten cultivars were highly sensitive to the culture filtrate, showing significant chlorosis and necrosis, whereas nine were mildly sensitive. Cv. Magenta segregated for sensitivity to the culture filtrate.

The triple deletion mutant for *SnToxA*, *SnTox1,* and *SnTox3* evidently produces further necrosis-inducing factors. We then compared *toxa13-6* culture filtrate sensitivity and SNB DRR using

FIGURE 2 | Reactions of 46 Australian wheat cultivars to effectors [<sup>∧</sup>, from Tan et al. (2014)] and the *Parastagonospora nodorum toxa13-6* culture filtrate [#, this study]. <sup>∗</sup>SNB disease rating was obtained from Shackley et al. (2013). VS, very susceptible; S-VS, susceptible-very susceptible; S, susceptible; MS-S, moderately susceptible-susceptible; MS, moderately susceptible; MR-MS, moderately resistant-moderately susceptible. Effector sensitivity scores are described in Supplementary Table S1.

frequency counts in a 2 × 2 mosaic plot (**Figure 3A**). No significant correlation was observed between *toxa13-6* culture filtrate sensitivity and SNB DRR (*p* = 0.6508). The correlation between the combined SnToxA, SnTox1, and SnTox3 sensitivity scores and the variety DRRs was marginally above the *p* = 0.05 significance threshold (**Figure 3B**). However, when *toxa13-6* culture filtrate scores were combined with the SnToxA, SnTox1, and SnTox3 sensitivity scores of each wheat variety (**Figure 3C**), a significant correlation was observed with the SNB DRR (*p* = 0.0239). This indicates that novel necrosis-inducing factors in the *toxa13-6* culture filtrate positively contribute to the severity of SNB.

The ability of *toxa13-6* to infect wheat was assessed using a whole plant spray on selected wheat cultivars that are highly sensitive to the culture filtrate. *P. nodorum toxa13-6* infected Calingiri, Emu Rock, and Halberd similarly to the wildtype SN15 (**Figure 4**). The other four *P. nodorum toxa13* mutants produced chlorosis/necrosis-inducing culture filtrates and remained infective on Calingiri, Emu Rock, and Halberd (data not shown).

### Discussion

Classical genetic studies indicate that SNB resistance in wheat is a polygenic trait (Wicki et al., 1999; Czembor et al., 2003). Research since 2001 (reviewed in Oliver et al., 2012) has shown that the SNB interaction involves a complex interplay of fungal effectors and host dominant susceptibility genes that are necessary and sufficient to explain the polygenic and quantitative nature of the interaction. Most wheat varieties are sensitive to more than one effector and most pathogen isolates produce more than one effector.

All wheat cultivars used in this study are sensitive to one or more known effectors. This hinders the discovery of novel effectors through the use of *P. nodorum* culture filtrate infiltration. Furthermore, the presence of multiple QTLs, which could be due to multiple effector/receptor interactions, can make the study of a targeted single interaction difficult, as other interactions may introduce bias that masks its effect. Therefore, positional gene cloning may prove impossible under these circumstances. To overcome these difficulties, we developed pathogenic *P. nodorum* strains that are deleted in *SnToxA*, *SnTox1,* and *SnTox3* as a tool to discover novel effectors and SNB/sensitivity QTLs in wheat that were previously masked or unassigned. We have achieved this through the use of selectable marker genes that confer resistance to hygromycin (Solomon et al., 2004; Oliver et al., 2012) and phleomycin (Tan et al., 2008). In this study, we have implemented *Bar* as a third selectable marker for *P. nodorum* transformation. *Bar* has been adapted for use as a selectable marker in other fungal systems (Avalos et al., 1989; Nayak et al., 2006; Chooi et al., 2010). Nourseothricin and G418 were tested on *P. nodorum* SN15 as potential antibiotics for fungal transformation using their respective selectable marker genes. However, *P. nodorum* showed a high level of natural tolerance to these antibiotics, which therefore cannot be used for the development of transgene resistance.

The acquisition of the triple effector knockout strain is a tool that can be used to assess the presence of further effectors relevant to commercially important wheat cultivars. These novel effectors can then be identified using biochemical separation methods. In addition, their role in virulence will be assessed through the generation of gene deletion mutants. This approach will require the deletion of additional genes in the *P. nodorum toxa13* background. To overcome limitations in marker-based selection, a selectable marker recycling system using *Cre-loxP* recombination is being developed and adapted for functional gene analysis in *P. nodorum* (Mizutani et al., 2012). Novel effectors that are verified for their role in the establishment of SNB can be implemented as a tool in resistance breeding (Vleeshouwers and Oliver, 2014).

A broad correlation between disease severity and the additive effect of effectors was observed (**Figure 4**). Ten wheat varieties showed strong sensitivity reactions to the *toxa13-6* culture filtrate. We also demonstrated that three of these wheat cultivars are highly susceptible to SNB caused by the mutant. This clearly indicates that the major effectors are secreted and function as disease determinants (**Figure 3**). This approach will enable the selection of wheat cultivars that show differential sensitivity to the *P. nodorum toxa13* culture filtrate and the construction of wheat mapping populations to genetically identify novel QTLs that confer effector sensitivity and disease susceptibility/resistance. From here, reliable genetic markers that are closely linked with QTLs of interest will be identified and used as a tool in resistance breeding in parallel with effectors to facilitate the ultimate removal of dominant sensitivity traits in wheat (Oliver et al., 2013; Vleeshouwers and Oliver, 2014).

The response of cultivars to necrosis- and chlorosis-inducing effectors in the culture filtrate secretome can be compared to the field responses. The correlation between effector sensitivity and DRR is complex. We showed previously that there was a significant correlation with SnTox3 (Tan et al., 2014) using DRR data available at that time. Here we show that the best correlation with the current DRR is obtained when reactions to the three cloned effectors plus the culture filtrate from the triple mutants are combined. The correlation is significant (**Figure 3C**) and so this indicates that breeding by selecting germplasm that is insensitive to the three effectors and the culture filtrate remains a viable strategy but that functional redundancy exists between effectors. Purification of the effectors and their individual use should improve the correlation still further.

Nonetheless, whilst the correlation is significant, it is not a simple additive reaction. Epistatic effects have been observed, whereby SnToxA or SnTox1 sensitivity has been found to eliminate the reaction to SnTox3 (Oliver et al., 2012). Different effectors have different effector activity, different variants of effectors have different effector activity and different recogniser genes have different responses (Tan et al., 2012). This indicates that whilst elimination of effector sensitivities from breeding programs will lead ultimately to improved resistance, individual steps may have an impact that is too small to be noticeable or indeed may be zero. Conversely, the impact of elimination of effector sensitivities is predicted never to be negative. Trade-offs between resistance and susceptibility have been found in the case of some genes conferring resistance to biotrophic pathogens; such effects have not yet been found for necrotrophic effector sensitivities (Oliver et al., 2013). Whilst we need to be vigilant for cases where the elimination of effector sensitivities has a negative pleiotropic effect, the current necrotrophic effector model is both necessary and sufficient to explain all that is known about these disease interactions and to inform strategies for disease resistance.

### Acknowledgment

This study was funded by the Grains Research and Development Corporation research grant CUR00023 (Programme 3).

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00501

FIGURE S1 | Copy number of the SnTox3-phleomycin resistance gene knockout cassette normalised to a single copy of actin (Act1). The experiment was performed in biological triplicates. Standard error bars are shown.

FIGURE S2 | Copy number of the SnTox1-glufosinate resistance gene knockout cassette normalised to a single copy of actin (Act1). The experiment was performed in biological triplicates. The background strain toxa3-10 was included as a single copy control. The toxa13-2 ectopic mutant was included as a 5' UTR SnTox1 multicopy control. Standard error bars are shown.

### References


pathosystem parallels that of the wheat-tan spot system. *Genome* 49, 1265–1273. doi: 10.1139/g06-088


pathogens. *Mol. Plant Microbe Interact.* 27, 196–206. doi: 10.1094/MPMI-10-13-0313-IA


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Tan, Phan, Rybak, John, Chooi, Solomon and Oliver. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Surveying the potential of secreted antimicrobial peptides to enhance plant disease resistance

Susan Breen<sup>1</sup>, Peter S. Solomon<sup>1</sup>, Frank Bedon<sup>2,3</sup> and Delphine Vincent<sup>2</sup>\*

<sup>1</sup> Plant Sciences Division, Research School of Biology, The Australian National University, Canberra, ACT, Australia, <sup>2</sup> Department of Economic Development, AgriBio, Bundoora, VIC, Australia, <sup>3</sup> AgriBio, La Trobe University, Bundoora, VIC, Australia

Antimicrobial peptides (AMPs) are natural products found across diverse taxa as part of the innate immune system against pathogen attacks. Some AMPs are synthesized through the canonical gene expression machinery and are called ribosomal AMPs. Other AMPs are assembled by modular enzymes generating nonribosomal AMPs and harbor unusual structural diversity. Plants synthesize an array of AMPs, yet are still subject to many pathogen invasions. Crop breeding programs struggle to release new cultivars in which complete disease resistance is achieved, and usually such resistance becomes quickly overcome by the targeted pathogens which have a shorter generation time. AMPs could offer a solution by exploring not only plant-derived AMPs, related or unrelated to the crop of interest, but also non-plant AMPs produced by bacteria, fungi, oomycetes or animals. This review highlights some promising candidates within the plant kingdom and elsewhere, and offers some perspectives on how to identify and validate their bioactivities. Technological advances, particularly in mass spectrometry (MS) and nuclear magnetic resonance (NMR), have been instrumental in identifying and elucidating the structure of novel AMPs, especially nonribosomal peptides which cannot be identified through genomics approaches. The majority of non-plant AMPs showing potential for plant disease immunity are often tested using in vitro assays. The greatest challenge remains the functional validation of candidate AMPs in plants through transgenic experiments, particularly introducing nonribosomal AMPs into crops.

Keywords: transgenic plants, mass spectrometry, plant-microbe interaction, pathogen resistance, ribosomal and nonribosomal antimicrobial peptides, immunity

### INTRODUCTION

Peptides with antimicrobial activities, or antimicrobial peptides (AMPs), have emerged as key components of the innate immune system in almost all living organisms since their first discovery from culture supernatant of the soil bacterium Bacillus brevis more than seven decades ago (Dubos, 1939a,b). AMPs are naturally synthesized low molecular mass products [up to 100 amino acids (AAs)] that act against microbial pathogens. AMPs are structurally and biochemically highly diverse but typically include positively charged AAs and hydrophobic or hydrophilic moieties that influence their aqueous solubility and their interaction with the negatively charged parts of the phospholipidic microbial cell membranes (reviewed in Montesinos, 2007; Mousa and Raizada, 2015; Wang et al., 2015). As a consequence of their diversity, specific AMPs are more effective in interacting with and disrupting targeted microbial membranes, such as those of fungal pathogens (Marx, 2004; Hegedüs and Marx, 2013; Vincent and Bedon, 2013; van der Weerden et al., 2013).

#### Edited by:

Vincenzo Lionetti, "Sapienza" Università di Roma, Italy

#### Reviewed by:

Antonio Molina, Centro de Biotecnología y Genómica de Plantas (UPM-INIA), Spain Hiroaki Shimada, Tokyo University of Science, Japan

\*Correspondence: Delphine Vincent delphine.vincent@ecodev.vic.gov.au

#### Specialty section:

This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science

Received: 14 August 2015 Accepted: 09 October 2015 Published: 27 October 2015

#### Citation:

Breen S, Solomon PS, Bedon F and Vincent D (2015) Surveying the potential of secreted antimicrobial peptides to enhance plant disease resistance. Front. Plant Sci. 6:900. doi: 10.3389/fpls.2015.00900

AMPs can be classified as either ribosomal or nonribosomal according to their mode of synthesis by the cells. Ribosomal peptides are gene-encoded peptides usually resulting from cleavage of a pro-protein, while nonribosomal peptides are assembled by multimodular enzymes called NonRibosomal Peptide Synthetases (NRPS). These NRPSs are usually organized in one operon, for bacteria, or in gene clusters for eukaryotes, and they synthesize one peptide per gene cluster or operon. NRPSs can generate macrocyclic peptides with an unusual structural diversity achieved through the assembly of not only the 20 canonical AAs, but also D-configured and β-AAs, methylated, glycosylated and phosphorylated residues, heterocyclic elements and even fatty acid (FA) chains (Marahiel, 2009). Cyclic lipopeptides containing FAs are examples of such nonribosomal peptides (for review see Lee and Kim, 2015; Patel et al., 2015). Ribosomal peptides are difficult to predict in silico from transcriptomic and genomic sequencing projects due to their small size and high diversity. There are also no known generic cleavage sites that could indicate a potential peptide. Tools are available allowing in silico genome mining, as was demonstrated for Bacillus sp. (Aleti et al., 2015). The most comprehensive AMP database to date, named ADAM, is publicly available and currently contains 7007 unique peptide sequences and 759 structures (http://bioinformatics.cs.ntou.edu.tw/ADAM/links.html) (Lee et al., 2015), yet some of the nonribosomal AMPs listed in this review are missing. Other AMP databases exist (reviewed in Holaskova et al., 2015). Methods other than genomics are therefore advantageous, particularly mass spectrometry (MS)-based proteomics strategies as they allow the direct identification of the AMPs and their isoforms.
MS analysis of secreted peptides generates reliable and useful information such as the molecular weight, the AA sequence, as well as the length of the FA chain in the case of lipopeptides (Zhao et al., 2014). Furthermore, MS imaging technology has emerged as a powerful tool to not only identify novel AMPs but also locate them in situ (Debois et al., 2013). Nuclear magnetic resonance (NMR) analyses are complementary because they yield structural elucidation such as cyclic structures (Wäspi et al., 1998; Sammer et al., 2009).
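To illustrate how an AA sequence relates to the molecular weight measured by MS, the sketch below computes the theoretical monoisotopic mass of an unmodified linear peptide built from the 20 canonical AAs, and the corresponding m/z of a protonated ion. It is a simplified example. It does not handle hydroxylation, glycosylation, cyclization, lipid chains or the non-canonical residues found in nonribosomal peptides. The function names and the test sequence are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the 20 canonical amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # one water added for the free N- and C-termini
PROTON = 1.007276   # mass of a proton, used for the charge carrier


def monoisotopic_mass(sequence: str) -> float:
    """Theoretical monoisotopic mass of an unmodified linear peptide."""
    total = WATER
    for residue in sequence.upper():
        if residue not in RESIDUE_MASS:
            raise ValueError(f"unsupported residue: {residue!r}")
        total += RESIDUE_MASS[residue]
    return total


def mass_to_charge(sequence: str, charge: int = 1) -> float:
    """m/z of the [M + zH]z+ ion for a given positive charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (monoisotopic_mass(sequence) + charge * PROTON) / charge


# Hypothetical tripeptide used only to show the calculation.
print(round(monoisotopic_mass("GAK"), 4), round(mass_to_charge("GAK"), 4))
```

Comparing such theoretical values with observed masses is one way that a candidate sequence, or a mass shift indicating a modification, can be checked.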

Plant protection and resistance against pathogens have traditionally been, and still are, addressed with chemical use and breeding programs. Nevertheless, the possibilities arising from the study of AMPs will contribute to controlling plant pathogens whose virulence is driven by perpetual adaptation through mutation. The aim of this review is to discuss naturally synthesized and secreted AMPs offering biocontrol potential for crop species. In this review, we chose to focus on well-studied AMPs representative of the main taxa such as bacteria, fungi, animals, and plants. The first section covers the plant-secreted peptides induced as a defense mechanism following pathogen attack or herbivory. The last section focuses on peptides secreted by non-plant species (bacteria, fungi and animals) that are subject to microbial challenges, and the potential of these peptides to help plant species fight fungal diseases by applying a biocontrol approach. **Figure 1** illustrates AMPs described in the review with their mode-of-action where known. **Figure 2** outlines a workflow to isolate, identify and validate AMPs bearing biocontrol potential in plant immunity. We discuss successful cases of increased pathogen resistance in transgenic plants and their potential for increased crop yield. Additional information on transgenic plants expressing AMPs toward improved disease resistance can be found in the reviews from Ramadevi et al. (2011) and Holaskova et al. (2015). Not covered in this review is a promising strategy for synthetic peptide production through combinatorial chemistry, which offers an alternative approach to developing AMPs for agricultural applications (López-García et al., 2002; Choi and Moon, 2009; Rebollar et al., 2014).

### SECRETED PLANT RIBOSOMAL PEPTIDES FOR SIGNALING AND DEFENSE

The advances in sequencing technology over the last few decades have led to an increasing number of plant genomes being widely available. The first sequenced plant genome was Arabidopsis thaliana (Intitative, 2000). Analysis of this genome revealed that plants encode many more predicted peptide transporters and receptors than was expected based on animal systems (Intitative, 2000; Shiu and Bleecker, 2003). This implies that many of the plant peptides involved in plant growth and development, cell-to-cell communication and defense responses have yet to be identified. Plant AMPs are also a rich source of plant defense compounds which can be grouped based on their structure. The eight main classes of plant AMPs are cyclotides, lipid transfer proteins, defensins, thionins, snakins, hevein-like, vicilin-like, and knottins (Goyal and Mattoo, 2014). This section will focus on known plant endogenous peptides which are secreted upon recognition of microbial colonization of the plant and aid in plant defense signaling. Some of these have been comprehensively reviewed (Ryan and Pearce, 2003; Boller, 2005; Farrokhi et al., 2008; Yamaguchi and Huffaker, 2011) but updated information will be presented here. The defensins will be examined in detail as they have recently been taken through to field trials. Other AMPs showing induction of resistance in transgenic plants have also been described previously (Oard and Enright, 2006; Muramoto et al., 2012; Verma et al., 2012; Zhu et al., 2012). These other classes of AMPs with increased resistance have been well reviewed (Goyal and Mattoo, 2014). Where possible the receptors detecting these endogenous peptides will also be discussed. **Table 1** summarizes the plant secreted peptides discussed in this review; AMP precursors and receptors, where known, are indicated.

### Plant Signaling Peptides

### Systemin and Its Receptor: The First Peptide Acknowledged as a Plant Hormone

Systemin is the most well-known secreted plant peptide involved in activation of defense signaling and it was the first peptide to be acknowledged as a plant hormone (Ryan and Pearce, 2003). Systemin is an 18-AA peptide which is derived from a 200-AA precursor protein called prosystemin

#### FIGURE 1 | Continued

constitutive secretion as demonstrated for fusaricidins, and LI-F lipopeptides. (1) AMPs used for transgenic experiments in plants are underlined. (2) Pathogens listed are bacteria (Erwinia amylovora, Pseudomonas syringae pathovars, P. aeruginosa, Serratia marcescens, Bacillus subtilis, Pectobacterium carotovorum, Ralstonia solanacearum, Dickeya dadantii), oomycetes (Pythium irregulare, P. dissotocum, P. ultimum, P. aphanidermatum, Phytophthora infestans, P. parasitica, P. nicotianae), fungi (Cochliobolus heterostrophus, Colletotrichum graminicola, C. gloeosporioides, C. higginsianum, Rhizoctonia cerealis, R. solani, Fusarium oxysporum pathovars, F. graminearum, F. culmorum, F. verticillioides, Verticillium dahliae, Blumeria graminis, Botrytis cinerea, Penicillium expansum, P. crustosum, Rhodotorula pilimanae, Rhizopus sp., Alternaria citri, Botryosphaeria sp., Fusicoccum aromaticum, Lasiodiplodia theobromae, Sphaerotheca fuliginea, Bremia lactucae, Cladosporium cucumerinum, Ascochyta citrullina, Sclerotinia homoeocarpa, Magnaporthe oryzae, Aspergillus flavus, Sclerospora graminicola), and insects (Spodoptera litura, S. frugiperda, grasshopper).

FIGURE 2 | Diagram for the identification and validation of AMPs bearing biocontrol potential in plant immunity. Secreted AMPs can be recovered from any inter- or extra-cellular space and from any organism. Being dilute, secreted AMPs will need to be concentrated using one or a combination of the methods listed in the Recovery box; prior to their purification AMPs must be solubilized. Purification involves separation steps either using gel-based or gel-free techniques. Purified AMPs can then be analyzed using NMR and/or MS which will yield identification and structural elucidation. The final step in the process is the validation of the novel AMP as a biocontrol agent using bioassays, and ultimately introducing it into the crop by transgenesis to assess the level of pest resistance it confers.

TABLE 1 | Plant ribosomal AMPs.


(Pearce et al., 1991; McGurl and Ryan, 1992; Farrokhi et al., 2008; **Table 1**). The isolation of systemin was achieved by feeding cut Solanum lycopersicum (tomato) stems with a few microliters of tomato leaf juice, causing the plants to produce proteinase inhibitor proteins, including systemin (Pearce et al., 1991; Ryan and Pearce, 1998). Prosystemin localizes in the cytoplasm of phloem parenchyma cells; however, upon wounding or death of these cells, prosystemin is processed to systemin, probably by proteinases. This allows systemin to diffuse into the apoplast and be detected by receptors on the plasma membrane of mesophyll cells (Boller, 2005). Systemin induces protease inhibitor production within local tissue in order to suppress wounding by insect proteases. However, it also leads to a systemic wound response by activating the jasmonic acid (JA) signaling pathway (Stratmann, 2003; Farrokhi et al., 2008). The production of this peptide appears to be under diversifying selection, which suggests that some herbivores may have evolved defenses against systemin signaling (Boller, 2005). Prosystemin has only been found in species of the Solaneae, a subtribe of the Solanaceae family (e.g., tomato, potato, pepper and nightshade), but not in tobacco, another member of the Solanaceae (Scheer and Ryan, 2002; Ryan and Pearce, 2003). Consistent with the absence of systemin in tobacco, this plant also shows no alkalinization response to systemin treatment (0.0025–25 nM) (Ryan and Pearce, 2003). Therefore, it seems systemin is restricted to a few species and is not found globally in plants. This suggests that for the Solaneae plants, pre-treatment with systemin could reduce the wounding effects of insects in the field. Systemin pre-treatment also has the potential to reduce other infections, for example viral and bacterial infections transmitted by insects.

Over a decade after the first report of systemin, its tomato cell surface receptor was identified (Scheer and Ryan, 2002). SlSR160 is a 160 kDa receptor belonging to the family of leucine rich repeat receptor-like kinases (LRR-RLKs), and like classical LRR-RLKs, it contains an extracellular leucine rich repeat domain, a transmembrane domain and an intracellular kinase domain. In cell culture experiments, systemin detection by SlSR160 activates a complex signaling pathway which results in activation of a mitogen-activated protein kinase (MAPK), rapid alkalinization of the extracellular medium via blockage of a proton pump in the cell membrane, and the activation of phospholipases along with activation of phytodienoic acid and jasmonic acid (JA) for defense signaling (Felix and Boller, 1995; Schaller and Oecking, 1999; Ryan and Pearce, 2003). It was later discovered that the systemin receptor, SlSR160, and the brassinosteroid receptor from A. thaliana, AtBRI1, are homologs which contain all the same domains, the most conserved of which have 83–90% similarity (Montoya, 2002; Scheer and Ryan, 2002). BRI1 is the cell surface receptor for the brassinosteroid pathway which detects the hormone brassinolide (BL), a regulator of plant growth and development. This suggested a dual function for this receptor in both plant defense and maintenance of growth and development. In recent years, however, the dual function of BRI1/SR160 has come into question. Initially, dual function was demonstrated by the expression of tomato SR160/BRI1 in tobacco, which allowed the transgenic plants to respond to systemin treatment (Scheer et al., 2003). A mutant tomato Slbri1 gene called cu-3 was used to further examine the dual function of SR160/BRI1. The tomato cu-3 plants showed a brassinosteroid (BR)-deficient growth phenotype, i.e., stunted growth and curled leaves; however, these plants also could not induce systemin signaling (Scheer et al., 2003). These results appeared to confirm that this receptor has a dual role in plant defense as well as in growth and development (Montoya, 2002; Scheer et al., 2003). However, more recently it was shown that the reduced responsiveness of Slcu3 plants to systemin was due to the stunted nature of these plants, and Slcu3 cell cultures responded the same as the wild type (WT) to treatment with systemin (Holton et al., 2007). These results have led to some controversy in this field and a more recent study has sought to understand these conflicting results. Malinowski et al. (2009) showed by expression of tomato BRI1 in tobacco cells that BRI1 binds to systemin; however, systemin signaling was not increased by the overexpression of tomato BRI1. This report confirms results previously described by Holton et al. (2007), that silencing of SlBRI1 in tomato results in a bri1 phenotype but does not affect the systemin response (Malinowski et al., 2009). These results imply that BRI1 can bind systemin but is not the receptor that initiates intracellular signaling (Malinowski et al., 2009). Therefore, there could be another receptor for systemin yet to be identified. The newest data also explain why A. thaliana, which contains BRI1, cannot sense systemin treatment or activate downstream defense signals. In A. thaliana there is some redundancy within the BRI1 gene family, as three other cell surface receptors with high sequence similarity have been identified, the BRI1-like receptors (BRL1, BRL2, and BRL3). It was shown that BRL1 and BRL3 can complement BRI1 but BRL2 cannot; BRL2 is required for provascular differentiation in leaves (Clay and Nelson, 2002; Caño-Delgado et al., 2004). Given there is a gene family in A. thaliana, it is highly likely that this is also true in Solanaceae species. Therefore, it could be that one of these other receptors is required for systemin signal transduction.

### HypSys: The First Peptides Identified in Tobacco

HypSys peptides are another group of well-known wound-induced peptides which show functional relatedness to systemin and contain multiple hydroxyproline residues. These polypeptide hormones were first identified in tobacco using cell suspension assays treated with a crude peptide fraction from tobacco leaves, which resulted in the alkalinization of the medium. Two 18-AA polypeptide hormones were identified in tobacco and found to be derived from each end of a 165-AA polyprotein hormone precursor, pro-TobSys-A (Pearce et al., 2001; **Table 1**). These two peptides were named tobacco hydroxyproline-rich systemin (TobHypSys) I and II (Ryan and Pearce, 2003). Although HypSys peptides may have similar downstream functions to systemin, it appears the physical properties and the processing of these two types of peptide differ greatly. The synthesis of HypSys potentially involves the secretory pathway given the presence of carbohydrate residues and hydroxyproline residues, while in contrast systemin contains no hydroxyprolines and no carbohydrate residues (Pearce et al., 2001). HypSys peptides have also been identified in a wider variety of plants than systemin, including tomato (TomHypSys I, II and III), petunia (PhHypSys I, II, and III), sweet potato, black nightshade and potato (Pearce and Ryan, 2003; Pearce et al., 2007, 2009; Chen et al., 2008; Bhattacharya et al., 2013). The TomHypSys peptides are 20, 18, and 15 AAs in length, respectively, while PhHypSys I, II, and III are 19, 20, and 18 AAs in length, respectively (Pearce and Ryan, 2003; Pearce et al., 2007). In the identified precursor proteins for the HypSys peptides, it was noted that a 30 base pair (bp) sequence around the peptidase splice site was conserved in all. This conservation, together with the high sequence homology over the whole protein, is allowing faster and easier identification of these peptides, thereby avoiding long and troublesome purification from leaf extracts (Bhattacharya et al., 2013).

The 146-AA precursor protein of the three TomHypSys peptides localizes to the cell wall matrix in the phloem parenchyma cells of tomato leaves (Narváez-Vásquez et al., 2005). The tomato, tobacco and black nightshade HypSys peptides all induce the production of protease inhibitors when supplied through cut petioles. Furthermore, the tomato preproHypSys gene is induced upon wounding of leaves and upon systemin or JA treatment of the leaves (Pearce et al., 2001, 2009; Pearce and Ryan, 2003). In contrast, the petunia HypSys peptides did not induce protease inhibitors against herbivore attack; however, they did induce the expression of defensin 1 (Pearce et al., 2007). Defensin 1 is a gene associated with innate immunity in plants, suggesting that the petunia HypSys peptides still function in plant defense signaling pathways. The petunia HypSys peptides are functionally more similar to the AtPep family of peptides than to the other HypSys peptides, implying that there may be some functional diversity within peptide families that is specific to some plants (Bhattacharya et al., 2013). The StHypSys peptides are functionally comparable to the tobacco and tomato HypSys peptides; however, they appear to have additional functions, as they also activate enzymes that protect against oxidative stress and free-radical generation (Bhattacharya et al., 2013). These are both outcomes of herbivore attack and pathogen infection; therefore, the StHypSys peptides appear to act as defense elicitors against both insects and pathogens (Bhattacharya et al., 2013). Although the experiments have not been done, it could be expected that pre-treatment of plants with HypSys prior to insect attack or pathogen infection would result in reduced symptoms, as the defense pathways would be primed.

### Pep1 and Its Receptor: The Model Plant Arabidopsis Gives up Its First Peptide

Pep1 from A. thaliana is another peptide which is associated with the activation of the plant defense system (Huffaker et al., 2006). This peptide consists of 23 AA residues and is derived from the C-terminus of a 92-AA precursor protein called PROPEP1 (Huffaker et al., 2006; Farrokhi et al., 2008; **Table 1**). AtPROPEP1 shows low-level expression in all tissue types, but increased expression is evident when plants are wounded or treated with methyl jasmonate (MeJA) or ethylene (ET) (Huffaker et al., 2006). This suggests a role for this protein in plant defense. Plants treated through cut petioles with 20 nM AtPep1 showed increased expression of Plant Defensin 1.2 (PDF1.2), Pathogenesis-related Protein 1 (PR-1) and PROPEP1, along with the production of H<sub>2</sub>O<sub>2</sub> (Huffaker et al., 2006; Huffaker and Ryan, 2007). The production of H<sub>2</sub>O<sub>2</sub> and PDF1.2 are both associated with innate immunity in plants, while PR-1 is known to be induced during Pathogen-Associated Molecular Pattern (PAMP)-Triggered Immunity (PTI); however, its function remains unknown. Huffaker et al. (2006) and Huffaker and Ryan (2007) used mutants for the JA and ET pathways to show that AtPep1 functions upstream of the JA/ET pathways, as expression of PDF1.2 and PR-1, triggered by the peptides, was blocked. Transgenic plants constitutively overexpressing AtPROPEP1 showed an increase in root mass compared to WT plants. When these overexpressing plants were infected with the root pathogen Pythium irregulare, the increase in root growth compared to infected WT plants was still apparent; however, the aerial parts of the plants showed no difference in growth or disease symptoms between the transgenic and WT plants. This implies that overexpression of PROPEP1 provides a growth advantage to the roots even in the presence of a pathogen (Huffaker et al., 2006).
This growth advantage appears to be a by-product of PROPEP1 overexpression, as the aerial components of the plants do not show increased growth, implying that nutrient acquisition has not increased. Six annotated homologs of the precursor protein have been identified in A. thaliana, two of which (PROPEP2 and 3) show increased expression when plants are infected with bacterial, fungal and oomycete pathogens (Huffaker et al., 2006). The expression of PDF1.2 and PR-1 was increased in plants treated with AtPep1 (Huffaker et al., 2006). All of these results indicate that PROPEP1-3 have a role in a feedback loop for defense signaling that is activated in the presence of pathogens and also increases root development (Huffaker and Ryan, 2007).

Orthologs of these A. thaliana genes have since been found in other plant species, suggesting an important and ancient role for these peptides. Important crop species are among those to be investigated for the presence of Pep peptides. Recently an ortholog of AtPROPEP1 was identified in Solanum lycopersicum, SlPROPEP, which has high AA identity (96%) with the C-terminal region of the S. tuberosum gene (Trivilin et al., 2014). Silencing of the tomato ortholog increased the susceptibility of plants to Pythium dissotocum and also decreased the expression of key defense response genes, e.g., PR-1, PR-5, ERF1, LOX-D, and DEF2 (Trivilin et al., 2014). These results showed that SlPROPEP plays a role in resistance against P. dissotocum. Huffaker et al. (2011) identified and characterized an ortholog of AtPROPEP1 in maize. Similar to AtPROPEP1, ZmPROPEP1 is induced in response to JA or peptide treatment, but also by fungal infection. In maize plants treated with ZmPep1, the expression of several defense genes was up-regulated: endochitinase A, PR-4, PRms, SerPIN, and Benzoxazineless1 (Huffaker et al., 2011). Interestingly, this work also showed that pre-treatment of maize leaves with 25 pmol of ZmPep1 increased resistance of leaves to Cochliobolus heterostrophus, and pre-treatment of stalks with 5 nmol of ZmPep1 increased resistance to Colletotrichum graminicola (Huffaker et al., 2011). The work described above clearly shows that pre-treatment with these peptides benefits plants by reducing disease. Therefore, in areas known to have high pathogen pressure, these peptides may offer an alternative to fungicides or pesticides for crop plants, delivered either by foliar application or through the generation of transgenic crop plants.

Two cell surface receptors for AtPep1, PEPR1 and PEPR2, have been identified in A. thaliana and demonstrated to be membrane associated leucine rich repeat (LRR) receptor kinases (Yamaguchi et al., 2006, 2010). These two receptors are 76% similar at the protein level and both showed induced expression upon wounding, also following exogenous application of Pep1 peptide and MeJA (Yamaguchi et al., 2010). PEPR1 was confirmed as recognizing Pep1 resulting in the activation of plant defense responses (Yamaguchi et al., 2006). Gain of function was also observed when AtPEPR1 was overexpressed in transgenic tobacco cell lines resulting in the rapid alkalinization of the media upon treatment with AtPep1 (Yamaguchi et al., 2006). Like previously described experiments, pre-treatment of A. thaliana with exogenous Pep1 resulted in increased resistance to Pseudomonas syringae pv. tomato DC3000 (Yamaguchi et al., 2010). PEPR1 and PEPR2 show limited redundancy as single mutants of each showed reduced resistance to P. syringae pv. tomato while a double mutant is susceptible to P. syringae pv. tomato (Yamaguchi et al., 2010). This work has elegantly identified the cell surface receptor of a secreted plant peptide which induces defense responses against pathogen invasion.

### Soybean Peptides: Soybean is Potentially the Next High Peptide Yielding Plant

Whilst tomato and tobacco have traditionally been used for peptide discovery, soybean is rapidly becoming the new dicot model as a source of new peptides. Recently many new peptide families have been identified in soybean. GmPep914 and GmPep890 are two recently identified peptides shown to cause the rapid alkalinization of cell culture medium (Yamaguchi et al., 2011). These peptides consist of 8 AA residues and are derived from the C-terminal end of the 52-AA precursor proteins GmPROPEP914 and GmPROPEP890 (Yamaguchi et al., 2011; **Table 1**). Treatment of soybean leaves with GmPep914 and GmPep890 induced the expression of their precursor proteins as well as defense genes including CYP93A1, a cytochrome P450 gene, chitinaseb1-1 and Gmachs1 (Glycine max chalcone synthase1), which is involved in phytoalexin production (Yamaguchi et al., 2011). GmPep914 is unrelated to any of the previously identified peptides with a known function in plant defense, and it is also the smallest peptide found to have such a role (Yamaguchi et al., 2011).

GmSubPep was also recently identified in soybean. The sequence of this 12-AA peptide was found within an extracellular protein family called subtilisin-like protease (subtilase), specifically in the gene Glyma18g48580 (Pearce et al., 2010; **Table 1**). This peptide appears unique for soybean as systemin is for tomato due to the GmSubPep sequence located in the protease-associated (PA) domain of Glyma18g48580 which is a unique region of the subtilases within legumes (Pearce et al., 2010). Also when this peptide was tested on tomato, tobacco, Arabidopsis and corn cell cultures no alkalinization was observed (Pearce et al., 2010). The receptor of Glyma18g48580 was not induced upon wounding or treatment with MeJA, however treatment of cell cultures with the peptide resulted in an increase in some defense genes (Cyp93A1, Chib-1b, and PDR12; Pearce et al., 2010). Glyma18g48580 is predicted to be an apoplastic protein of unknown function and it has been speculated that this protein comes into contact with non-self molecules, due to its apoplastic localization. This in turn could induce cleavage of the peptide resulting in activation of defense signaling pathways. No work has been done so far with these soybean peptides to determine if they are able to increase plant resistance to pathogens or insects but given that defense genes are up-regulated it is conceivable that this is the case.

### CAPE1: Unraveling the Potential Function of PR-1

The CAPE1 peptide was recently isolated from tomato and has been shown to be a Damage-Associated Molecular Pattern (DAMP) elicitor (Chen et al., 2014). CAPE1 was identified using endogenous peptide mixtures extracted from stressed and unstressed plants; these extracts were analyzed by nanoflow ultrahigh performance liquid chromatography mass spectrometry (nanoUHPLC-MS). CAPE1 is an 11-AA peptide derived from the C-terminus of a 159-AA precursor protein called PR-1b (also known as P14a; Chen et al., 2014; **Table 1**). Expression analysis showed that CAPE1 was induced by wounding and that treatment of tomato plants with the CAPE1 peptide resulted in the induction of defense hormones such as JA and salicylic acid (SA), as well as increased expression of several pathogen-related marker genes, PR-2, PR-7 and PR-1b (Chen et al., 2014). However, CAPE1-treated plants did not induce PTI-responsive genes, suggesting that CAPE1 may induce systemic resistance rather than PTI (Chen et al., 2014). The authors also showed that pre-treatment of tomato leaves with CAPE1 prior to infection with P. syringae pv. tomato resulted in reduced disease symptoms in the absence of a Hypersensitive Response (HR) (Chen et al., 2014). CAPE1 also showed anti-herbivore activity: feeding Spodoptera litura larvae on CAPE1 pre-treated tomato leaves resulted in a 20% reduction in larval size and weight (Chen et al., 2014). These data demonstrate that CAPE1 has antibacterial and anti-herbivore activity. Curiously, anti-fungal activity was not reported. Comparing these pre-treatment results to those discussed above for the ZmPep1 peptide (Huffaker et al., 2011) shows that these peptides appear to prime plants for pathogen resistance.

Sequence analysis has shown that the CAPE1 peptide is conserved in many flowering plants. The authors also suggest that three AAs immediately in front of the peptide may be the cleavage site (CNYx) and this could act as a conserved motif that could indicate a bio-active peptide in other species (Chen et al., 2014). Pathogenesis-related protein 1 from A. thaliana (AtPR1) contains the CAPE1 and potential bio-active motifs (CNYx. PxGNxxxxPY) but has the least protein similarity to tomato PR-1b. However, this CAPE1 peptide from AtPR1 was shown to increase immunity of A. thaliana to P. syringae pv. tomato indicating that this peptide functions similarly to the CAPE1 from tomato PR-1b (Chen et al., 2014).

The authors suggest a motif that could be used to search other RNAseq or genome databases to investigate whether it is used at other peptide cleavage sites, showing a downstream application of this technique. The increase in plant resistance observed when plants are pre-treated with CAPE1 also gives further information about the PR1 protein, which has always been associated with plant defense but whose function has been elusive. Since PR1 is present in most plant species, CAPE1 could be a good biocontrol agent against pathogens, and its presence points to an ancient and probably conserved role for this protein.
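The kind of database search suggested here can be sketched with a regular expression. The following is a minimal illustration, not the procedure used by Chen et al. (2014). It assumes that `x` in the printed motif stands for any single residue and that the `.` only marks the putative cleavage position; the function names and the toy sequences are hypothetical.

```python
import re

def motif_to_regex(motif):
    """Translate an 'x = any residue' motif into a compiled regex.

    Assumptions: 'x' matches any single residue, and '.' only marks the
    putative cleavage position, so the part after it is captured as the
    candidate peptide.
    """
    left, _, right = motif.partition(".")

    def to_re(part):
        return "".join("[A-Z]" if c == "x" else re.escape(c) for c in part)

    return re.compile(to_re(left) + "(" + to_re(right) + ")")

def find_cape_like_peptides(proteins, motif):
    """Return {protein_id: [(start, peptide), ...]} for every motif hit.

    `proteins` maps an identifier to a one-letter amino-acid sequence.
    `start` is the 1-based position of the first residue after the
    putative cleavage site.
    """
    pattern = motif_to_regex(motif)
    hits = {}
    for name, seq in proteins.items():
        found = [(m.start(1) + 1, m.group(1)) for m in pattern.finditer(seq.upper())]
        if found:
            hits[name] = found
    return hits

# Toy sequences invented for illustration; they are not real PR-1 proteins.
toy = {
    "toy_hit": "MKTLLACNYDPAGNSTSTPY",
    "toy_miss": "MKTLLACNYDAAGNSTSTPY",
}
hits = find_cape_like_peptides(toy, "CNYx.PxGNxxxxPY")
```

In practice the input would be translated transcripts or predicted proteomes, and any hit would only be a candidate that still requires experimental confirmation.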

### Inceptins: Insect Feeding Aids Plant Defense

Inceptins fall into a small category of plant peptides as they are processed by enzymes from the invading organism, not by the plant itself. Inceptins are 11–13 AA acidic peptides which contain disulphide bridges and are derived from the chloroplastic ATP synthase γ-subunit (Schmelz et al., 2006; **Table 1**). These peptides were identified from the saliva of larval fall armyworms (Spodoptera frugiperda) that were feeding on cowpea (Vigna unguiculata) plants. Inceptin peptides are cleaved from the chloroplastic ATP synthase γ-subunit during feeding of the larvae on the leaves; the peptide in the saliva of S. frugiperda is then detected by the plant, resulting in the activation of defense related genes (Schmelz et al., 2006). The perception of inceptin by cowpea plants results in activation of the volatiles homoterpene (E)-4,8-dimethyl-1,3,7-nonatriene (DMNT) and cinnamic acid which are known to attract natural insect enemies but also results in induction of JA and SA hormones (Schmelz et al., 2006, 2007).

### Defensins: Peptides Endogenous to all Plant Species

Plant defensins are produced in all plant species and are either constitutively expressed in storage and reproductive organs or can be induced during biotic or abiotic stress (Vriens et al., 2014). These peptides are known as cationic peptides and are usually 45–54 AAs in length. These peptides typically harbor a cysteine-stabilized αβ-motif (CSαβ) which has an α-helix and a triple-stranded antiparallel β-sheet which is stabilized by four disulphide bridges (Vriens et al., 2014).
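Because defensins are described as cationic, a quick first-pass check on a candidate sequence is to count basic and acidic residues. The sketch below is only a rough approximation: it ignores histidine, the chain termini and the cysteines tied up in disulphide bridges, and the two sequences are invented placeholders rather than real defensins.

```python
def approximate_net_charge(sequence):
    """Rough net charge near neutral pH: (K + R) - (D + E).

    Histidine, the termini and disulphide-bonded cysteines are ignored,
    so this is only a first-pass estimate.
    """
    seq = sequence.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative

# Hypothetical toy peptides, not real defensin sequences.
toy_a = "GKCRRACKDE"  # 4 basic, 2 acidic residues
toy_b = "GKCRRACKRK"  # 6 basic, 0 acidic residues

charge_a = approximate_net_charge(toy_a)
charge_b = approximate_net_charge(toy_b)
```

A positive value is consistent with a cationic peptide; a proper estimate would use residue pKa values at a stated pH.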

There are typically two groups of plant defensins based on the proproteins. Both groups contain a signal peptide which targets the protein to the endoplasmic reticulum, where the mature protein is folded and enters the secretory pathway. One of these groups contains an additional prodomain on the C-terminus which is cleaved during processing through the secretory pathway (Vriens et al., 2014). Some well-known plant defensins are RsAFP1 and RsAFP2 from the seeds of radish (Terras et al., 1992; **Table 1**), MsDef1 and MtDef4 from the seeds of Medicago species (Gao et al., 2000; Ramamoorthy et al., 2007; Sagaram et al., 2013), NaD1 from the flowers of tobacco (Lay et al., 2003a,b) and Psd1 from the seeds of pea (Almeida et al., 2000). Each of these peptides has antifungal activity, but so far very few defensins have been found with antibacterial activity. In recent years transgenics have been employed to investigate the ability of some defensins to protect plants from fungal infection or insect wounding. Transgenic rice plants containing a plant defensin from Brassica rapa, BrD1, showed increased resistance to the insect brown plant hopper (Choi et al., 2009). Transgenic tomato plants containing MsDef1 showed increased seedling resistance to F. oxysporum f. sp. lycopersici (Abdallah et al., 2010), while transgenic tobacco and potato plants containing the defensin NmDef01 from N. megalosiphon showed increased resistance to the oomycete Phytophthora infestans in both glasshouse and field trials (Portieles et al., 2010). The mode-of-action and pathogenicity experiments of two defensins, RsAFPs and NaD1, will be discussed here in more detail.

Raphanus sativus antifungal protein 1 (RsAFP1) and RsAFP2 were first isolated from radish seeds by ammonium sulfate fractionation and anion-exchange chromatography and were found to produce 5 kDa peptides which are assembled and stabilized by disulfide bridges (Terras et al., 1992). The RsAFP1 peptide is 44 AAs in length and RsAFP2 is 36 AAs in length (Terras et al., 1992). Within the first 36 AAs of these peptides there is a high degree of conservation, with only 2 AAs differing. These changes are nevertheless significant and result in a higher positive charge for RsAFP2 compared to RsAFP1. RsAFP2 has a low half maximal inhibitory concentration (IC<sub>50</sub>) range of 0.4–25 μg/ml when tested against plant pathogenic fungi, while RsAFP1 has a larger IC<sub>50</sub> range of 0.3–100 μg/ml. The antifungal activity of RsAFP2 is also more efficacious than that of RsAFP1 in the presence of salts (Terras et al., 1992). RsAFP2 interacts with fungal glucosylceramide (GlcCer) found in the membrane and cell wall of fungal cells (Thevissen et al., 2004). GlcCer is usually associated with other components which form lipid rafts in fungal membranes and cell walls (Vriens et al., 2014). The binding to GlcCer results in activation of MAP kinase and cell wall integrity signaling pathways, the production of Reactive Oxygen Species (ROS), induction of ion fluxes and activation of caspases, resulting in abnormal hyphal growth and division (De Samblanx et al., 1997; Navarro-García et al., 2005; Aerts et al., 2009; Vriens et al., 2014). The ability of RsAFP2 to inhibit fungal infection of whole plants was tested with the generation of stable transgenic wheat constitutively expressing RsAFP2 (Li et al., 2011). Compared to untransformed control plants, the transgenic RsAFP2 plants showed increased resistance to Fusarium graminearum and Rhizoctonia cerealis in glasshouse experiments over several generations and in field trials for the T<sub>4</sub> and T<sub>5</sub> generations (Li et al., 2011).
The authors also noted that this resistance was heritable (Li et al., 2011). Another well investigated defensin is Nicotiana alata Defensin 1 (NaD1; **Table 1**). NaD1 is one of the rarer plant defensins as it was isolated from the flower of ornamental tobacco (Nicotiana alata) and not the seed, where most defensins have been found. NaD1 is only active against filamentous fungi and has no effect against bacteria, yeast or human cell lines (van der Weerden et al., 2008). The fungal membrane ligand for NaD1 has recently been identified as phospholipid phosphatidylinositol 4,5-bisphosphate (PIP2; Poon et al., 2014). These authors demonstrated that NaD1 oligomers interact with the head group of two PIP2 molecules and that this complex is required for membrane permeabilization and subsequent cell death (Poon et al., 2014). If the NaD1 oligomer structure is disrupted then antifungal activity is also lost (van der Weerden et al., 2008; Poon et al., 2014). Once the fungal cell wall has been disrupted NaD1 has been shown to form an aperture of 14–23 Å in size through which it may enter the cytoplasm (van der Weerden et al., 2008). It has been shown that NaD1 is able to localize to the cytoplasm of hyphal cells and these cells subsequently show granulation of the hyphal cytoplasm and cell death (van der Weerden et al., 2008). This indicates that defensins may not just target the cell membrane, but also have intracellular functions. Once NaD1 is inside the cytoplasm of hyphal cells, ROS production occurred within these cells suggesting that cell death is occurring.

There is still more to be learnt about the mode-of-action of NaD1, but this defensin has great potential as a natural anti-fungal in plants other than tobacco. This has been elegantly demonstrated by Gaspar et al. (2014), who generated transgenic homozygous cotton lines constitutively overexpressing NaD1. These lines were tested in greenhouse assays for resistance to F. oxysporum f. sp. vasinfectum (Fov), with one line being chosen for field trials in soil naturally infected with Fov and Verticillium dahliae (Gaspar et al., 2014). These greenhouse assays and field trials of NaD1-expressing transgenic plants showed an increase in resistance and cotton production compared to the non-transgenic parental line when Fov and V. dahliae were present in the soil (Gaspar et al., 2014). The NaD1-expressing plants also showed no detrimental agronomic properties compared to the control non-transgenic parental lines. This is a good example of a defensin taken from one plant species and used as a biocontrol agent in an economically important crop species against soil-borne fungi.

All of the peptides discussed above have shown a direct correlation with the induction of the plant defense pathways to protect against herbivores and/or pathogens. In some cases experimental data is available to show a decrease in infection of plants when pre-treated with these peptides. This is a small number of plant defense peptides and without doubt there are many more to be identified. As yet undiscovered peptides may be species specific while others could be ancient defense signals active in all higher plants.

### AMPS SECRETED BY NON-PLANT ORGANISMS, AND THEIR POTENTIAL FOR PLANT PATHOGEN RESISTANCE ENGINEERING

Whilst plants have proven to be a rich source of antimicrobial peptides (and will continue to be so), non-plant organisms are also subject to pathogen attack and synthesize AMPs as a defense mechanism. These non-plant AMPs can in turn be either sprayed on crops or artificially introduced into crop species to engineer pathogen resistance using transgenic methods. **Tables 2** and **3** summarize the non-plant AMPs discussed in this review, detailing their class and molecular weight, the analytical methods employed for their identification in the publications cited here, the organisms naturally secreting them, the targeted pathogens and plant hosts.

### Nonribosomal AMPs Identified from Microbes and Tested by Exogenous Application

Biological control of plant pests through the use of natural antagonistic microorganisms producing a vast array of AMPs has emerged as a promising alternative to reduce the use of chemical pesticides. Cyclic lipopeptides are nonribosomal AMPs predominately produced by Bacillus and Paenibacillus spp. and increasing numbers of lipopeptides are being isolated from Pseudomonas spp. as well (Patel et al., 2015). The following section describes the most favorable AMPs for crop disease resistance in such bacteria as well as in fungi.


TABLE 2 | Non-plant nonribosomal AMPs.




### Syringolin, Syringomycin, and Syringopeptin from Pseudomonas sp. Bacteria

Syringolin A is a peptide secreted by the Gram-negative bacterium P. syringae pv. syringae. Syringolin A has a ring structure composed of 5-methyl-4-amino-2-hexenoic acid and 3,4-dehydrolysine (Wäspi et al., 1998; **Table 2**). The α-amino group is joined by a peptide bond to a valine linked to another valine via a urea moiety (Wäspi et al., 1998). Syringolin A was recovered from liquid cultures by centrifugation, filtering and ultra-filtration, followed by gel filtration chromatography. The application of HPLC-purified syringolin onto detached leaves of Oryza sativa (rice) induced resistance toward the fungal plant pathogen Pyricularia oryzae (rice blast disease). Plate assays showed that syringolin A did not directly affect the growth of P. oryzae. Furthermore, no visible phytotoxic effects were observed when it was applied to detached rice leaves at the highest concentration (0.05 mM). Gene expression analysis of detached leaves sprayed with syringolin A solution and subsequently inoculated with P. oryzae indicated increased transcript abundance of defense genes (including Pir7b, Pir2, Pir2, and Rir1 mRNAs). Consequently, it was proposed that syringolin A elicits rice defense responses through acquired resistance rather than being directly anti-fungal (Wäspi et al., 1998). Subsequently, syringolin A was trialed in other pathosystems. Detached leaves of Triticum aestivum (wheat) were sprayed with a syringolin A solution 2 days prior to inoculation with Blumeria graminis f. sp. tritici (powdery mildew; Wäspi et al., 2001). As previously observed with rice blast disease, wheat leaves exposed to syringolin A prior to fungal infection appeared more resistant. Curative effects of syringolin A were also reported in this study, with whole wheat plants first infected with powdery mildew and then sprayed with a syringolin A solution. Fungal colonization of wheat tissues was arrested if syringolin A was exogenously applied 2 days post-inoculation.
This study suggests that the mode-of-action of syringolin A either targets the host cells in a way that maintains host's hypersensitivity or reverses the suppression of host defense imposed by the pathogen (Wäspi et al., 2001).

Cyclic lipodepsipeptides (LDPs) include two other AMPs secreted by P. syringae pv. syringae, syringomycin E (SRE) and syringopeptin 25A (SP25A) (**Table 2**). SRE is formed by nonapeptide lactones acylated with a long-chain 3-hydroxy FA. SP25A is composed of a long and highly hydrophobic peptide chain, with a polar lactonized penta- or octa-peptide moiety at the C terminus. It has been speculated that pathogen cell walls present a natural barrier which impairs peptide anti-microbial activity, and that making pathogen cell walls porous would therefore enhance AMP action (Fogliano et al., 2002). LDPs were extracted from centrifuged liquid cultures by acetone precipitation and further fractionated using reverse-phase (RP) HPLC. SRE and SP25A were identified by MS. SRE and SP25A, together with cell wall degrading enzymes (CWDEs) from Trichoderma atroviride, were incorporated into in vitro assays performed on liquid cultures of F. oxysporum, V. dahliae, Botrytis cinerea, Penicillium expansum, Phytophthora infestans, and Rhodotorula pilimanae (Fogliano et al., 2002; **Table 2**). These analyses revealed that while SRE or SP25A alone did not inhibit fungal growth, in the presence of endochitinase and/or glucanase SRE and SP25A prevented spore germination, thus revealing synergism between LDPs and CWDEs. Furthermore, postharvest assays performed on wounded apple fruits first inoculated with B. cinerea, then with P. syringae and/or T. atroviride, indicated that the fewest B. cinerea-related wounds were observed when both P. syringae and T. atroviride were combined (Fogliano et al., 2002). This study thus provided support for a synergistic interaction between P. syringae LDPs and T. atroviride CWDEs against a variety of fungi, thereby supporting the hypothesis that, in a biocontrol strategy, the efficacy of AMPs in plant pathogen resistance would be improved if pathogen cell walls were also made vulnerable (Fogliano et al., 2002). This experiment, however, did not establish which LDPs and CWDEs are responsible for the synergism observed in vivo.

TABLE 3 | Animal ribosomal AMPs.

### Fengycins, Iturins, Surfactins, Bacillomycins, Macrolactin, and Mycosubtilin from Bacillus sp. Bacteria

Potent antifungal lipopeptides such as surfactins, iturins and fengycins are produced by some strains of the Gram-positive bacterium Bacillus subtilis, such as strain M4, a beneficial rhizobacterium (Ongena et al., 2005; **Table 2**). B. subtilis strain M4 was also found to secrete various fengycin homologs, the great majority of them harboring a C16 or C17 FA chain (Ongena et al., 2005). An enriched lipopeptide extract was obtained by Solid Phase Extraction (SPE) of the crude cell-free culture broth of strain M4, eluted using methanol, followed by HPLC separations and ElectroSpray Ionization (ESI)-MS analyses. In vitro assays showed that the lipopeptide-enriched supernatant had a strong inhibitory effect on the growth of F. oxysporum, Pythium ultimum, Rhizoctonia solani, Rhizopus sp., and B. cinerea. The antimicrobial potential of these peptides has also been described in planta. Post-harvest assays on immature apple fruits preconditioned with M4 supernatant or the lipopeptide-enriched supernatant showed increased pathogen resistance, mostly attributable to fengycins (Ongena et al., 2005). Similarly, bean seedlings with roots pre-treated with M4 supernatant prior to B. cinerea leaf inoculation displayed reduced disease symptoms. B. subtilis M4 thus shows great potential as a biocontrol agent to better manage soilborne, foliar and post-harvest diseases. The lipopeptide modes-of-action, in particular that of fengycins, rely not only on the direct inhibition of the plant pathogen on infected plant organs, but also on an indirect interaction mediated through the host plant via systemic resistance induction. Direct interaction was demonstrated through the disease control provided by treatment of fruits with lipopeptide-enriched supernatant and by in situ detection of fengycins in inhibitory amounts.
Indirect interaction was mediated by the production of phenolic compounds involved in or derived from the defense-related phenylpropanoid metabolism upon pathogen attack (Ongena et al., 2005).

More recently the same set of secreted lipopeptides, fengycins, iturins, and surfactins, along with bacillomycin, was detected in the Gram-positive bacterium Bacillus amyloliquefaciens strain PPCB004. Bacteria were cultured in liquid medium; lipopeptides were extracted using n-butanol, followed by complete evaporation and resuspension in methanol. Growth assays showed that these peptides had inhibitory properties on the mycelial growth and/or spore germination of various post-harvest fungal pathogens, including Alternaria citri, Botryosphaeria sp., Colletotrichum gloeosporioides, Fusicoccum aromaticum, Lasiodiplodia theobromae, Penicillium crustosum, and Phomopsis perseae (Arrebola et al., 2010; **Table 2**). In this study, iturin A demonstrated the strongest inhibitory effect. In post-harvest assays, orange fruits inoculated with A. citri and C. gloeosporioides displayed lower disease incidence when treated with B. amyloliquefaciens strain PPCB004 either before or after inoculation. Iturin A has been proposed to act by disrupting the fungal cytoplasmic membrane, creating transmembrane channels that release vital ions such as K+, thus preventing spore germination and impairing mycelium development (Arrebola et al., 2010).

The lipopeptides bacillomycin D and fengycins A and B are secreted by B. amyloliquefaciens strain Q-426 (Zhao et al., 2014; **Table 2**). They were recovered from the cell-free supernatant of liquid cultures, further filtered and acid precipitated, prior to drying and methanol resuspension. Their molecular structure was elucidated using two-dimensional (2-D) HPLC separation and tandem MS (MS/MS) analyses. The peptide moiety comprised 7–10 AA residues arranged in a cyclic structure, while the lipid moiety was composed of an FA chain of 14–17 carbons. Fengycin A purified from strain Q-426 disrupted the germination of spores from the fungal pathogen F. oxysporum f. sp. spinaciae in a dose-dependent manner, with complete inhibition above 50 μg/mL. Fengycin A also modified hyphal growth, albeit without affecting cell membrane permeability (Zhao et al., 2014).
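Dose-dependent inhibition of this kind is often summarized by a single IC<sub>50</sub> value, as was done earlier for the radish defensins. The sketch below shows one simple way to estimate it, by interpolating linearly in log concentration between the two measurements that bracket 50% inhibition. It is not the method of any study cited here, the dose and inhibition values are hypothetical, and a real analysis would normally fit a full dose-response model.

```python
import math

def estimate_ic50(concentrations, percent_inhibition):
    """Interpolate the concentration giving 50% inhibition.

    Interpolation is linear in log10(concentration) between the two
    measured points that bracket 50%. Concentrations must be positive.
    Returns None when the data never cross 50%.
    """
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50 <= i_hi and i_hi != i_lo:
            frac = (50 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical spore germination data (doses in µg/mL, inhibition in percent).
doses = [1, 100]
inhibition = [0, 100]
ic50 = estimate_ic50(doses, inhibition)  # midpoint on the log scale
```

Working on the log scale reflects the fact that dilution series are usually geometric, so the midpoint between 1 and 100 µg/mL is 10 µg/mL rather than 50.5 µg/mL.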

Along with iturin A, isoforms of bacillomycin D and of the macrolactin family, macrolactin A, 7-O-malonyl macrolactin A, and 7-O-succinyl macrolactin A were identified by HPLC-MS/MS analyses from the supernatant of centrifuged liquid cultures of B. amyloliquefaciens strain NJN-6 (Yuan et al., 2011, 2012; **Table 2**). This strain was isolated from the root system of a healthy banana plant. Plate assays showed that both purified bacillomycin D isoforms inhibited the hyphal growth of F. oxysporum f. sp. cubense, a banana fungal pathogen. Similarly, all three macrolactins inhibited the growth of Gram-negative pathogenic bacteria Ralstonia solanacearum. The authors demonstrate that the macrolactin activities are maintained after 1 month at room temperature (Yuan et al., 2012); such shelf-life information is valuable for an in-field application strategy. Performing tests on plant tissues infected with these pathogens would have further validated the antifungal activity of these AMPs.

Fengycin A was also secreted by the Gram-positive bacterium Bacillus atrophaeus strain CAB-1 and showed an antagonistic effect against the airborne plant fungal pathogens B. cinerea and Sphaerotheca fuliginea. Two groups of lipopeptides were identified in strain CAB-1 by MS analyses: fengycins and unknown lipopeptides. Lipopeptide extracts of CAB-1, obtained by centrifugation of the culture broth followed by acid precipitation, yielded no detectable iturin or surfactin compounds (Zhang et al., 2013). The fengycins produced by strain CAB-1 were a mixture of isoforms with acyl side chain lengths ranging from C15 to C17 (**Table 2**). The C16 isoform of fengycin A was secreted in greater abundance than the C15 and C17 isoforms and would therefore contribute more to the growth inhibition of cucumber powdery mildew, S. fuliginea, and tomato gray mold, B. cinerea (Zhang et al., 2013).

Mycosubtilin is a nonribosomal cyclic lipopeptide. Surfactin and mycosubtilin were purified from B. subtilis strains BBG131 and BBG125, respectively. Surfactin was produced through an integrated process in a bubbleless membrane bioreactor, while mycosubtilin was produced using an overflowing fed-batch process. Subsequent steps of ultrafiltration, diafiltration, evaporation, and freeze-drying were required for purification (Deravel et al., 2014). Surfactin and mycosubtilin combined were reported to have dose-dependent synergistic antibiotic effects on lettuce leaves sprayed with the peptide solutions prior to inoculation with its fungal obligate pathogen Bremia lactucae (Deravel et al., 2014; **Table 2**). These data show that combining several AMPs opens even greater possibilities in terms of biocontrol strategies, yet it also poses challenges for designing experiments that assess various AMP combinations and for engineering plants resistant to microbial pathogens.

### Fusaricidins and LI-F Lipopeptides from Paenibacillus polymyxa Bacterium

Fusaricidins and closely related LI-F lipopeptides are synthesized by Paenibacillus polymyxa and display inhibitory activities against plant pathogens (**Table 2**). These nonribosomal peptides consist of a guanidinylated β-hydroxy FA linked to a cyclic hexapeptide including four D-configured AA residues (Debois et al., 2013). Growth inhibition effects were also observed against a variety of microbes: Rhodotorula aurantica (yeast), F. oxysporum, B. cinerea, Cladosporium cucumerinum (fungi), Pythium aphanidermatum (oomycete), and P. syringae (Gram-negative bacterium). These AMPs were identified using a matrix-assisted laser desorption ionization (MALDI) imaging approach, with a clever experimental design involving the insertion of a sterilized MALDI glass slide coated with indium tin oxide at the bottom of the Petri dish covered with a gelified sterile nutrient medium (Debois et al., 2013). P. polymyxa was streaked over the glass slide, F. oxysporum was inoculated near the slide, and the plate was incubated for 11 days. The glass slide was then removed from the Petri dish and completely dried under vacuum prior to MALDI imaging analysis. The authors were able to obtain spectra and fragmentation patterns of the compounds released by strain Pp56 and responsible for the inhibition of F. oxysporum mycelial development (Debois et al., 2013). The antagonistic interaction between fusaricidins and LI-F lipopeptides and F. oxysporum could thus be visualized by acquiring MS spectra of the different antibiotic compounds exhibiting distinct localizations along the slide. A time-course analysis revealed the early secretion of fusaricidin B, a mixture of LI-F05b/06b/08a, and LI-F08b, away from F. oxysporum hyphae, thus suggesting that their production was not triggered by the presence of the fungus. These antibiotics would be readily secreted and would accumulate in toxic amounts outside P. polymyxa cells to deter any potential pathogen attack. Indeed, their distribution patterns visualized by MALDI imaging coincided with the F. oxysporum hyphal inhibition zone observed on the culture plates (Debois et al., 2013).

### 2-Amino-3-(Oxirane-2,3-dicarboxamido)-Propanoyl-Valine (APV) from Pantoea agglomerans Bacterium

APV was HPLC purified from a polar extract of liquid broth supernatant, analyzed using ESI-MS/MS and NMR, and identified as the main antibiotic compound of Pantoea agglomerans strain 48b/90 (Pa48b; Sammer et al., 2009). Pa48b is a Gram-negative bacterium that was isolated from soybean leaves which showed limited fire blight disease symptoms caused by the bacterium Erwinia amylovora. Using plate assays, the inhibitory effect of APV was tested against various microbial phytopathogen species, including Agrobacterium tumefaciens, E. amylovora, several P. syringae pathovars, Serratia marcescens, and B. subtilis (**Table 2**). APV successfully inhibited the growth of pathogens on the minimal synthetic medium in a dose-dependent manner, but not on the complex medium, likely due to the presence in the latter of N-acetylglucosamine, which would compensate for the inhibitory effects of APV (Sammer et al., 2009). This result suggests that the antagonistic effect of APV and N-acetylglucosamine on phytopathogen colonization depends on nutrient availability. In a follow-up study, the APV biosynthesis gene cluster was analyzed and located on a megaplasmid in Pa48b (Sammer et al., 2012). In this cluster, two genes are likely to be involved in the regulation of APV biosynthesis, whose transcription is tightly coordinated with translation to avoid precursor cytotoxicity. In silico sequence analysis of the APV gene cluster revealed a 99% identity to the diaminopropionate-peptide biosynthesis cluster of P. agglomerans strain CU0119. One of the genes of this cluster, DdaI, is predicted to encode a transmembrane efflux pump mediating self-resistance to antibiotics (Sammer et al., 2012). In planta assays were then undertaken in which soybean leaves were first inoculated with P. syringae pv. glycinea prior to applying an APV solution onto the infected wounds. Consistent with the previous study, the inhibitory effect of APV was dose-dependent, yet it led to only a minor decrease in disease symptoms. Therefore, APV was not confirmed to be the key antibiotic factor in the antagonism (Sammer et al., 2012).

### Trichokonins (TK) from Trichoderma pseudokoningii Fungus

The trichokonin (TK) family is composed of three major peptaibols, TKs VI, VII and VIII, produced by the ascomycete fungus Trichoderma pseudokoningii strain SMF2. Peptaibols are characterized by the presence of an unusual AA, α-aminoisobutyric acid, a C-terminal-hydroxylated AA, and an N-terminal-acetylated AA. Because of their linear and amphipathic nature, peptaibols can form voltage-dependent ion channels in lipid bilayer membranes. TKs were obtained from solid state fermentation of T. pseudokoningii strain SMF2, followed by gel filtration of the crude extract and preparative HPLC separation. In vitro plate assays showed that TK VI exhibited antimicrobial activities against various pathogenic fungi and oomycetes: Ascochyta citrullina, B. cinerea, F. oxysporum, Phytophthora parasitica, and V. dahliae (Shi et al., 2012; **Table 2**). Toxicity assays using F. oxysporum protoplasts treated with TK VI showed the appearance of ROS and fragmentation of nuclear DNA, along with a change in fungal membrane permeability and disintegration of subcellular structures. The antimicrobial efficacy of TK was also tested on Chinese cabbage (Brassica rapa) leaves inoculated with the bacterial pathogen Pectobacterium carotovorum subsp. carotovorum (Li et al., 2014). As observed by Shi et al. (2012), TK treatment led to an increase in the production of ROS, along with increased activities of pathogenesis-related proteins and the activation of the SA signaling pathway in the cabbage host. These data show that the TK family can induce cell death of pathogenic fungi and oomycetes as well as activate plant defense pathways.

### Animal Ribosomal AMPs Validated in Transgenic Plants

Unlike nonribosomal AMPs, ribosomal AMPs offer the great advantage of being encoded by genes that can be manipulated and inserted into a plant of interest. Only a few ribosomal AMPs have undergone functional validation in planta for pest resistance, and they all originate from animal species; they are illustrated in this section.

### Penaeidin 4-1 from Shrimp

The antimicrobial peptide Penaeidin 4-1 (Pen4-1) was isolated from Atlantic white shrimp (Litopenaeus setiferus) under pathogen challenge. Pen4-1 is composed of 47 AAs including six cysteine residues forming three disulfide bridges (Cuthbertson et al., 2004; **Table 1**). Pen4-1 was purified using affinity chromatography as follows: pooled L. setiferus haemocyte extracts were concentrated using SPE and applied to an affinity resin containing a Pen4-1-specific antibody. Pen4-1 can inhibit multiple plant pathogenic fungal species, such as B. cinerea, Penicillium crustosum, and F. oxysporum. Penaeidins harbor a unique two-domain structure, a proline-rich N-terminal domain (PRD) and a cysteine-rich domain (CRD) with a stable alpha-helical structure, which might have contributed to their broad range of microbial targets, primarily Gram-positive bacteria and fungi. Transgenic lines of a commercial creeping bentgrass (Agrostis stolonifera) displayed enhanced resistance to the fungal pathogens Sclerotinia homoeocarpa and R. solani as a result of Pen4-1 ectopic expression (Zhou et al., 2011; **Table 3**). Testing Pen4-1 in other plant hosts of S. homoeocarpa and R. solani would confirm its potential as a biocontrol agent.

### Metchnikowin (mtk), Thanatin, and Cecropin A from Insects

Metchnikowin (mtk) is an immune-inducible peptide synthesized in the fat body of Drosophila melanogaster as a 52-AA pre-pro-peptide upon microbial challenge (Levashina et al., 1995; **Table 3**). In vitro assays demonstrated that mtk inhibits the growth of drosophila pathogens, the Gram-positive bacterium Micrococcus luteus and the ascomycete fungus Neurospora crassa. Edman sequencing showed the mature mtk to be a proline-rich 26-AA peptide. MS analyses revealed two mtk isoforms of 3025 and 3045 Da, respectively, due to an AA substitution (Rahnamaeian et al., 2009; **Table 3**). Low concentrations of mtk inhibited the in vitro growth of the pathogenic fungi Fusarium graminearum and Fusarium culmorum. Transgenic barley plants were produced that expressed the D. melanogaster mtk gene in its 52-AA pre-pro-peptide form under the control of the inducible mannopine synthase (mas) gene promoter. The mas promoter was induced by wounding, plant growth hormones, and fungal infection. Mtk was successfully processed into its mature form and targeted to the apoplastic compartment in transgenic plants. When inoculated with F. graminearum, transgenic plants displayed higher frequencies of typical defense responses such as HR of attacked cells and callose deposition at the cell wall underneath attempted penetration sites. This enhanced plant innate immune response was substantiated by the up-regulation of the Pathogenesis-Related genes PR-1 and PR-5 (Rahnamaeian et al., 2009). The activation of systemic acquired resistance (SAR) was confirmed at the transcript level in a more recent follow-up study in which the mtk barley transgenic plants were infected with B. graminis f. sp. hordei and subjected to RT-PCR analyses (Rahnamaeian and Vilcinskas, 2012; **Table 3**). When the transgenic barley plants were infected with F. graminearum, mtk treatment impeded the development of functional haustoria, whose formation is crucial for the commencement of the biotrophic interaction.

Thanatin is produced by the stinkbug Podisus maculiventris; it consists of 21 AA residues (2.4 kDa) which form an internal disulphide bond important for its antimicrobial activity (Imamura et al., 2010; **Table 3**). A recombinant thanatin gene under the constitutive control of the cauliflower mosaic virus 35S (CaMV35S) gene promoter was introduced into rice plants, which were then exposed to blast disease (Magnaporthe oryzae). Thanatin was extracted from crude extracts of the transformants and purified using 2-D HPLC (cation exchange chromatography followed by RP chromatography), and its identity was confirmed by MS. Although blast disease symptoms were observed on both the transgenic lines and the WT plants, diseased areas on the transformants were significantly smaller than those on the WT plants. While transgenic rice plants were not fully protected against M. oryzae, they had acquired partial resistance (Imamura et al., 2010). Very recently, the thanatin gene was introduced into maize under the control of the ubiquitin-1 promoter, with the expressed AMP targeted to the plant apoplastic space (Schubert et al., 2015; **Table 3**). Transgenic maize plants were grown until ears were fully developed, and the mature ears were then exposed to the pathogenic fungus Aspergillus flavus. The expression of thanatin in maize transformants led to a significant reduction in the fungal biomass of A. flavus relative to that of WT plants (Schubert et al., 2015). The mode-of-action of thanatin in plants has yet to be fully understood.
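The quoted size of thanatin (21 AA residues, about 2.4 kDa, one internal disulphide bond) can be checked with a short calculation. The sketch below sums standard average residue masses, adds one water molecule for the free termini, and subtracts two hydrogens per disulphide bond. The sequence is the commonly cited mature thanatin sequence and is an assumption here, since it is not given in the text above; the function and variable names are illustrative only.

```python
# Standard average residue masses (Da) of the 20 proteinogenic amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153      # added once for the free N- and C-termini
HYDROGEN = 1.00794   # two hydrogens are lost per disulphide bond


def average_mass(sequence: str, disulfide_bonds: int = 0) -> float:
    """Average mass (Da) of a linear peptide with optional disulphide bonds."""
    residues = sum(AVG_RESIDUE_MASS[aa] for aa in sequence.upper())
    return residues + WATER - 2 * HYDROGEN * disulfide_bonds


# Assumption: commonly cited mature thanatin sequence (not from the text above).
THANATIN = "GSKKPVPIIYCNRRTGKCQRM"

mass_oxidized = average_mass(THANATIN, disulfide_bonds=1)
print(f"{len(THANATIN)} residues, {mass_oxidized / 1000:.1f} kDa")
```

With the assumed sequence, the computed mass is consistent with the 2.4 kDa quoted above. The same function can be applied to any peptide with a known sequence, for example to compare a measured MS mass with the mass expected for a fully oxidized or fully reduced form.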

Cecropin AMPs were first isolated from the haemolymph of the moth Hyalophora cecropia; cecropin A (37 AAs, 4 kDa; **Table 3**) exhibits a rapid, potent and long-lasting lytic activity against prominent bacterial and fungal phytopathogens (Bundó et al., 2014). Transgenic rice plants were obtained in which cecropin A expression was targeted to the seed endosperm by using the tissue-specific promoters Glutelin B1 or Glutelin B4. Seeds from the transgenic rice plants exhibited resistance to infection by fungal (Fusarium verticillioides) and bacterial (Dickeya dadantii) pathogens (Bundó et al., 2014). The mode-of-action of cecropin A in rice seeds remains to be investigated.

### Magainin-2 (mag) and Esculentin-1 (Esc28L) from Frogs

Magainins are ribosomal AMPs produced in the skin of Xenopus frogs. Growth assays in the presence of the 23-AA magainin-2 showed that the peptide inhibited the growth of various Gram-positive and Gram-negative bacteria, a few fungi, and protozoa (Zasloff, 1987). The mag gene encoding the MAGAININ-2 protein was ectopically over-expressed in pearl millet (Pennisetum glaucum) under the control of the constitutive CaMV 35S gene promoter, and the plants were challenged with three strains of downy mildew (Sclerospora graminicola viz. Sg 384, Sg 445, and Sg 492; Ramadevi et al., 2014; **Table 3**). While some pearl millet transgenic lines exhibited a slight decrease in disease incidence 7 days post inoculation relative to the controls, none demonstrated full resistance (Ramadevi et al., 2014). Statistical analyses, however, would have helped test the significance of these results. The authors attribute the minor response of their genetically modified plants to the complexity of the cell wall and cell membrane components of the oomycete pathogen, which are similar in composition and structure to those of host plant cells. Perhaps combining MAGAININ-2 with cell wall degrading enzymes, as tested by Fogliano et al. (2002), who combined SRE or SP25A peptides with endochitinase and/or glucanase, would prove an efficient strategy.
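As an illustration of the kind of statistical analysis suggested above, disease incidence in a transgenic line and in controls can be compared with a one-sided Fisher's exact test on a 2 × 2 table. The sketch below is a minimal pure-Python version; the plant counts are hypothetical and are not taken from Ramadevi et al. (2014), and the function name is illustrative.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null, of observing
    `a` or fewer in the top-left cell given the table margins.
    """
    row1 = a + b          # e.g. number of transgenic plants
    col1 = a + c          # e.g. number of diseased plants overall
    n = a + b + c + d
    total = comb(n, row1)
    lowest = max(0, row1 + col1 - n)  # smallest feasible top-left count
    p_value = 0.0
    for x in range(lowest, a + 1):
        p_value += comb(col1, x) * comb(n - col1, row1 - x) / total
    return p_value


# Hypothetical counts (NOT from the study): diseased / healthy plants.
transgenic_diseased, transgenic_healthy = 6, 14
control_diseased, control_healthy = 12, 8

p = fisher_one_sided(transgenic_diseased, transgenic_healthy,
                     control_diseased, control_healthy)
print(f"one-sided p-value: {p:.4f}")
```

A small p-value would indicate that the lower incidence in the transgenic line is unlikely to be due to chance alone; with several lines or pathogen strains, a correction for multiple testing would also be needed.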

Esculentins are highly potent AMPs of 46 AAs exclusively found in the skin secretion of the frog Rana esculenta. Esc28L is a variant of esculentin-1b artificially created by substituting the methionine residue at position 28 with a leucine and adding a methionine at position 47 (Ponti et al., 1999). The idea of using variant peptides to increase plant pathogen resistance while targeting them to the extracellular space was first explored by Ponti et al. (1999, 2003; **Table 3**). Extracellular targeting not only eliminates potential toxicity of the variant AMP toward host cells, but also permits direct contact with pathogens growing and multiplying in the extracellular space. In a follow-up study, Esc28L was fused to the Signal Peptide (SP) sequence of Phaseolus vulgaris EndopolyGalacturonase-Inhibiting Protein (PGIP) to target Esc28L to the secretory pathway. Transgenic tobacco (Nicotiana tabacum) lines were produced in which the constitutive expression of Esc28L conferred enhanced resistance to the bacterial pathogens of tobacco, P. syringae pv. tabaci and Pseudomonas aeruginosa, as well as to the fungal pathogen Phytophthora nicotianae, and moreover demonstrated an insecticidal effect against drosophila (Ponti et al., 2003).

### Cathelicidin LL-37 Variant (Met37Leu) and Archaic Wallaby AntiMicrobial (WAM) Peptide from Mammals

The human cathelicidin antimicrobial protein hCAP18 is synthesized in neutrophils as a preproprotein which comprises a conserved cathelin prodomain and a non-conserved C-terminal peptide. The latter is enzymatically cleaved after secretion, forming LL-37, a 37-AA functional antimicrobial peptide. Inspired by the work of Ponti et al. (2003), a mutated variant of LL-37 was created by substituting the methionine residue at position 37 with a leucine (LL-37 Met37Leu) and was fused to the SP sequence of PGIP to elicit secretion (Jung et al., 2012; **Table 3**). LL-37 Met37Leu was then overexpressed in Chinese cabbage plants. Transgenic lines were subsequently inoculated with various bacterial (P. carotovorum subsp. carotovorum) and fungal (F. oxysporum f. sp. lycopersici, Colletotrichum higginsianum, R. solani) pathogens. The transgenic plants displayed enhanced pathogen resistance with decreased disease symptoms relative to the controls (Jung et al., 2012). No mode-of-action was proposed in this study, but since LL-37 Met37Leu is targeted to the extracellular space of the transgenic plants, it could be assumed that a direct interaction occurs between the peptide and the pathogen.

Mammalian cathelicidins as a whole are worth investigating for plant engineering programs aiming at improving pathogen resistance. Fourteen, twelve, and eight divergent cathelicidin genes were identified in the wallaby, possum and platypus genomes, respectively. Of these, the proteins WAM1 and WAM2, and Platypus AntiMicrobial (PAM) 1 and PAM2, were tested against various bacterial and yeast pathogens, and shown to be much more potent than LL-37 (Wang et al., 2011). A phylogenetic approach was then used to design an archaic WAM predicted to have originated 59 million years ago and to be ancestral to the major clade of marsupial AMPs including the modern WAM1 and WAM2 (Wang et al., 2011). In theory, it should be more difficult for modern pathogens to overcome plant resistance mediated by an archaic AMP.

### CONCLUSIONS AND PERSPECTIVES

Secreted peptides with antimicrobial activities are proving useful as biocontrol agents in agriculture, increasing crop yields by minimizing quantity and quality losses due to pathogenic diseases. Many of the peptides discussed above have been used to create transgenic plants which showed increased resistance to pathogens in laboratory-based in vivo experiments, for example Pep1, systemin, mtk, and Met37Leu.

Recently, field trials of transgenic cotton plants expressing the tobacco (Nicotiana alata) peptide NaD1, which targets PIP2 in the fungal membrane in order to cause membrane permeabilization, showed increased resistance to fungal pathogens (Gaspar et al., 2014). This peptide was previously shown to have antifungal activity against a variety of filamentous fungi in vitro, and this translated well to in vivo field trials using transgenic cotton plants against F. oxysporum f. sp. vasinfectum and V. dahliae (Lay et al., 2003a; Gaspar et al., 2014). Among the peptides reviewed here, many have shown some form of inhibition of pathogenic fungi, bacteria and insects. In many cases, however, this inhibition was shown in vitro by addition of the peptide to the culture medium, or in vivo by spraying plant leaves or adding the peptide to cut petioles or to soil; all of these approaches are small scale, laboratory based, and mainly use model plants (i.e., non-crop species). For the majority of these peptides no work has been conducted to test the effect of priming plants on normal growth and development in crop species. In the case of the NaD1 transgenic cotton, no detrimental agronomic properties were observed in field trials, whereas transgenic potato overexpressing the DF2 defensin showed altered plant development (Stotz et al., 2009). It should be noted that priming could potentially negatively impact crop performance, as energy may be diverted from growth and development of the plant to the defense system. This was not the case for the NaD1 transgenic cotton, so only more field trials will be able to determine whether this is a genuine concern. Moreover, without incorporation of the peptide-encoding gene into the genome, as in the case of NaD1 transgenic cotton, crops would potentially have to be sprayed with the peptide of choice, as they currently are with fungicides and pesticides. The subsequent cost of production and application should be economically assessed in comparison with the chemical formulations currently used in fields, along with the impact on the environment.

The discovery of new valuable secreted peptides is associated with advances in MS technology, in particular in combination with HPLC separation, which enables high-throughput identification and structural characterization. The approach taken by Chen et al. (2014) for the identification of the CAPE1 peptide is an efficient way to use proteomics and MS to identify novel ribosomal and nonribosomal peptides. This technique allows a comparison between different stress conditions to investigate when peptides are induced, e.g., healthy vs. wounded or infected plants. Not only does this approach identify peptides, but it also gives sequence information for the peptides, which could help with subsequent identification of the proprotein. As data-rich as LC-MS experiments can be, they usually require time-consuming, labor-intensive extraction and separation methods. Furthermore, LC-MS does not typically provide information on the localization of such compounds. MALDI imaging technology therefore offers a promising alternative, as it allows not only the identification of AMPs but also their in situ tissue localization. MALDI imaging has successfully been applied in vitro (Debois et al., 2013). Coupled with traditional histology, MS imaging reports the cellular localization, at a resolution down to 5 µm, not only of proteins and peptides such as AMPs but also of metabolites such as lipids and sugars, in a multiplex fashion within the very same tissue section (Aichler and Walch, 2015). Such MS imaging methods would be greatly advantageous if directly applied to diseased plant organs, as compounds co-localizing with visible symptoms would be potential targets for the discovery of novel AMPs.

A crucial step in validating the efficacy of AMPs toward plant disease resistance involves introducing such AMPs into the crop of interest and exposing the transformed crop to its pathogens. To our knowledge, such transgenic experiments have only been attempted with ribosomal or artificially designed AMPs. This arises from the fact that transgenic plants are produced by introducing a foreign piece of DNA which then goes through the transcription and translation machinery to synthesize the AMP. Creating transgenic plants by introducing nonribosomal AMPs, which by definition do not undergo ribosomal synthesis, remains a complete challenge. Perhaps an alternative would be to introduce the enzymes responsible for the synthesis of these nonribosomal AMPs, the so-called NRPS, into the crop in order to acquire disease resistance. Using a synthetic biology approach it has been shown that NRPSs can be genetically engineered to improve the antibacterial properties of a lipopeptide and expand its spectrum against human pathogens (Nguyen et al., 2010). We could not find any reports in the literature related to plant resistance against pathogens. However, the link between NRPSs from biocontrol agents and host crops was recently established between the mycoparasite and facultative root symbiont Trichoderma virens and maize plants (Mukherjee et al., 2012). The analysis of the loss-of-function mutants of T. virens revealed that a hybrid enzyme polyketide synthase (PKS)/NRPS, Tex13, was involved in up-regulation of the defense gene phenylalanine ammonia-lyase (pal) in maize; Tex13 was more than 40-fold induced during interactions of T. virens with maize roots (Mukherjee et al., 2012).

In the evolutionary arms race between plant hosts and pathogens, the pathogen usually wins because its shorter generation time makes it more dynamic than crops, allowing it to rapidly overcome plant immunity. Therefore, any tactic that would give the plant host the upper hand would be of immense interest to agricultural programs. One strategy could exploit the fact that different antibiotic peptides can act in synergy; introducing several AMPs in crops would thus broaden the spectrum of plant disease resistance. Along with the AMPs, enzymes targeting the microbial cell wall could also be introduced into the crop of interest to facilitate the entry of secreted AMPs across pathogen physical barriers. Another strategy would be to explore ancient and extinct peptides, as archaic AMPs might be more effective than the modern AMPs found in living creatures because microbial pathogens have not been exposed to them for millions of years. The modern pathogen would therefore not have developed resistance against the archaic peptide. Engineering crops expressing such archaic AMPs would not only help achieve disease resistance but also slow down the rate at which that resistance is overcome by the targeted pathogens (Wang et al., 2011). A last promising strategy is to use de novo-designed synthetic peptides. This is outside the scope of this review, which focuses on naturally occurring AMPs, yet special mention should be made of two recent reports in which crops transformed with genes coding for synthetic peptides displayed acquired pathogen resistance (Nadal et al., 2012; Zeitler et al., 2013).

### FUNDING

SB is funded by EU grant DP120103558 and PSS is funded by Australian Research Council grant FT110100698.

### ACKNOWLEDGMENTS

The authors would like to thank Prof. Ben Cocks for fruitful feedback.

### REFERENCES


radish (Raphanus sativus L.) reveals two adjacent sites important for antifungal activity. J. Biol. Chem. 272, 1171–1179. doi: 10.1074/jbc.272.2.1171


tomato following infection by pathogens. Plant Pathol. 63, 1110–1118. doi: 10.1111/ppa.12190


oxysporum f. sp. spinaciae. J. Basic Microbiol. 54, 448–456. doi: 10.1002/jobm.201200414


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Breen, Solomon, Bedon and Vincent. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Large Family of *AvrLm6*-like Genes in the Apple and Pear Scab Pathogens, *Venturia inaequalis* and *Venturia pirina*

*Jason Shiller<sup>1</sup>, Angela P. Van de Wouw<sup>2</sup>, Adam P. Taranto<sup>1,3</sup>, Joanna K. Bowen<sup>4</sup>, David Dubois<sup>2</sup>, Andrew Robinson<sup>1,5</sup>, Cecilia H. Deng<sup>4</sup> and Kim M. Plummer<sup>1</sup>\**

*<sup>1</sup> Animal, Plant and Soil Sciences Department, AgriBio, AgriBiosciences Research Centre, La Trobe University, Melbourne, VIC, Australia, <sup>2</sup> School of BioSciences, University of Melbourne, Parkville, VIC, Australia, <sup>3</sup> Plant Sciences Division, Research School of Biology, The Australian National University, Canberra, ACT, Australia, <sup>4</sup> The New Zealand Institute for Plant and Food Research Limited, Auckland, New Zealand, <sup>5</sup> Life Sciences Computation Centre, Victorian Life Sciences Computation Initiative, Melbourne, VIC, Australia*

#### *Edited by:*

*Stéphane Hacquard, Max Planck Institute for Plant Breeding Research, Germany*

### *Reviewed by:*

*David L. Joly, Université de Moncton, Canada; Isabelle Fudal, Institut National de la Recherche Agronomique, France*

> *\*Correspondence: Kim M. Plummer k.plummer@latrobe.edu.au*

#### *Specialty section:*

*This article was submitted to Plant Biotic Interactions, a section of the journal Frontiers in Plant Science*

*Received: 14 August 2015; Accepted: 26 October 2015; Published: 17 November 2015*

#### *Citation:*

*Shiller J, Van de Wouw AP, Taranto AP, Bowen JK, Dubois D, Robinson A, Deng CH and Plummer KM (2015) A Large Family of AvrLm6-like Genes in the Apple and Pear Scab Pathogens, Venturia inaequalis and Venturia pirina. Front. Plant Sci. 6:980. doi: 10.3389/fpls.2015.00980*

*Venturia inaequalis* and *V. pirina* are Dothideomycete fungi that cause apple scab and pear scab disease, respectively. Whole genome sequencing of *V. inaequalis* and *V. pirina* isolates has revealed predicted proteins with sequence similarity to AvrLm6, a *Leptosphaeria maculans* effector that triggers a resistance response in *Brassica napus* and *B. juncea* carrying the resistance gene *Rlm6*. *AvrLm6*-like genes are present as large families (>15 members) in all sequenced strains of *V. inaequalis* and *V. pirina,* while in *L. maculans,* only *AvrLm6* and a single paralog have been identified. The *Venturia AvrLm6*-like genes are located in gene-poor regions of the genomes, and mostly in close proximity to transposable elements, which may explain the expansion of these gene families. An *AvrLm6*-like gene from *V. inaequalis* with the highest sequence identity to *AvrLm6* was unable to trigger a resistance response in *Rlm6*-carrying *B. juncea*. RNA-seq and qRT-PCR gene expression analyses of *in planta*- and *in vitro*-grown *V. inaequalis* have revealed that many of the *AvrLm6*-like genes are expressed during infection. An *AvrLm6* homolog from *V. inaequalis* that is up-regulated during infection was shown (using an eYFP-fusion protein construct) to be localized to the sub-cuticular stroma during biotrophic infection of apple hypocotyls.

#### Keywords: effector, Dothideomycete, gene families, RIP, WGS

### INTRODUCTION

Effectors are generally small proteins secreted by plant pathogens that can interact with the host during infection. They may serve to facilitate infection, often by manipulating the host, or may be recognized by host receptor proteins, either directly or indirectly, leading to a resistance response (Chisholm et al., 2006; Jones and Dangl, 2006; Dodds and Rathjen, 2010). The outcome of the plant–pathogen interaction depends on the evolutionary context; the pathogen is under selection pressure to evade detection by the host, while the host must maintain the ability to detect the pathogen. These gene-for-gene relationships are at play in the host specificity of *Venturia inaequalis* races to different *Malus* cultivars, where so far 17 gene-for-gene relationships have been described (Bus et al., 2011). While two resistance (*R*) genes governing race-cultivar specificity have been identified in *Malus* (Belfanti et al., 2004; Schouten et al., 2013) none of the cognate fungal avirulence genes have been isolated. Genomic (BioProject ID PRJNA261633 for *V. inaequalis* and JEMP01000000 for the *V. pirina* genome) and transcriptomic (Thakur et al., 2013) sequences are now available for several isolates of *V. inaequalis* and a *V. pirina* isolate, providing a promising new avenue of bioinformatic discovery to identify effectors; however, identifying specific race-governing effectors of *V. pirina* and *V. inaequalis* among these large predicted secretomes remains a challenge. Most effectors identified to date, in other fungi, are small, secreted proteins (SSPs) with high cysteine content (Stergiopoulos and de Wit, 2009) and *in silico* prediction of effector candidates has been informed by these attributes (Saunders et al., 2012; Guyon et al., 2014; Sperschneider et al., 2015). 
Effectors are often race- or species-specific (Stergiopoulos and de Wit, 2009); however, some effectors are more widely conserved, even across genera, as shown with the *Cladosporium fulvum* effector *Avr4* and its homolog in *Mycosphaerella fijiensis,* both of which induce *R*-gene (Cf-4)-mediated resistance responses on tomato (Stergiopoulos et al., 2010). More recently, an *Avr4* homolog from *Dothistroma septosporum* was also shown to trigger a hypersensitive response (HR) when transiently expressed in Cf-4 transgenic *Nicotiana tabacum* (Mesarich et al., 2015).

AvrLm6 is a secreted, proteinaceous effector of *Leptosphaeria maculans,* the causal agent of canola black leg disease, that has been shown to trigger a resistance response on host plants carrying the *Rlm6* resistance gene (Fudal et al., 2007). *AvrLm6* was first identified genetically, through the segregation of differential resistance responses of *Brassica napus* and *B. juncea* cultivars to the progeny of *B. juncea*-virulent and -avirulent *L. maculans* isolates (Balesdent et al., 2002). Subsequently, the gene encoding AvrLm6 was identified using map-based cloning. Southern blot analyses led to the conclusion that *AvrLm6* homologs were restricted to *L. maculans* (Fudal et al., 2007). Since then, *AvrLm6* homologs have been detected in the whole genome sequences of additional *Leptosphaeria* species. It is likely that these were not identified in the Southern blot analysis by Fudal et al. (2007) due to the low nucleotide sequence identity. Homologs are also found in more distantly related fungi, such as *Colletotrichum* and *Fusarium* species (Grandaubert et al., 2014), as well as in *Venturia* species (Bowen et al., 2011). Nothing is known about the function of AvrLm6 or orthologous proteins in any of these fungi. However, it has been demonstrated that two other *L. maculans* avirulence genes, *AvrLm4-7* (Huang et al., 2006) and *AvrLm1* (Huang et al., 2010), contribute to fungal fitness. Expression of *AvrLm6* is highly upregulated in *L. maculans* during primary infection of *B. napus,* compared with *in vitro* growth (Fudal et al., 2007; Van de Wouw et al., 2010). Expression levels were highest at 7 days post inoculation on canola and then, after falling slightly, were maintained at high levels during stem necrosis. Similar expression patterns are seen in the other *L. maculans* effectors *AvrLm1, AvrLm4-7,* and *AvrLm11* (Gout et al., 2006; Parlange et al., 2009; Balesdent et al., 2013). 
*AvrLm6* has been observed at high frequency among *L. maculans* populations in Europe, in the absence of *Rlm6* resistance (Balesdent et al., 2006; Stachowiak et al., 2006), but when selection pressure from *Rlm6* resistance was applied in experimental trials, the resistance was overcome after just 3 years (Brun et al., 2000). *AvrLm1* (Gout et al., 2007) and *AvrLm4-7* (Daverdin et al., 2012) alleles have also been reported at high levels in the absence of selection pressure from the cognate resistance genes in the field, but, as seen with *AvrLm6,* virulent alleles increase rapidly once resistance genes are introduced, despite the fitness costs to the pathogens (Huang et al., 2006, 2010).

A diverse array of mechanisms has been shown to generate new virulent *AvrLm6* alleles, including deletion, point mutation, and repeat-induced point mutation (RIP; Fudal et al., 2009; Van de Wouw et al., 2010). RIP is a type of mutation that occurs only in fungi, during sexual crossing; it alters duplicated DNA sequences, producing CpA to TpA and TpG to TpA mutations (Cambareri et al., 1989). RIP can also affect sequences neighboring duplicated DNA, as is the case with *AvrLm6* in *L. maculans,* where RIP has "leaked" from nearby repetitive DNA sequences into the *AvrLm6* coding sequence (Van de Wouw et al., 2010; Rouxel et al., 2011). *AvrLm6* inhabits a gene-poor, AT-rich region of the *L. maculans* genome and is surrounded by large numbers of repetitive sequences comprising predominantly long terminal repeat (LTR) retrotransposons (Pholy, Olly, Polly, and Rolly), which appear to have been RIP-affected (Gout et al., 2006; Fudal et al., 2007).

Research into the *AvrLm6*-like genes from *V. inaequalis* (*ALVi*s) and *V. pirina* (*ALVp*s) was conducted to determine a possible role for these SSPs in scab disease development. The predicted secretomes and WGS of a single *V. pirina* strain and four *V. inaequalis* strains were mined for sequences with similarity to *AvrLm6*, and their surrounding genomic contexts were examined to further understand the evolution and nature of the gene family expansion. The large size of the *AvrLm6*-like gene families and their probable functional redundancy hindered the use of gene disruption or silencing to determine function. The timing and location of *AvrLm6*-like gene expression were therefore investigated to further understand the role of these proteins in *Venturia*. An *L. maculans* isolate lacking the avirulent *AvrLm6* allele was also transformed with a *V. inaequalis ALVi* gene to determine whether the phenotype of avirulence on the *Rlm6*-containing *B. napus* could be restored (i.e., by complementation with the heterologous gene), which would indicate conservation of function.

### MATERIALS AND METHODS

### *Venturia* Isolates and Genomic Resources used in This Work

The genomic resources used in this study are detailed in **Table 1**. The *V. inaequalis* Vi1 genome was used to investigate the genomic environment of the *ALVi* gene family because it is the most completely annotated genome, and because we have transcriptome data for various time points and growth conditions [BioProject (ID PRJNA261633): *in vitro*: SRR1586226, 2 dpi *in planta*: SRR1586224, 7 dpi *in planta*: SRR1586223J], which have been used to inform gene prediction.

TABLE 1 | Genome sequence summary statistics for each *Venturia* isolate included in this analysis.

The *V. inaequalis* Vi1 and the *V. pirina* genomes (Cooke et al., 2014) are publicly available via the MycoCosm genome portal at JGI<sup>1,2</sup>. All *V. inaequalis* and *V. pirina* isolates used in this study have been reported previously (Stehmann et al., 2001; Le Cam et al., 2002; Win et al., 2003; Bowen et al., 2009; Broggini et al., 2011; Cooke et al., 2014; Caffier et al., 2015). The rest of the genomes were only used to extract the ALVi predicted protein sequences; these are included in the Supplementary Material.

### Identification of AvrLm6 Homologs in Public Sequence Databases

The NCBI (National Center for Biotechnology Information<sup>3</sup>) non-redundant (nr) public sequence database was queried with the AvrLm6 (GenBank: CAJ90695) predicted protein sequence using PSI-Blast with default settings and an *e*-value cut-off of 1E-5. Predicted proteins returned from the PSI-Blast search were collected and used in blastp analysis of protein databases at JGI (Joint Genome Institute<sup>4</sup>) and the Fungal Genomes proteins available at the Broad Institute<sup>5</sup>, also using a 1E-5 *e*-value cut-off. To identify homologs that had not been predicted as genes, tblastn searches were performed with the same stringency as above.

### Identifying *ALVp* and *ALVi* Gene Families

Two criteria were used to decide which genes to include in the *ALVi* and *ALVp* gene families. Firstly, standalone PSI-Blast (BLAST+ 2.2.29) was used with the AvrLm6 (GenBank: CAJ90695) protein sequence as the query, with an *e*-value cut-off of 1E-5, against the protein catalogs of each *Venturia* isolate. Secondly, the predicted proteins that met this first criterion were used in back-blast searches of the NCBI (National Center for Biotechnology Information<sup>3</sup>) nr public sequence database. If the most similar protein returned was AvrLm6 or one of its homologs from another fungal species (defined above), with an *e*-value less than or equal to 1E-2, the sequence was considered an ALVi or ALVp. All ALVp and ALVi amino acid sequences are included in the Supplementary Material (Data Sheet 1).
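The two inclusion criteria can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the scripts used in the study: the function and class names are hypothetical, the "most similar protein" is assumed to be the back-blast hit with the lowest *e*-value, and the set of known homologs would have to be supplied by the user.

```python
from dataclasses import dataclass

FORWARD_EVALUE_CUTOFF = 1e-5  # PSI-Blast of AvrLm6 against a Venturia protein catalog
BACK_EVALUE_CUTOFF = 1e-2     # back-blast of a candidate against NCBI nr


@dataclass
class BackHit:
    """One back-blast hit: subject accession and its e-value (hypothetical container)."""
    subject_id: str
    evalue: float


def is_family_member(forward_evalue, back_hits, known_homologs):
    """Return True when a candidate passes both criteria described in the text."""
    # Criterion 1: the candidate was found by AvrLm6 at or below the forward cut-off.
    if forward_evalue > FORWARD_EVALUE_CUTOFF:
        return False
    if not back_hits:
        return False
    # Criterion 2: the best back-blast hit (assumed: lowest e-value) must be
    # AvrLm6 or one of its known homologs, at or below the back-blast cut-off.
    best = min(back_hits, key=lambda hit: hit.evalue)
    return best.evalue <= BACK_EVALUE_CUTOFF and best.subject_id in known_homologs


if __name__ == "__main__":
    # Accessions taken from the text; the e-values are made up for illustration.
    homologs = {"CAJ90695", "CCF39162.1", "XP_003843096.1"}
    hits = [BackHit("CCF39162.1", 1e-4), BackHit("unrelated_protein", 0.5)]
    print(is_family_member(1e-8, hits, homologs))
```

Keeping the two cut-offs as named constants makes it easy to see how relaxing the back-blast threshold would change family membership.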

### Phylogenetic Analysis

Predicted amino acid sequences of all *ALVi* and *ALVp* genes from *V. inaequalis* isolate Vi1 and *V. pirina* isolate 11032 were aligned using the MUSCLE multiple sequence alignment implemented in the MEGA6 program (Tamura et al., 2013), and a maximum likelihood tree was derived from the alignment using the WAG substitution model, tested with 500 bootstrap replicates. Another tree was constructed in an identical manner with the inclusion of amino acid sequences from additional species: AvrLm6 (GenBank: CAJ90695) and other homologs from *L. maculans* (GenBank: XP\_003843096.1), *Colletotrichum gloeosporioides* (GenBank: XP\_007281214.1 and XP\_007275717.1), *C. higginsianum* (GenBank: CCF39162.1), *C. orbiculare* (GenBank: ENH83850.1 and ENH87246.1) and *Fusarium oxysporum* (GenBank: EXK76127.1), using a WAG substitution model and 1000 bootstrap tests. The trees were drawn with FigTree v1.4.2.

### Quantification of Gene Expression of Selected *ALVis* in *V. inaequalis* Isolate Vi1

Expression analysis of eight *ALVi* genes from Vi1 (*ALVi\_Vi1\_4*, *ALVi\_Vi1\_5*, *ALVi\_Vi1\_7*, *ALVi\_Vi1\_9*, *ALVi\_Vi1\_14*, *ALVi\_Vi1\_15*, *ALVi\_Vi1\_17* and *ALVi\_Vi1\_22*) was carried out by qRT-PCR on a LightCycler® 480 instrument (Roche) using LightCycler® 480 SYBR Green I Master reagents (Roche). Each 10 μl reaction contained 5 μl of mastermix (2X), 0.5 μl of each primer (at 5 μM) and 4 μl of cDNA (see below). The PCR cycling conditions were as follows: initial denaturation for 5 min at 95°C, followed by 45 cycles of 95°C for 10 s, 60°C for 10 s, and 72°C for 8 s. Primers used in this experiment are detailed in Supplementary Table S2. β-tubulin and the 60S ribosomal L12 gene were used as reference genes as previously described (Kucheryava et al., 2008). These *ALVi*s were chosen for qRT-PCR analysis based on preliminary RNA-seq data which showed that they were up-regulated during infection when compared with growth *in vitro*. Three biological replicates were included for each time point, and each biological replicate was run in technical triplicate.

<sup>1</sup>http://genome.jgi.doe.gov/Venin1/Venin1.home.html

<sup>2</sup>http://genome.jgi.doe.gov/Venpi1/Venpi1.home.html

<sup>3</sup>http://www.ncbi.nlm.nih.gov/

<sup>4</sup>http://www.jgi.doe.gov/

<sup>5</sup>http://www.broad.mit.edu/

RNA used in qRT-PCR experiments was extracted from Vi1 cultures growing on cellophane amended potato dextrose agar (PDA) plates 10 days post inoculation to test *in vitro* gene expression and from detached leaf infections at 3, 7, and 14 days post inoculation (dpi). These infections were performed as previously described (Win et al., 2003). RNA was extracted from all samples following a published protocol (Chang et al., 1993).

To grow apple leaves for the detached leaf assays, apple seeds were sourced from open pollinated *Malus* × *domestica* 'Royal Gala.' Surface-sterilized seeds were stratified at 4°C in water-saturated vermiculite for 6–12 weeks to enable germination. Germinated seeds were planted in compost at 21°C with a 12 h light period/day under 4C W SHP-TS lights (Sylvania, Danvers, MA, USA). Inoculations were performed between 4 and 6 weeks after planting in compost.
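The qRT-PCR set-up described in this section lends itself to a quick arithmetic check. The sketch below verifies the stated reaction volume, derives the final primer concentration, and shows one common way to normalise a target gene to two reference genes. The normalisation shown (2<sup>−ΔCt</sup> against the mean reference Ct, assuming roughly 100% amplification efficiency) is an assumption for illustration only; the text states only that the reference genes were used as described by Kucheryava et al. (2008). All function names and Ct values are hypothetical.

```python
from statistics import mean

# Reaction composition as stated in the text (volumes in microlitres).
REACTION_UL = {
    "master_mix_2x": 5.0,
    "forward_primer": 0.5,
    "reverse_primer": 0.5,
    "cdna": 4.0,
}
PRIMER_STOCK_UM = 5.0  # primer stock concentration, micromolar


def total_volume_ul(components):
    """Sum the component volumes of one reaction."""
    return sum(components.values())


def final_primer_conc_um(primer_ul, stock_um, total_ul):
    """Final primer concentration after dilution into the reaction (C1*V1 = C2*V2)."""
    return primer_ul * stock_um / total_ul


def relative_expression(target_cts, reference_cts_by_gene):
    """2^-dCt of a target against the mean Ct of the reference genes.

    Assumption: ~100% amplification efficiency; technical replicates are
    averaged first. Averaging reference Cts corresponds to taking the
    geometric mean of the reference quantities.
    """
    target_ct = mean(target_cts)
    reference_ct = mean(mean(cts) for cts in reference_cts_by_gene.values())
    return 2.0 ** -(target_ct - reference_ct)


if __name__ == "__main__":
    total = total_volume_ul(REACTION_UL)
    print(total, final_primer_conc_um(0.5, PRIMER_STOCK_UM, total))
    # Made-up Ct values for one biological replicate (three technical replicates).
    print(relative_expression(
        [24.0, 24.0, 24.0],
        {"beta_tubulin": [20.0, 20.0, 20.0], "ribosomal_L12": [22.0, 22.0, 22.0]},
    ))
```

With the stated volumes, each primer ends up at 0.25 μM in the 10 μl reaction.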

### Localisation of an ALVi Protein during Infection

The *ALVi\_Vi1\_5* gene from *V. inaequalis* Vi1 was chosen for localisation analysis using yellow fluorescent protein (YFP) tagging. The enhanced yellow fluorescent protein (eYFP) gene had been adapted for *V. inaequalis* codon usage. The *ALVi\_Vi1\_5* gene was chosen because preliminary RNA-seq data and qRT-PCR analysis showed that its expression was up-regulated during infection, compared with growth *in vitro*. The PJK4 plasmid containing a fusion of ALVi\_Vi1\_5:eYFP (PJK4: ALVi\_Vi1\_5:eYFP) was used as a binary vector for *Agrobacterium*-mediated transformation of *V. inaequalis* isolate Vi1 (Fitzgerald et al., 2003). The expression cassette was constructed with overlap extension PCR (Heckman and Pease, 2007), joining the ALVi\_Vi1\_5 predicted promoter (1003 bp upstream of the start codon) and coding sequence in frame with the eYFP gene and the predicted ALVi\_Vi1\_5 terminator (1001 bp downstream of the stop codon), and cloned into PJK4 at the *Spe*I and *Bgl*II cloning sites. The predicted promoter and terminator contained no predicted open reading frames. Primers used in the construction are detailed in Supplementary Table S3. The sequence of the insert was validated by Sanger sequencing (Australian Genome Research Facility).

To observe localization of the fluorescent fusion protein *in planta*, apple hypocotyls were inoculated with 5 μl droplets of spore suspension (1 × 10<sup>5</sup> spores/ml) prepared from PJK4: ALVi\_Vi1\_5:eYFP transformants or the wild type, untransformed Vi1 isolate. Infected hypocotyls were incubated at 20°C in darkness for up to 14 days. PDA plates amended with cellophane membranes were also inoculated with the spore suspensions and stored under the same conditions. Infections were observed at 2, 7, and 14 dpi using a Leica TCS SP2 Confocal Microscope (Leica, Wetzlar, Germany) with the 100× magnification, oil immersion lens. Hypocotyls were grown from *Malus* × *domestica* 'Royal Gala' seed as described above for the detached leaf assay.
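As a quick check of the inoculum dose implied by the droplet volume and spore concentration given above (a worked calculation, not a figure reported by the authors), each 5 μl droplet of a 1 × 10<sup>5</sup> spores/ml suspension carries about 500 spores:

$$
5\ \mu\text{l} \times \frac{1 \times 10^{5}\ \text{spores}}{1000\ \mu\text{l}} = 500\ \text{spores per droplet}
$$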

### Complementation Assays using *ALVi\_Vi1\_8* Transformed into an *L. maculans* Virulent Isolate

A complementation assay was carried out to see if *L. maculans* expressing *ALVi\_Vi1\_8* driven by the *AvrLm6* promoter could confer avirulence toward the *Rlm6*-containing *B. juncea* cultivar 'Aurea', which is resistant to *L. maculans* isolates expressing the *AvrLm6* gene. *ALVi\_Vi1\_8* was chosen because it had the most similar predicted sequence to AvrLm6 based on the blastp analysis (32% identity). The complementation cassette contained *ALVi\_Vi1\_8* (477 bp) flanked by regions upstream (955 bp) and downstream (1255 bp) of *AvrLm6* amplified from *L. maculans* isolate v23.1.3. This fragment was constructed using overlap extension PCR (Heckman and Pease, 2007) and cloned into the binary vector pZP-Nat containing the *nourseothricin acetyltransferase* gene (Elliott and Howlett, 2006). *Agrobacterium*-mediated transformation (Gardiner and Howlett, 2004) was used to transform *L. maculans* isolate M1, which lacks the *AvrLm6* gene, with the construct. Pathogenicity tests were performed on cotyledons of *B. napus* 'Westar' (lacking *Rlm6*) and *B. juncea* 'Aurea' (carrying *Rlm6*) as previously described (Van de Wouw et al., 2014). Expression of the transgene during infection was confirmed by RT-PCR. Primers used in vector construction and RT-PCR are listed in Supplementary Table S4.

### Genomic Context, RIP, and Repeat Regions

The prediction suite REPET 2.2 (Flutre et al., 2011) was used for the detection and annotation of transposable elements (TEs). RipCrawl<sup>6</sup> was used to predict areas of the genome that had undergone RIP based on composite RIP indices (CRI), with the same constraints that have previously been reported (de Wit et al., 2012). Bedtools (Quinlan and Hall, 2010) was used to calculate distances from *ALVi* genes to features of interest. The same analysis was also performed on a set of 439 core eukaryotic reference genes identified using the NCBI eukaryotic clusters of orthologous groups (KOGs). The KOG numbers and the corresponding gene names in *V. inaequalis* isolate Vi1 are listed in Supplementary Table S5.
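The distance calculation from genes to their nearest feature can be illustrated with a small, dependency-free sketch. It is a simplified stand-in for the kind of nearest-feature query that Bedtools performs, not the authors' command line: the function names are hypothetical, coordinates are assumed to be BED-style half-open intervals, and the distance is defined here as the number of bases separating two features (0 when they overlap).

```python
from statistics import mean


def interval_distance(a, b):
    """Gap in bp between two (contig, start, end) half-open intervals.

    Returns 0 when the intervals overlap and None when they lie on
    different contigs.
    """
    if a[0] != b[0]:
        return None
    return max(0, max(a[1], b[1]) - min(a[2], b[2]))


def nearest_distance(query, features):
    """Distance from one query interval to the closest feature on its contig."""
    distances = []
    for feature in features:
        d = interval_distance(query, feature)
        if d is not None:
            distances.append(d)
    return min(distances) if distances else None


def mean_nearest_distance(queries, features):
    """Mean nearest-feature distance over all queries that have a feature on their contig."""
    distances = []
    for query in queries:
        d = nearest_distance(query, features)
        if d is not None:
            distances.append(d)
    return mean(distances) if distances else None


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    genes = [("contig1", 100, 200), ("contig1", 1000, 1100)]
    tes = [("contig1", 250, 300), ("contig1", 1050, 1500), ("contig2", 0, 10)]
    print(mean_nearest_distance(genes, tes))
```

Running the same function on the gene family and on a reference gene set gives the two distributions that can then be compared statistically.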

<sup>6</sup>https://bitbucket.org/arobinson/sciencescripts

### RESULTS

### *AvrLm6* Homologs are Found Across Multiple Fungal Genera and Form Expanded Gene Families in *V. pirina* and *V. inaequalis*

In addition to the *AvrLm6* homologs that were previously reported in the genomes of *C. higginsianum*, *C. gloeosporioides*, *F. oxysporum,* and *L. biglobosa* (Grandaubert et al., 2014), we report additional *AvrLm6* homologs in *C. orbiculare, C. fioriniae,* and *F. oxysporum* f. sp. *raphani* (**Figure 1**). The copy number of homologs identified in each genome varied, even within species (**Figure 1**). *Venturia* genomes had up to 30 copies, whereas no other genus had more than 3. In the case of *F. oxysporum* and *Leptosphaeria* species there are also isolates which appear to have no homologs. When all ALVi and ALVp predicted protein sequences were used as blast queries of the NCBI nr database, the levels of sequence identity to AvrLm6 homologs ranged from 24 to 41%. Only two sequences returned the *L. maculans* AvrLm6 protein as the most similar (ALVp\_11032\_13 from *V. pirina* isolate 11032 and ALVi\_1389\_1 from Vi1389). All other *V. pirina* and *V. inaequalis* homologs were more similar to AvrLm6 homologs from either *Fusarium* or *Colletotrichum* species than to AvrLm6, and none had similarity to any *L. maculans* proteins other than AvrLm6 or related homologs. The multiple sequence alignment revealed four cysteine residues that are conserved in all homologs (**Figure 2**), despite low sequence similarity overall. *ALVis* and *ALVps* also differ in gene structure from the homologs found in all the other species, which are generally well conserved with four exons in all homologs except EGU73747.1 from *F. oxysporum* Fo5176 and XP\_003843096.1 from *L. maculans,* which both have seven exons (**Figure 1**). The gene structures of *ALVp*s and *ALVi*s vary, in that they have either no introns, one intron in the 5′ untranslated region (5′ UTR), or one or two introns in the coding sequence of the gene. The most common structure found in the *ALVi*s is a single intron in the 5′ UTR, present in 19 of the 24 *ALVi*s identified in Vi1. We were able to confirm these predicted gene structures in 11 of the 24 genes with reference to mapped RNA-seq reads. All of the AvrLm6 homologs identified in public databases (with the exception of *F. oxysporum* EGU73747.1) and all of the ALVi and ALVp members we have described here are predicted to contain a signal peptide sequence at their N terminus and so are expected to be secreted by the classical secretory pathway.
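Finding residues that are conserved across every sequence of a multiple alignment, such as the four cysteines noted above, amounts to scanning alignment columns. The following sketch shows the idea on a made-up toy alignment; the sequences and the function name are hypothetical and are not taken from the ALVi or ALVp data.

```python
def conserved_columns(aligned_sequences, residue="C"):
    """Return 0-based alignment columns where every sequence carries `residue`.

    All sequences must come from the same alignment and therefore have
    equal length (gaps written as '-').
    """
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        index
        for index, column in enumerate(zip(*aligned_sequences))
        if all(symbol == residue for symbol in column)
    ]


if __name__ == "__main__":
    # Toy alignment for illustration only.
    toy_alignment = [
        "MKC-ATCG",
        "MRCQA-CG",
        "M-CQATCA",
    ]
    print(conserved_columns(toy_alignment))
```

A gap in any sequence at a given column is enough to exclude that column, which matches the strict reading of "conserved in all homologs".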


FIGURE 1 | Summary of the *AvrLm6* homologs identified in public and private sequence databases. The number of exons predicted for each gene is shown, as well as the percentage amino acid identity to the AvrLm6 predicted protein (CAJ90695) from blastp searches. Accession numbers and further details of homologs are given in Supplementary Table S1.

### Expansion of *ALVp* and *ALVi* Gene Families

The maximum likelihood tree derived from the multiple sequence alignment of the ALVi and ALVp amino acid sequences (**Figure 3**) formed four strongly supported clades, with 95, 93, 100, and 94% support, that included only ALVp sequences, and one large clade, with 77% support, containing only ALVi sequences. Mixed clades containing sequences from both species did not have strong statistical support (**Figure 3**). When AvrLm6 from *L. maculans* and orthologues from *Fusarium* and *Colletotrichum* species were included in the tree, there were many clades with low support (**Figure 4**), making it difficult to resolve the evolutionary relationship between the *ALVi* and *ALVp* families and the orthologues in those species. However, two well-supported clades of ALVp and ALVi predicted proteins could be resolved, supporting the recent expansion of the gene family in *Venturia* spp. AvrLm6 clustered in a strongly supported clade with the homologs from *L. biglobosa,* but there was no strong statistical support for clustering between these *Leptosphaeria* predicted proteins and those of any other species.

### *ALVis* are Found in Gene Poor Regions Associated with Transposable Elements

*ALVi* genes in the Vi1 genome are found in gene-poor regions (**Figure 5**) when compared with a reference set of 439 core eukaryotic genes (Supplementary Table S5). The mean distance to the nearest gene was 1,370 bp for the ALVi-coding loci and 484 bp for the core eukaryotic genes; the difference between the means was statistically significant by a Student's *t*-test (*p*-value < 0.001). *ALVi* sequences are found in close proximity to TEs predicted with the REPET 2.2 pipeline. The mean distance to the nearest TE from an ALVi gene in isolate Vi1 (203 bp) was significantly less (Student's *t*-test *p*-value < 0.00001) than that of the reference set of core genes (5,882 bp). The most common class of TE associated with the *ALVi* genes was the large retrotransposon derivative (LARD) class; 10 of these occurred within 260 bp of an *ALVi* gene and, of those, six overlapped *ALVi*-coding loci. Terminal-repeat retrotransposons in miniature (TRIMs) were the second most common element identified at *ALVi* loci, overlapping with eight *ALVi*-coding sequences. Using the CRI criteria that have been previously defined (de Wit et al., 2012), we could not find evidence of RIP in any of the *ALVi* coding sequences in isolate Vi1. However, evidence of RIP was found throughout the genome, with 72% of the TEs predicted to be RIP-affected. None of the TEs overlapping the *ALVi* sequences were shown to be RIP-affected, but seven *ALVi* loci had evidence of RIP in the nearest neighboring repetitive elements. These elements included two predicted TRIMs, one LARD, as well as fragments of long terminal repeats (×2), a terminal inverted repeat and a long interspersed nuclear element.
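Composite RIP indices are built from dinucleotide frequencies. The sketch below uses index definitions that are widely used for this purpose: a product index (TpA/ApT), a substrate index ((CpA + TpG)/(ApC + GpT)), and a composite index equal to product minus substrate, where positive composite values are usually read as suggestive of RIP. It is an assumption that these definitions correspond to what RipCrawl computes; the exact thresholds and window settings applied in the study (following de Wit et al., 2012) are not restated here, and the function names are hypothetical.

```python
def dinucleotide_counts(sequence):
    """Count overlapping dinucleotides in a DNA sequence (case-insensitive)."""
    sequence = sequence.upper()
    counts = {}
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def rip_indices(sequence):
    """Return (product, substrate, composite) RIP indices for one sequence window.

    product   = TpA / ApT
    substrate = (CpA + TpG) / (ApC + GpT)
    composite = product - substrate
    An index is None when its denominator is zero.
    """
    counts = dinucleotide_counts(sequence)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    product = ratio(counts.get("TA", 0), counts.get("AT", 0))
    substrate = ratio(
        counts.get("CA", 0) + counts.get("TG", 0),
        counts.get("AC", 0) + counts.get("GT", 0),
    )
    if product is None or substrate is None:
        return product, substrate, None
    return product, substrate, product - substrate


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(rip_indices("TATAATACGT"))
    print(rip_indices("CACATGAC"))
```

In practice such indices are computed over sliding windows along a contig, so that RIP-affected stretches can be located relative to genes and repeats.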

### ALVi\_Vi1\_5 Localizes to the Stroma during Infection

Fluorescent signal from the ALVi\_Vi1\_5 fusion protein was visible at 7 and 14 dpi in infected apple hypocotyls (**Figure 6**) but was not visible at the earlier infection time point or at any time point *in vitro.* The signal appeared to localize to the periphery of the sub-cuticular stroma and was not visible in surface hyphae, conidiophores or spores. The similar levels of fluorescence at 7 and 14 dpi reflected the qRT-PCR data (Supplementary Figure S2), which showed that expression was similar at these time points and higher than that observed at 3 dpi.

### ALVi\_Vi1\_8 does not Trigger Rlm6-mediated Resistance in Canola

Expression of *ALVi\_Vi1\_8* was confirmed by RT-PCR in the *L. maculans* (Isolate M1) transformants containing the complementation vector (Supplementary Figure S2), but none of these transformants triggered a resistance response in *B. juncea* 'Aurea,' expressing the *Rlm6* resistance gene (Supplementary Figure S3). The expected resistance response was observed when *B. juncea* 'Aurea' was infected with *L. maculans* (Isolate M1) transformed with the *AvrLm6* avirulence gene. The pathogenicity of the *ALVi\_Vi1\_8* transformed isolates on the susceptible *B. napus* 'Westar' cultivar was indistinguishable from the wild type *L. maculans* expressing the native *AvrLm6* gene and the wild type *L. maculans* M1 isolate.

### DISCUSSION

Fungal effector proteins have long been characterized by their lineage-specific distribution, and were generally thought to be conserved at the species or race level. A rapid rise in the availability of plant pathogen whole genome sequences, especially for those classified within the Dothideomycetes and the Sordariomycetes, has enabled this dogma to be tested by sequence similarity searches. In this way, new homologs of the *L. maculans* effector, AvrLm6, have been identified. The taxonomic distribution of AvrLm6 orthologues is interesting because, while they are found across two different classes of fungi, the Dothideomycetes and the Sordariomycetes, they have apparently only been conserved in a few species within these classes. Furthermore, AvrLm6 from *L. maculans* has higher sequence similarity to orthologues in the Sordariomycetes class than to orthologues from the more closely related *Venturia* species (Supplementary Figure S1). One possible explanation of the unusual taxonomic distribution of these genes could be horizontal gene transfer (HGT). There are now many examples of HGT in fungal and oomycete plant pathogens (Soanes and Richards, 2014). The most compelling evidence of HGT is *ToxA,* found in two wheat pathogens. The *ToxA* gene in *Parastagonospora nodorum* appears to have been horizontally transferred from *Pyrenophora tritici-repentis*. The *ToxA* genes in both species had very high nucleotide sequence identity (99.7%), which did not concur with the phylogenetic relationship inferred when comparing the ITS regions and glyceraldehyde-3-phosphate genes (83 and 80%, respectively; Friesen et al., 2006). In the case of *AvrLm6*, though, the homologs have a high level of sequence divergence and, while *AvrLm6* did cluster with the *Colletotrichum* homologs in the phylogenetic tree, this relationship was not supported by bootstrapping. Even within the same species, the presence of these genes is not conserved; for example, we could only identify homologs in two isolates of *F. oxysporum* with blast searches, despite the 25 searchable genomes in the NCBI database. This finding is not simply due to poor gene models in the WGS of fungi, as six-frame similarity searches also did not reveal orthologues. The *AvrLm6* allele is also highly polymorphic at the population level and can be absent in some *L. maculans* isolates (Van de Wouw et al., 2010). The deletion of avirulence effector loci is one way that pathogens avoid recognition by resistant plants. The function of AvrLm6 is unknown; however, it is not essential for pathogenicity in *L. maculans.* This may be why it is so poorly conserved in the Dothideomycetes in general. The expansion of the *AvrLm6*-like genes in *Venturia* spp. is clearly against the trend in this group.

In *V. inaequalis* and *V. pirina,* copy number of *AvrLm6*-like genes varies between isolates and species. The *ALVi* and *ALVp* gene family expansions appear to have occurred independently after speciation. This can be seen on the gene tree (**Figure 3**), where strongly supported clades are composed of clusters of paralogues rather than orthologues. *V. inaequalis* and *V. pirina* species are restricted to and separated spatially on different host ranges, but otherwise have very similar lifestyles and biology, hence, differences in the evolution of these gene families may reflect adaptation to a new host.

The mechanism/s responsible for the expansion of the *ALVp* and *ALVi* gene families remains unclear. However, the association between *ALVis* and TEs, predominantly LARDs and TRIMs, suggests that the multiple gene duplications could have been mediated by these elements. Indeed the expansion of the *AVRk1* effector gene family in *Blumeria graminis* has been hypothesized to be driven by the association with LINE1 retrotransposons (Sacristán et al., 2009).

Large retrotransposon derivatives are non-autonomous long terminal repeat (LTR) retrotransposons which have been described in *Triticeae* (Kalendar et al., 2004), rice (Nagaki et al., 2005; Vitte et al., 2007) and fungi (Labbé et al., 2012; Pereira et al., 2015); they consist of LTRs flanking a large non-coding internal domain, and lack the coding domains found in other LTR retrotransposons, which are responsible for self-replication (Kalendar et al., 2004). TRIMs, like LARDs, are non-autonomous retrotransposons, but differ structurally from LARDs in that they have much shorter internal domains and terminal direct repeats (Witte et al., 2001). TRIMs have been observed within coding and non-coding regions of plant (Witte et al., 2001; Sampath and Yang, 2014) and animal genes (Zhou and Cahan, 2012). It has been observed that TRIMs can be involved in the transduction of host genes, a situation where a host gene or part of a host gene becomes part of the replicating transposon, which can lead to gene duplication (Witte et al., 2001). A TRIM identified in *Arabidopsis thaliana* (*Katydid*-At1) was found to contain an open reading frame very similar to the host gene coding for the nonsense-mediated mRNA decay (NMD) trans-acting factor. As none of the intronic sequences were found in the transduced gene, it was hypothesized that this event occurred by recombination of the host gene transcript and the TRIM (Witte et al., 2001). Such a method of gene family evolution could account for the variability in gene structure and the expansion of the *ALVps* and *ALVis* in *Venturia*. Another explanation is that part of the TRIM may have been recruited by the *ALVi* genes and expansion occurred by another mechanism. The genomic context of the *ALVi*s is similar to that of *AvrLm6*. *AvrLm6* in the *L. maculans* genome (strain v23.1.3) is situated in a gene-poor and AT-rich isochore, adjacent to an abundance of repetitive elements and remnants of transposons (Fudal et al., 2007); however, TRIMs have not been reported in *L. maculans*.

FIGURE 6 | Vi1 growing on apple hypocotyl viewed with bright field microscopy (right panels) and confocal fluorescent microscopy (left panels) 7 days post inoculation (A) and 14 days post inoculation (B). YFP fluorescence observed in multicellular, sub-cuticular stroma only. Fluorescent images are z stacks.

Gene-sparse regions are a feature of some filamentous plant pathogens and can be niches for effector genes and their evolution (Raffaele and Kamoun, 2012). Despite sharing a similar genomic context to *AvrLm6*, RIP was not detected in any of the *ALVi* sequences in *V. inaequalis.* RIP was detected bioinformatically in other regions of the genome and the majority of TEs were predicted to be affected by RIP, so it appears that the RIP machinery is or has been active in the *V. inaequalis* genome. It remains unclear how the *ALVi* gene family was able to expand without being affected by RIP. However, RIP has not been investigated experimentally in *Venturia* spp., so the constraints of the RIP machinery are unknown. It has been shown experimentally in *Neurospora crassa* that duplicated nucleotide sequences shorter than 380 bp were not subject to RIP (Watters et al., 1999). While most *ALVi* sequences are longer than this, it is possible that the *V. inaequalis* RIP machinery has other constraints that preclude the *ALVi* genes from being targets for RIP.

*Venturia* species, upon infection, rapidly form stroma (multicellular, laterally dividing tissue) in the cuticle and subcuticular space (Bowen et al., 2011). It is thought that the stroma is the main feeding and conidia-producing structure in *Venturia*. The *AvrLm6*-like homolog ALVi\_Vi1\_5 was localized, as a fusion protein with eYFP, to the periphery of the subcuticular stroma cells during infection. This is consistent with the gene expression data with regard to the timing of the appearance of stroma during infection. All of the ALVis and ALVps are predicted to be secreted by the classical secretory pathway; however, we still do not know the function of this protein or whether it is released into the apoplast to interface with the plant.

The ALVi\_Vi1\_8 protein failed to trigger a resistance response in canola, giving no indication as to whether the protein behaves in a similar way to AvrLm6. *R* genes have been described previously which recognize interspecific effectors with relatively low sequence similarity; for example, the Cf-4-mediated HR in tomato can be triggered by the Avr4 protein from *C. fulvum* as well as by homologs from phylogenetically related species: the MfAvr4 protein from *M. fijiensis,* which shares only 42% amino acid identity (Stergiopoulos et al., 2010), and DsAvr4 from *D. septosporum,* which shares 51.7% identity (de Wit et al., 2012). It has also been frequently demonstrated, however, that small variations in effector sequence can abolish avirulence functionality. Such is the case with the Cf-4-mediated resistance described above, which can be disrupted by exchanging a single conserved proline residue (Mesarich et al., 2015). An *AvrLm6* allele with a single amino acid substitution has also been found in *Rlm6*-virulent isolates collected from the field (Van de Wouw et al., 2010).

The ALVp and ALVi predicted proteins have typical characteristics of effectors: they are all small, predicted to be secreted, cysteine-rich, contain no known functional domains, and many are upregulated during infection compared to *in vitro* growth. Furthermore, these expanded protein families share sequence similarity to an avirulence effector known to trigger a specific resistance response in canola to *L. maculans*. The variability in copy number and the sequence divergence of ALVis and ALVps between different species and isolates of *Venturia* also make these proteins strong candidates for effectors (possibly host specificity determinants). Elucidating the roles of the *Venturia* ALVp and ALVi proteins will be challenging, as there is likely to be a degree of functional redundancy among them. The availability of the whole genome and transcriptome sequences for these fungi will facilitate the determination of the genetic basis of physiological races, which govern host and cultivar specificity in *Venturia*. This in turn will assist plant breeding efforts to select for more durable resistance against scab diseases.
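The sequence-based criteria listed above (small size, predicted secretion, cysteine richness, absence of known domains) can be sketched as a simple filter. This is only an illustrative example under stated assumptions, not the pipeline used in this study. The function names, the length cut-off of 300 residues and the cysteine fraction of 3% are hypothetical, and the signal peptide and domain calls are assumed to come from external prediction tools and are passed in as booleans.

```python
def cysteine_fraction(protein):
    """Fraction of residues in a protein sequence that are cysteine (C)."""
    protein = protein.strip().upper()
    if not protein:
        return 0.0
    return protein.count("C") / len(protein)


def is_effector_candidate(protein, has_signal_peptide, has_known_domain,
                          max_length=300, min_cys_fraction=0.03):
    """Apply the four sequence-based criteria to one predicted protein.

    max_length and min_cys_fraction are hypothetical thresholds chosen
    for illustration; has_signal_peptide and has_known_domain are
    assumed to be supplied by separate prediction software.
    """
    protein = protein.strip().upper()
    return (
        0 < len(protein) <= max_length
        and has_signal_peptide
        and not has_known_domain
        and cysteine_fraction(protein) >= min_cys_fraction
    )


# Hypothetical 100-residue toy sequence containing four cysteines.
toy_protein = "MKCLCAACGGC" + "A" * 89
toy_is_candidate = is_effector_candidate(toy_protein, True, False)
```

Expression evidence, such as upregulation during infection relative to *in vitro* growth, would need to be assessed separately from transcriptome data.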

### ACKNOWLEDGMENTS

JS was supported by a scholarship from La Trobe University and a Victorian Life Sciences Computation Initiative PhD Top-up scholarship. AR is supported under the Victorian Life Sciences Computation Initiative's (VLSCI) Life Sciences Computation Centre, a collaboration between LTU, Melbourne and Monash Universities (an initiative of the Victorian Government, Australia). We would also like to thank Dr Carl Mesarich for providing us with the eYFP gene optimized for *Venturia inaequalis* codon usage and Dr Rohan Lowe for the *L. maculans lepidii* sequence data. We would also like to thank Remmelt Groenwold (Wageningen UR, The Netherlands) and Bruno Le Cam (INRA, Angers, France) for the *V. inaequalis* isolates.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00980




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Shiller, Van de Wouw, Taranto, Bowen, Dubois, Robinson, Deng and Plummer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*