# ENGINEERING THE PLANT FACTORY FOR THE PRODUCTION OF BIOLOGICS AND SMALL-MOLECULE MEDICINES

EDITED BY: Domenico De Martinis, Rosella Franconi, Eugenio Benvenuto, Edward P. Rybicki and Kazuhito Fujiyama PUBLISHED IN: Frontiers in Plant Science and Frontiers in Bioengineering and Biotechnology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-051-0 DOI 10.3389/978-2-88945-051-0

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# **ENGINEERING THE PLANT FACTORY FOR THE PRODUCTION OF BIOLOGICS AND SMALL-MOLECULE MEDICINES**

#### Topic Editors:

**Domenico De Martinis,** ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

**Rosella Franconi,** ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

**Eugenio Benvenuto,** ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

**Edward P. Rybicki,** University of Cape Town, South Africa

**Kazuhito Fujiyama,** Osaka University, Japan

Knock-out by CRISPR/Cas9 of a gene coding for the fluorescent protein mCherry in tobacco calli. Adapted from Mercx et al., Front. Plant Sci., 01 February 2016, http://dx.doi.org/10.3389/fpls.2016.00040. Image taken by Sébastien Mercx.

Plant gene transfer achieved in the early '80s paved the way for the exploitation of the potential of gene engineering to add novel agronomic traits and/or to design plants as factories for high added value molecules. For this latter area of research, the term "Molecular Farming" was coined in reference to agricultural applications in that major crops like maize and tobacco were originally used basically for pharma applications.

The concept of the "green biofactory" implies several advantages over typical cell factories based on animal cell or microbial cultures, starting with the investment and operating costs of fermenters. Although yield, stability, and quality of the molecules may vary among different heterologous systems, and plants are competitive on a case-by-case basis, the "plant factory" still attracts scientists and technologists for its challenging features of low production cost, product safety and easy scale-up. Once engineered, a plant is among the cheapest and easiest eukaryotic systems to be bred with simple know-how, using nutrients, water and light. Molecules that are currently being produced in plants vary from industrial and pharmaceutical proteins, including medical diagnostics proteins and vaccine antigens, to nutritional supplements such as vitamins, carbohydrates and biopolymers. Convergence among disciplines as distant as plant physiology and pharmacology and, more recently, omic sciences, bioinformatics and nanotechnology, increases the options of research on the plant cell factory.

"Farming for Pharming" biologics and small-molecule medicines is a challenging area of plant biotechnology that may break the limits of current standard production technologies. The recent success in fighting Ebola with plant-made antibodies put a spotlight on the enormous potential of next-generation herbal medicines, made especially in the name of the guiding principle of cost reduction, and hence of reducing disparities in health rights, and as a tool to guarantee adequate health protection in developing countries.

**Citation:** De Martinis, D., Franconi, R., Benvenuto, E., Rybicki, E. P., Fujiyama, K., eds. (2017). Engineering the Plant Factory for the Production of Biologics and Small-Molecule Medicines. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-051-0

# Table of Contents


Valentina Passeri, Ronald Koes and Francesca M. Quattrocchio

*The Encapsulation of Hemagglutinin in Protein Bodies Achieves a Stronger Immune Response in Mice than the Soluble Antigen*

Anna Hofbauer, Stanislav Melnik, Marc Tschofen, Elsa Arcalis, Hoang T. Phan, Ulrike Gresch, Johannes Lampel, Udo Conrad and Eva Stoger

- and Dominique Michaud

Xuan Huang, Jingwen Yao, Yangyang Zhao, Dengfeng Xie, Xue Jiang and Ziqin Xu

Martina Dicker, Marc Tschofen, Daniel Maresch, Julia König, Paloma Juarez, Diego Orzaez, Friedrich Altmann, Herta Steinkellner and Richard Strasser

*Seed-Specific Expression of Spider Silk Protein Multimers Causes Long-Term Stability*

Nicola Weichert, Valeska Hauptmann, Christine Helmold and Udo Conrad

E. Federico Alfano, Ezequiel M. Lentz, Demian Bellido, María J. Dus Santos, Fernando A. Goldbaum, Andrés Wigdorovitz and Fernando F. Bravo-Almonacid

*Application of a Scalable Plant Transient Gene Expression Platform for Malaria Vaccine Development*

Holger Spiegel, Alexander Boes, Nadja Voepel, Veronique Beiss, Gueven Edgue, Thomas Rademacher, Markus Sack, Stefan Schillberg, Andreas Reimann and Rainer Fischer

**N***-Glycosylation of Cholera Toxin B Subunit: Serendipity for Novel Plant-Made Vaccines?*

Nobuyuki Matoba

*Depth Filters Containing Diatomite Achieve More Efficient Particle Retention than Filters Solely Containing Cellulose Fibers*

Johannes F. Buyel, Hannah M. Gruchow and Rainer Fischer

Vanesa S. Marín Viegas, Gonzalo R. Acevedo, Mariela P. Bayardo, Fernando G. Chirdo and Silvana Petruccelli

*A Decade of Molecular Understanding of Withanolide Biosynthesis and* **In vitro** *Studies in* **Withania somnifera** *(L.) Dunal: Prospects and Perspectives for Pathway Engineering*

Niha Dhar, Sumeer Razdan, Satiander Rana, Wajid W. Bhat, Ram Vishwakarma and Surrinder K. Lattoo

*Modulation of Chloride Channel Functions by the Plant Lignan Compounds Kobusin and Eudesmin*

Yu Jiang, Bo Yu, Fang Fang, Huanhuan Cao, Tonghui Ma and Hong Yang

*Bioconversion to Raspberry Ketone is Achieved by Several Non-related Plant Cell Cultures*

Suvi T. Häkkinen, Tuulikki Seppänen-Laakso, Kirsi-Marja Oksman-Caldentey and Heiko Rischer

*Gene-to-metabolite network for biosynthesis of lignans in MeJA-elicited* **Isatis indigotica** *hairy root cultures*

Ruibing Chen, Qing Li, Hexin Tan, Junfeng Chen, Ying Xiao, Ruifang Ma, Shouhong Gao, Philipp Zerbe, Wansheng Chen and Lei Zhang

*Plant-derived SAC domain of PAR-4 (Prostate Apoptosis Response 4) exhibits growth inhibitory effects in prostate cancer cells*

Shayan Sarkar, Sumeet Jain, Vineeta Rai, Dipak K. Sahoo, Sumita Raha, Sujit Suklabaidya, Shantibhusan Senapati, Vivek M. Rangnekar, Indu B. Maiti and Nrisingha Dey

*Commentary: Extracellular peptidase hunting for improvement of protein production in plant cells and roots*

Karl J. Kunert and Priyen Pillay

# Editorial: Plant Molecular Farming: Fast, Scalable, Cheap, Sustainable

Domenico De Martinis <sup>1</sup> \*, Edward P. Rybicki <sup>2</sup> , Kazuhito Fujiyama<sup>3</sup> , Rosella Franconi <sup>1</sup> and Eugenio Benvenuto<sup>1</sup>

*<sup>1</sup> ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Rome, Italy, <sup>2</sup> Biopharming Research Unit, Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa, <sup>3</sup> Osaka University, Osaka, Japan*

Keywords: plant molecular farming, biopharmaceuticals, metabolic engineering, recombinant protein, biobetter, genetic engineering, transient expression, plant factory

**The Editorial on the Research Topic**

#### **Engineering the Plant Factory for the Production of Biologics and Small-Molecule Medicines**

The transfer of genes into plants, which was achieved in the early 1980s, paved the way for exploiting the potential of plant genetic engineering to add novel agronomic traits and/or to design plants as factories for high added-value molecules. For this latter area of research, the term "Molecular Farming" was coined because major crops like maize and tobacco were originally used basically for pharma applications.

In this research topic we have tried to gather together the scientific community working on the concept of plant biofactories: this has eventually resulted in a comprehensive display of studies (33 papers from the Americas, Europe, South Africa, India, Australia, Japan, and China) that approach the complexity of producing desired molecules in plants and plant cells, covering the topic from small, but tricky, metabolites to large chimeric proteins.

Developing a plant-based "green biofactory" offers advantages over the more conventional cell factories based on animal cells or microbial cultures when considering the investment and operating costs of fermenters. Nevertheless, when dealing with any biofactory, some challenges remain the same: the features of the product to be obtained, the engineering of the host, and the production and purification steps that may cause more than "just a headache."

The studies describe several different approaches to understanding how to boost production of the desired product by molecular engineering (Diamos et al.; Xu et al.; Gurkok et al.; Mercx et al.; Dhar et al.) or via biochemical or environmental stimuli (Fujiuchi et al.; Huang et al.; Jiang et al.); how to better store or deliver the desired product (Ceresoli et al.; Passeri et al.; Weichert et al.; Alfano et al.); how to make the product more stable (Mandal et al.; Dicker et al.; Kunert and Pillay); and how to obtain a better purification yield (Sainsbury et al.; Buyel et al.) and better performance (Hofbauer et al.; Matoba) of the molecule.

Thus, although yield, stability, and quality of the molecules may vary among different systems, plants are strongly competitive on a case-by-case basis, and both the molecular design and the plasticity in place and time of production may provide distinct advantages (e.g., use of cell suspensions: Corbin et al.; Santos et al.; roots: Häkkinen et al.; Chen et al.; or transient expression rather than stable transformation: Alkanaimsh et al.; Westerhof et al.). For these reasons, engineering the plant factory for the production of biologics and small-molecule medicines attracts scientists and technologists for the intriguing features of low production cost, product safety, easy scale-up and the possibility to produce "biobetters."

Edited and reviewed by: *James Lloyd, Stellenbosch University, South Africa*

> \*Correspondence: *Domenico De Martinis domenico.demartinis@enea.it*

#### Specialty section:

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

Received: *21 June 2016* Accepted: *18 July 2016* Published: *03 August 2016*

#### Citation:

*De Martinis D, Rybicki EP, Fujiyama K, Franconi R and Benvenuto E (2016) Editorial: Plant Molecular Farming: Fast, Scalable, Cheap, Sustainable. Front. Plant Sci. 7:1148. doi: 10.3389/fpls.2016.01148*

Molecules that are currently being produced in plants exploit only a little of the immense potential to produce natural compounds (Pulice et al.; Andre et al.), as well as nutritional supplements such as vitamins, carbohydrates and biopolymers (see also previous references) and industrial and pharmaceutical proteins.

The latest products described here promise to provide tools to tackle serious medical challenges, from chronic ones (e.g. Celiac disease, Viegas et al., and Prostate Cancer, Sarkar et al.) to dangerous infections with pandemic potential, such as SARS (Demurtas et al.), influenza (Mbewana et al.), malaria (Spiegel et al.) as well as Salmonella (Miletic et al.). Interestingly, this last panel of publications highlights the modularity of molecular engineering systems that could be platforms for genetic engineering and provision of fast and scalable systems to be used in response to new outbreaks of highly infectious diseases.

Convergence among disciplines as distant as plant physiology and pharmacology and, more recently, the "-omics" sciences, as well as bioinformatics and nanotechnology, increases the options for research on the plant cell factory. Once suitably engineered, plants are possibly the cheapest and easiest eukaryotic systems to adapt to the production of pharmaceuticals, as they can be bred with simple know-how and grown using only simple nutrients, water and light.

These approaches suggest a future, modular approach to protein design that could represent a new trend in the field (De Paoli et al., 2016). "Farming for Pharming" of biologics and small-molecule medicines is a challenging area of plant biotechnology that may break the limits of current standard production technologies. Market approval of "Elelyso" in 2012 (Protalix/Pfizer; recombinant glucocerebrosidase produced in carrot cells for treatment of a rare disease) and the recent apparent success in fighting Ebola virus with plant-made antibodies put a spotlight on the enormous potential of next-generation plant-made medicines, made especially in the name of the guiding principle of cost reduction: these will help reduce disparities in health rights and serve as tools to guarantee adequate health protection in developing countries (Hinman and McKinlay, 2015; Folayan et al., 2016).

#### AUTHOR CONTRIBUTIONS

All authors contributed equally to the manuscript, within their role as editors of the topic.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 De Martinis, Rybicki, Fujiyama, Franconi and Benvenuto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Transient Expression of Tetrameric Recombinant Human Butyrylcholinesterase in Nicotiana benthamiana

Salem Alkanaimsh<sup>1</sup> , Kalimuthu Karuppanan<sup>1</sup> , Andrés Guerrero<sup>2</sup> , Aye M. Tu<sup>3</sup> , Bryce Hashimoto<sup>1</sup> , Min Sook Hwang<sup>4</sup> , My L. Phu<sup>3</sup> , Lucas Arzola<sup>1</sup> , Carlito B. Lebrilla<sup>2</sup> , Abhaya M. Dandekar<sup>3</sup> , Bryce W. Falk<sup>4</sup> , Somen Nandi<sup>5,6</sup> , Raymond L. Rodriguez<sup>5,6</sup> and Karen A. McDonald<sup>1,6</sup> \*

<sup>1</sup> Department of Chemical Engineering, University of California, Davis, Davis, CA, USA, <sup>2</sup> Department of Chemistry, University of California, Davis, Davis, CA, USA, <sup>3</sup> Department of Plant Science, University of California, Davis, Davis, CA, USA, <sup>4</sup> Department of Plant Pathology, University of California, Davis, Davis, CA, USA, <sup>5</sup> Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA, <sup>6</sup> Department of Global HealthShare Initiative, University of California, Davis, Davis, CA, USA

#### Edited by:

Kazuhito Fujiyama, Osaka University, Japan

#### Reviewed by:

Ko Kato, Nara Institute of Science and Technology, Japan Qiang Chen, Arizona State University, USA

> \*Correspondence: Karen A. McDonald

kamcdonald@ucdavis.edu

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 15 December 2015 Accepted: 17 May 2016 Published: 16 June 2016

#### Citation:

Alkanaimsh S, Karuppanan K, Guerrero A, Tu AM, Hashimoto B, Hwang MS, Phu ML, Arzola L, Lebrilla CB, Dandekar AM, Falk BW, Nandi S, Rodriguez RL and McDonald KA (2016) Transient Expression of Tetrameric Recombinant Human Butyrylcholinesterase in Nicotiana benthamiana. Front. Plant Sci. 7:743. doi: 10.3389/fpls.2016.00743

To optimize the expression, extraction and purification of plant-derived tetrameric recombinant human butyrylcholinesterase (prBChE), we describe the development and use of a plant viral amplicon-based gene expression system, the Tobacco Mosaic Virus (TMV) RNA-based overexpression vector (TRBO), to express enzymatically active FLAG-tagged plant-made recombinant butyrylcholinesterase (rBChE) in Nicotiana benthamiana leaves using transient agroinfiltration. Two gene expression cassettes were designed to direct the recombinant protein either to the ER or to the apoplastic compartment. Leaf homogenization was used to isolate ER-retained recombinant butyrylcholinesterase (prBChE-ER), while apoplast-targeted rBChE was isolated by either leaf homogenization (prBChE) or vacuum extraction of apoplastic wash fluid (prBChE-AWF). rBChE from apoplast wash fluid had a higher specific activity but lower enzyme yield than rBChE from leaf homogenate. To optimize the isolation and purification of total recombinant protein from leaf homogenates, an acidic extraction buffer was used. The acidic extraction buffer yielded >95% enzymatically active tetrameric rBChE, as verified by Coomassie-stained and native gel electrophoresis. Furthermore, when compared to human butyrylcholinesterase, the prBChE was found to be similar in terms of tetramerization and enzyme kinetics. The N-linked glycan profile of purified prBChE-ER was found to consist mostly of high-mannose structures, while the N-linked glycans on prBChE-AWF were primarily complex. The glycan profile of the prBChE leaf homogenates showed a mixture of high-mannose, complex and paucimannose type N-glycans. These findings demonstrate the ability of plants to produce rBChE that is enzymatically active and whose oligomeric state is comparable to mammalian butyrylcholinesterase. The process of plant-made rBChE tetramerization and strategies for improving its pharmacokinetic properties are also discussed.

Keywords: butyrylcholinesterase, Nicotiana benthamiana, plant viral expression system, transient protein production, plant N-glycosylation, tetramerization

# INTRODUCTION


Organophosphate (OP) nerve agents such as Sarin (Pohanka et al., 2013) have been used in recent history against civilian populations in Japan (Seto et al., 1999) and, more recently, as reported by Charbonneau and Nichols (2013). Their acute toxicity is due to irreversible inhibition of human acetylcholinesterase (hAChE; Moshiri et al., 2012), which leads to accumulation of acetylcholine in the synaptic cleft, followed by overstimulation of cholinergic receptors and, if left untreated, death (Trovaslet-Leroy et al., 2011). Although treatment options such as oximes and atropine (Lenz et al., 2007) are available, they are usually administered as post-exposure prophylaxis and can have detrimental side effects such as damage to the central nervous system (Trovaslet-Leroy et al., 2011). A better treatment option is human butyrylcholinesterase (hBChE), a bio-scavenger that specifically and irreversibly binds OP compounds and has a mean residence time (MRT) in the bloodstream of approximately 12 days (Duysen et al., 2002; Lenz et al., 2007; Saxena et al., 2008).

Human BChE is a glycosylated serine hydrolase that is made in the liver and circulates in the bloodstream, catalyzing the hydrolysis of a variety of choline and non-choline esters (Darvesh et al., 2003). It binds to OP compounds irreversibly at a 1:1 stoichiometry (Çokuğraş, 2003). Circulating human BChE is a homotetrameric protein consisting of 85 kDa monomers, each comprising 574 amino acids (Ngamelue et al., 2007). Monomers possess ten potential N-glycosylation sites, of which nine are known to be occupied (Kolarich et al., 2008). Tetramerization and sialylation of bi-antennary galactose residues on N-linked glycans of hBChE are known to be essential for a long MRT in the bloodstream (Saxena et al., 2008). Because the concentration of hBChE in the blood is very low, extracting and purifying gram or kilogram quantities of hBChE from blood is impractical (Browne et al., 1998). This has led researchers to explore heterologous expression systems to produce functional recombinant hBChE in amounts that are affordable and practical for protecting civilian and military personnel from OP nerve agents. Some of these expression systems include transgenic goats, Chinese hamster ovary (CHO) cell culture, and human embryonic kidney cell culture (Nachon et al., 2002; Chilukuri et al., 2005; Huang et al., 2007). Although these systems are capable of producing functional rBChE, their drawbacks include high fermentation cost, slow cell growth rates, risk of viral infection of mammalian cell cultures, and a long lag time between gene transfer and lactation for transgenic animals (Houdebine, 2009; Thomas et al., 2011).
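To make the scale of the supply problem concrete, the 1:1 stoichiometry can be turned into a mass ratio. The sketch below is illustrative only: it assumes one OP molecule is bound per 85 kDa monomer, uses sarin (molar mass about 140.09 g/mol, a value not given in the text) as the example agent, and ignores pharmacokinetics entirely. The function name is hypothetical.

```python
# Mass of BChE needed to bind a given mass of OP at 1:1 stoichiometry.
# Assumptions (not from the text): one OP molecule per 85 kDa monomer,
# and sarin (C4H10FO2P, ~140.09 g/mol) as the example agent.

MONOMER_MASS_G_PER_MOL = 85_000.0   # 85 kDa monomer, as stated in the text
SARIN_MOLAR_MASS_G_PER_MOL = 140.09


def bche_mass_per_op_mass(op_molar_mass, monomer_mass=MONOMER_MASS_G_PER_MOL):
    """Return grams of BChE monomer required per gram of OP bound 1:1."""
    if op_molar_mass <= 0:
        raise ValueError("molar mass must be positive")
    return monomer_mass / op_molar_mass


ratio = bche_mass_per_op_mass(SARIN_MOLAR_MASS_G_PER_MOL)
print(f"about {ratio:.0f} mg of BChE per mg of OP")
```

Under these assumptions the enzyme has to be supplied in a several-hundred-fold mass excess over the agent it neutralizes, which is consistent with the need for gram-scale production noted above.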

Plants, particularly Nicotiana benthamiana (N. benthamiana), have been used to produce rBChE as well as other biopharmaceutical proteins (Goldstein and Thomas, 2004). Some of the advantages of the N. benthamiana expression system are its affordability, low risk of carrying human pathogens, scalability, and ability to glycosylate proteins (Raskin et al., 2002; Obembe et al., 2011; Xu et al., 2012). prBChE produced from transgenic N. benthamiana has been tested successfully as a bio-scavenger against multiple nerve agents (Geyer et al., 2010a,b). However, growth and scale-up of transgenic lines can take several months (Garabagi et al., 2012), making it difficult to respond rapidly to new chemical or biological challenges to human health. Alternatively, transient expression of prBChE in N. benthamiana can be achieved in 4–12 days, making transient expression systems well suited for rapid production of biodefense agents like hBChE (D'Aoust et al., 2010; Pogue et al., 2010; Schneider et al., 2014a,b). High-level, rapid, transient expression of target proteins can be achieved in N. benthamiana using a plant viral expression vector cloned in Agrobacterium tumefaciens (A. tumefaciens; Hefferon, 2012). Examples of plant viruses being used as viral expression vectors are Tobacco Mosaic Virus (TMV; Lindbo, 2007; Kagale et al., 2012), Cucumber Mosaic Virus (CMV; Sudarshana et al., 2006; Hwang et al., 2012) and the geminivirus-based vectors (Huang et al., 2010). Vacuum agroinfiltration is a fast and efficient means of introducing recombinant A. tumefaciens harboring a gene of interest into plants. Transcription and translation of the gene start within a few hours post-infiltration (Arzola et al., 2011).

The aim of this study was to use a plant viral expression system and purification strategies to express enzymatically active, tetrameric prBChE using transient agroinfiltration in N. benthamiana leaves. Two separate expression cassettes were designed to express rBChE retained in the ER (ER-retained; **Figure 1A**) or targeted to the apoplast (apoplast-targeted; **Figure 1B**). Expression cassettes were cloned into the viral vector TRBO, a TMV RNA-based overexpression vector (Lindbo, 2007). Expression vectors with their cassettes were separately introduced into A. tumefaciens and co-infiltrated into N. benthamiana leaves with the silencing suppressor P19. The levels and specific activities of differentially targeted prBChE (i.e., prBChE-ER vs. prBChE) and of differentially extracted apoplast-targeted prBChE (i.e., prBChE vs. prBChE-AWF) were estimated, and their N-linked glycan structures were determined. The results of this study indicate that prBChE extracted from whole leaf homogenates was similar to the hBChE and eqBChE controls in terms of physicochemical properties, tetramerization and kinetic parameters.

# MATERIALS AND METHODS

#### Construction of Gene Expression Cassettes

Two gene expression cassettes were synthesized (GenScript, Piscataway Township, NJ, USA) for intracellular and extracellular localization of prBChE, respectively. One cassette (prBChE+KDEL) was designed to retain prBChE in the ER (**Figure 1A**), while the other (prBChE) was used to target prBChE to the apoplastic space (**Figure 1B**). The hBChE sequence (GenBank: NP\_000046.1) lacking the human signal peptide was codon optimized to a codon adaptation index of 0.87 (from 0.76) to facilitate expression in N. benthamiana. In both expression cassettes (**Figures 1A,B**), the signal peptide of hBChE was replaced with the 75 base pair sequence coding for the rice alpha-amylase RAmy3D signal peptide (GenBank: M59351).

**Abbreviations:** Apo, apoplast; AWF, apoplast wash fluid; BChE, native human butyrylcholinesterase; CBS, citric buffered saline; ER, endoplasmic reticulum; eqBChE, equine butyrylcholinesterase; HRGP, hydroxyproline-rich glycopeptides; MRT, mean residence time; PK, pharmacokinetics; prBChE, plantmade recombinant butyrylcholinesterase; PRP, proline-rich peptides; rBChE, recombinant butyrylcholinesterase; TBS, tris buffered saline.

The 3xFLAG tag sequence (Sigma–Aldrich, St. Louis, MO, USA) was inserted between the RAmy3D signal peptide and hBChE sequences in both cassettes.
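The codon index quoted above can be illustrated with a small calculation, assuming it refers to the standard codon adaptation index (CAI): the geometric mean, over the codons of a sequence, of each codon's usage relative to the most used synonymous codon in a reference set. The sketch below is not the optimization procedure used by the authors or the vendor; the codon counts are invented toy values for two amino acids, and all names are hypothetical.

```python
import math


def relative_adaptiveness(codon_counts, synonymous_groups):
    """Weight of each codon = its count / count of the most used synonym."""
    weights = {}
    for codons in synonymous_groups:
        best = max(codon_counts[c] for c in codons)
        for c in codons:
            weights[c] = codon_counts[c] / best
    return weights


def codon_adaptation_index(cds, weights):
    """Geometric mean of codon weights; codons without a weight are skipped.

    Weights must be positive (a zero count would make the log undefined).
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


# Toy reference usage (hypothetical counts, NOT real N. benthamiana data).
TOY_COUNTS = {"AAA": 30, "AAG": 70, "GAA": 40, "GAG": 60}
TOY_GROUPS = [("AAA", "AAG"), ("GAA", "GAG")]  # Lys codons, Glu codons
WEIGHTS = relative_adaptiveness(TOY_COUNTS, TOY_GROUPS)

print(codon_adaptation_index("AAAGAA", WEIGHTS))  # rare codons, lower CAI
print(codon_adaptation_index("AAGGAG", WEIGHTS))  # preferred codons only
```

Replacing rare codons with the preferred synonyms raises the index toward 1 without changing the encoded protein, which is the idea behind moving a sequence from 0.76 to 0.87.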

#### Construction of Viral Vector Expression Systems

To transcribe the two different rBChE expression cassettes, the TMV-based plant viral vector system, TRBO, was used. The binary vectors pTRBO-rBChE+KDEL (**Figure 1A**) and pTRBO-rBChE (**Figure 1B**) were made as follows. The hBChE cassettes and the backbone vector, pTRBO, were digested with AvrII and PacI restriction enzymes, after which the inserts were ligated into pTRBO, yielding the pTRBO-rBChE+KDEL and pTRBO-rBChE viral expression vectors (**Figures 1A,B**). The recombinant vectors were transformed into DH5α Escherichia coli competent cells (Invitrogen, Carlsbad, CA, USA) following manufacturer specifications. A. tumefaciens EHA105 was then transformed as described earlier (Shen and Forde, 1989). To prevent RNAi-mediated gene silencing, a plant defense mechanism, the silencing suppressor gene P19 under the control of the 35S promoter (Arzola et al., 2011; **Figure 1C**) was co-infiltrated with the TRBO viral expression vector.
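A cloning step like the AvrII/PacI digest above is usually preceded by a check that the two enzymes do not cut inside the insert. The sketch below is a hypothetical design check, not part of the published protocol: it scans a cassette for the AvrII (CCTAGG) and PacI (TTAATTAA) recognition sequences and accepts the cassette only if each occurs exactly once. Both sites are palindromic, so scanning one strand is sufficient. The orientation of the sites is not stated in the text and is not checked, and the toy sequences are invented.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"AvrII": "CCTAGG", "PacI": "TTAATTAA"}


def find_sites(seq, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def has_unique_flanking_sites(cassette):
    """True if AvrII and PacI each occur exactly once in the cassette."""
    return all(len(find_sites(cassette, s)) == 1 for s in SITES.values())


# Invented toy cassettes: flanking sites around a short made-up insert.
good = "CCTAGG" + "ATGGCTAAGGAG" + "TTAATTAA"
bad = "CCTAGG" + "ATGGCTCCTAGGAAGGAG" + "TTAATTAA"  # internal AvrII site

print(has_unique_flanking_sites(good), find_sites(good, SITES["PacI"]))
print(has_unique_flanking_sites(bad), find_sites(bad, SITES["AvrII"]))
```

An internal site, as in the second toy cassette, would split the insert during digestion, so it would normally be removed by a silent mutation during codon optimization.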

#### Agroinfiltration and Incubation of N. benthamiana Plants

Agrobacterium tumefaciens (EHA105) cells containing pTRBO-rBChE+KDEL, pTRBO-rBChE, and p35S-p19 were grown in 5 ml LB medium containing the appropriate antibiotics for 24 h with shaking (250 rpm) at 28 °C. The cultures were transferred into 200 ml LB medium containing 40 µM acetosyringone (Sigma–Aldrich, St. Louis, MO, USA) and grown overnight at 28 °C with shaking (250 rpm). Cells were harvested by centrifugation at 2,600 × g and resuspended in sterile 10 mM MES buffer (Fisher Scientific, Santa Clara, CA, USA), pH 5.6, containing 10 mM MgCl<sub>2</sub> and 150 µM acetosyringone to achieve an OD<sub>600</sub> of 0.5. The Agrobacterium suspension carrying the TRBO expression cassette (pTRBO-rBChE+KDEL or pTRBO-rBChE) was mixed in a 1:1 volume ratio with the suspension carrying the gene-silencing suppressor (p35S-p19). The mixed Agrobacterium suspensions were incubated in the dark for 1 to 3 h before infiltration.
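Adjusting a resuspended culture to the target optical density is a simple dilution calculation. The sketch below assumes that the optical density scales linearly with cell concentration over the range used (an approximation that weakens at high readings) and that the pellet has first been resuspended in a known volume and measured; the function names and example numbers are hypothetical.

```python
def final_volume_ml(od_measured, volume_ml, od_target=0.5):
    """Total volume needed to bring a suspension to the target OD.

    Uses C1 * V1 = C2 * V2 with OD as a proxy for cell concentration.
    """
    if od_measured < od_target:
        raise ValueError("suspension is already below the target OD")
    return od_measured * volume_ml / od_target


def od_after_equal_mix(od_a, od_b):
    """OD contributed by each strain after mixing equal volumes."""
    return od_a / 2.0, od_b / 2.0


# Example: 50 ml of resuspended cells reading OD 2.0 (invented numbers).
total = final_volume_ml(2.0, 50.0)
print(f"dilute to {total:.0f} ml, i.e. add {total - 50.0:.0f} ml of buffer")
print(od_after_equal_mix(0.5, 0.5))
```

Note that mixing two suspensions at the target density in equal volumes halves the contribution of each strain, so each ends up at half the target in the infiltration mixture.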

To validate and compare the viral expression of the two TRBO expression cassettes, three 4–5-week-old N. benthamiana plants were used per experiment. The plants were grown in four-inch pots in a greenhouse under a 16/8 h (light/dark) cycle. The potted plants were inverted and immersed in 600 ml of the Agrobacterium suspension containing 0.02% Silwet L-77 (Lehle Seeds, Round Rock, TX, USA) and placed in a Nalgene container for vacuum application (−25 in Hg) for 2 min before releasing the vacuum. The agroinfiltrated plants were incubated in an environmental growth chamber at 90% humidity and 21 °C for 6 days, after which the leaves were cut at the petioles and harvested. The agroinfiltrated leaves were stored at −80 °C for further analysis or were immediately processed to recover apoplastic fluids.

#### Extracting prBChE from Agroinfiltrated N. benthamiana Leaves

Plant-derived BChE was extracted from N. benthamiana leaves using two extraction methods. First, leaf homogenates (prBChE-ER and prBChE) were extracted by grinding leaves frozen in liquid nitrogen with the extraction buffer TBS-1 (20 mM Tris-HCl, pH 8, 150 mM NaCl, 0.01% Tween 80) at a ratio of 1 g leaf tissue to 4 ml buffer. The extract was mixed for 30 min at 4 °C before centrifugation at 3,200 × g for 30 min at 4 °C. The supernatant was recovered and stored at 4 °C for subsequent analysis.

## Apoplast Wash Fluid (AWF) Recovery of prBChE from Agroinfiltrated N. benthamiana Leaves

prBChE-AWF was extracted as follows: freshly harvested agroinfiltrated leaves were submerged in a harvest buffer, TBS-2 (20 mM Tris-HCl, pH 8, 150 mM NaCl, 0.02% Silwet L-77), and placed in a Nalgene container for a 2 min vacuum application. The infiltrated leaves were placed in 50 ml Falcon tubes and centrifuged for 15 min at 4 °C at 250 g. The AWF was recovered and stored at 4 °C.

#### Quantification of Plant Soluble Proteins

Total soluble protein concentration was determined by the Bradford assay (Bio-Rad, Hercules, CA, USA) using bovine serum albumin (BSA; Sigma–Aldrich, St. Louis, MO, USA) as a standard. A standard curve was generated using BSA solutions ranging from 0.05 to 0.5 mg/ml. The absorbance of the standards and samples was measured at 595 nm with a SpectraMax 340C spectrophotometer (Molecular Devices, Sunnyvale, CA, USA).
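As an illustration of how a BSA standard curve can be turned into sample concentrations, the sketch below fits a straight line to absorbance readings by ordinary least squares and inverts it. The absorbance values are synthetic placeholders, not data from this study, and the assumption of a linear response over 0.05 to 0.5 mg/ml is ours.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def protein_mg_per_ml(a595, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for sample dilution."""
    return (a595 - intercept) / slope * dilution_factor


# BSA standards spanning the range given in the text (mg/ml)
bsa_standards = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
# Synthetic, perfectly linear A595 readings used only to exercise the code
a595_standards = [0.10 + 1.0 * c for c in bsa_standards]

slope, intercept = fit_line(bsa_standards, a595_standards)
sample_conc = protein_mg_per_ml(0.35, slope, intercept)
print(round(sample_conc, 3))
```

Samples whose absorbance falls outside the range of the standards would need to be diluted and re-measured rather than extrapolated.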

#### In Vitro Activity Assay of prBChE Protein

A modified Ellman assay (Ellman et al., 1961) was used to quantify the activity of butyrylcholinesterase. S-Butyrylthiocholine (BTCh) iodide (Sigma–Aldrich, St. Louis, MO, USA) and 5,5′-dithiobis-2-nitrobenzoic acid (DTNB; Sigma–Aldrich, St. Louis, MO, USA) dissolved in 0.1 M phosphate buffer, pH 7.4, were used to make a working substrate solution at 0.5 and 0.267 mM, respectively. A volume of 150 µl of the working substrate solution was added to 50 µl of enzyme diluted with phosphate buffer in 96-well plates, and the progress of the reaction was monitored in triplicate at 405 nm for 5 min at 25 °C. A specific activity of 260 U/mg was used to convert from units of activity to mg rBChE.
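The conversions implied by this protocol can be sketched as follows. Only the 260 U/mg factor and the 150 µl + 50 µl volumes come from the text; the function names are hypothetical, and the activity in units is assumed to have been obtained already from the absorbance traces.

```python
SPECIFIC_ACTIVITY_U_PER_MG = 260.0   # conversion factor stated in the text
SUBSTRATE_VOLUME_UL = 150.0          # working substrate solution per well
ENZYME_VOLUME_UL = 50.0              # diluted enzyme per well


def final_well_concentration_mm(working_conc_mm: float) -> float:
    """Concentration in the well after the working solution is diluted by the enzyme."""
    total_ul = SUBSTRATE_VOLUME_UL + ENZYME_VOLUME_UL
    return working_conc_mm * SUBSTRATE_VOLUME_UL / total_ul


def bche_mass_mg(activity_units: float) -> float:
    """Convert measured activity (U) to mg of active BChE."""
    return activity_units / SPECIFIC_ACTIVITY_U_PER_MG


def units_per_mg_tsp(activity_u_per_ml: float, tsp_mg_per_ml: float) -> float:
    """Activity normalized to total soluble protein (U/mg TSP)."""
    if tsp_mg_per_ml <= 0:
        raise ValueError("total soluble protein must be positive")
    return activity_u_per_ml / tsp_mg_per_ml


btch_in_well = final_well_concentration_mm(0.5)    # BTCh, mM
dtnb_in_well = final_well_concentration_mm(0.267)  # DTNB, mM
print(btch_in_well, dtnb_in_well, bche_mass_mg(26.0))
```

With these volumes, the working concentrations of 0.5 and 0.267 mM correspond to about 0.375 mM BTCh and 0.2 mM DTNB in the well.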

#### SDS-PAGE and Western Blot Analysis

Protein samples and controls were subjected to SDS-PAGE using a 4–15% gradient gel (Bio-Rad, Hercules, CA, USA) under non-reducing and reducing conditions (with 5% β-mercaptoethanol; Bio-Rad, Hercules, CA, USA). Briefly, protein samples and controls were heated for 5 min at 95 °C and electrophoresis was performed for 35 min at 200 V. Gels were either stained with Coomassie Brilliant Blue G-250 (Bio-Rad, Hercules, CA, USA) or transferred to a 0.45 µm nitrocellulose membrane (Bio-Rad, Hercules, CA, USA) for 90 min at 100 V. Blots were blocked overnight with 5% non-fat dry milk (NFDM) in 1X PBS, pH 7.4, and washed three times with 1X PBST buffer at 5-min intervals. The blots were developed with either a 1:2,500 dilution of monoclonal anti-FLAG M2-Peroxidase (HRP) antibody (Sigma–Aldrich, St. Louis, MO, USA) in 5% NFDM solution or a 1:200 dilution of mouse monoclonal anti-BChE antibody (D-5; Santa Cruz Biotechnology, Santa Cruz, CA, USA) in 5% NFDM solution followed by a 1:2,000 dilution of goat anti-mouse HRP-conjugated secondary antibody (Santa Cruz Biotechnology, Santa Cruz, CA, USA) in 5% NFDM solution. The blots were incubated for 1 h at room temperature in each antibody solution, after which they were washed with 1X PBST buffer three times for 5 min each. TMB stabilized substrate for horseradish peroxidase (Promega, Madison, WI, USA) was used as the color development substrate. Commercially available eqBChE (Sigma–Aldrich, St. Louis, MO, USA) and PEGylated goat-made butyrylcholinesterase (PEG-rBChE), kindly provided by Dr. Doug Cerasoli (USAMRICD), were used as controls.

# Purification of prBChE from Total Leaf Homogenates and AWF

To determine the optimal extraction buffer, two extraction buffers were compared: 50 mM Tris-HCl, pH 8, 250 mM NaCl with 0.01% Tween 80 (TBS-3) and 50 mM citrate buffer, pH 4, 250 mM NaCl with 0.01% Tween 80 (CBS). Frozen leaves agroinfiltrated with TRBO constructs with (prBChE+KDEL) and without (prBChE) the KDEL sequence were used to purify prBChE-ER and prBChE from total leaf homogenate. Biomass was ground in liquid nitrogen at a ratio of 1 g:4 ml buffer, and the amounts of enzymatically active prBChE and of total soluble protein in the total leaf homogenate were determined.

Based on the buffer screening experiments, CBS buffer was selected for protein recovery from homogenized leaves. The crude extracts (prBChE-ER and prBChE) were filtered through a 0.22 µm filter (EMD Millipore, Chicago, IL, USA) followed by filtration through a 30 kDa Minimate Tangential Flow Filtration Capsule (Pall Corporation, Ann Arbor, MI, USA). The concentrated retentate was loaded on ANTI-FLAG M2 affinity gel (Sigma–Aldrich, St. Louis, MO, USA) and the target protein was captured and eluted according to the manufacturer's specifications. Similarly, the recombinant protein from the AWF was filtered, concentrated using 30 kDa Amicon centrifugal filter units (EMD Millipore, Chicago, IL, USA) and purified using ANTI-FLAG M2 affinity gel (Sigma–Aldrich, St. Louis, MO, USA).

# Glycopeptide Analysis

#### Sample Preparation

Glycopeptide analysis of the different prBChE products was performed according to the method described by Nwosu et al. (2013) with a few modifications. Briefly, 5 µg of each prBChE sample was denatured for 1 h at 57 °C in the presence of dithiothreitol. Alkylation was achieved by iodoacetamide addition and incubation in the dark. The alkylated samples were separated by SDS-PAGE on Any kD Mini-PROTEAN TGX gels (Bio-Rad, Hercules, CA, USA). After staining with Coomassie Brilliant Blue G-250 (Bio-Rad, Hercules, CA, USA) and rinsing in water, the targeted protein bands were excised from the gel, rinsed and destained using alternating washes (three cycles) of 100 mM ammonium bicarbonate (pH 8.0) and pure acetonitrile. Finally, the excised gel pieces were completely dried under vacuum and treated with 100 µl of a 0.005 µg/µl Pronase (Sigma–Aldrich, St. Louis, MO, USA) solution in 100 mM ammonium bicarbonate (pH 8.0). Digestion was allowed to proceed overnight at 37 °C. The supernatant was withdrawn and dried in a SpeedVac (Genevac EZ-2, Stone Ridge, NY, USA) prior to the MS analysis.

#### Mass Spectrometry

Dried samples were reconstituted in 10 µl of deionized water and analyzed using an Agilent 6520 Q-TOF mass spectrometer (Agilent Technologies, Santa Clara, CA, USA). Tandem MS data were acquired in the positive mode in a data-dependent manner following LC separation on a microfluidic chip packed with porous graphitized carbon (PGC; Agilent Technologies, Santa Clara, CA, USA). A binary solvent system was used: A, 3.0% ACN/water (v/v) with 0.1% formic acid (FA); B, 90% ACN/water (v/v) with 0.1% FA. A flow rate of 3 µl/min of solvent A was used for sample loading with a 10 µl injection volume. Samples were eluted with 1% B (0.00–2.50 min); 1–16% B (2.50–20.00 min); 16–44% B (20.00–30.00 min); 44–99% B (30.00–35.00 min); and 99% B (35.00–45.00 min). Mass calibration was enabled using infused reference masses (ESI-TOF tuning mix G1969-85000, Agilent Technologies, Santa Clara, CA, USA). For the tandem MS analysis, ions were subjected to collision-induced fragmentation using collision energies ($V_{\mathrm{collision}}$) that depended on the m/z value of the quasimolecular ions according to the equation $V_{\mathrm{collision}} = 1.8\ \mathrm{V} \times \frac{m/z}{100\ \mathrm{Da}} - 2.4\ \mathrm{V}$. The preferred charge states were set at 2, 3, and >3.
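The elution program and the collision-energy rule can be written down compactly. In the sketch below, the time points and percentages are taken from the text, whereas linear interpolation within each segment is an assumption (the text gives only segment endpoints) and the function names are illustrative.

```python
# (time in min, % solvent B) breakpoints taken from the elution program above
GRADIENT = [(0.0, 1.0), (2.5, 1.0), (20.0, 16.0),
            (30.0, 44.0), (35.0, 99.0), (45.0, 99.0)]


def percent_b(t_min: float) -> float:
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-45 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise RuntimeError("unreachable for a well-formed gradient table")


def collision_energy_v(mz: float) -> float:
    """Collision energy in volts: 1.8 V per 100 Da of m/z, minus 2.4 V."""
    return 1.8 * mz / 100.0 - 2.4


print(percent_b(25.0))             # midpoint of the 16-44% B segment
print(collision_energy_v(1000.0))  # precursor at m/z 1000
```

Under the linear-ramp assumption, the midpoint of the 16–44% B segment corresponds to 30% B, and a precursor at m/z 1000 would be fragmented at about 15.6 V.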

#### Data Analysis

Glycopeptides were assigned based on a combination of accurate mass measurement and tandem MS data. In-house developed software (Glycopeptide Finder; Strum et al., 2013) was used for rapid and automated assignment of the glycopeptides. The assignments were made within a specified mass tolerance (≤20 ppm). To identify glycopeptides in the tandem MS data, product ion spectra were sorted by the presence of carbohydrate-specific oxonium fragment ions. The glycan moieties of the glycopeptides were confirmed by the presence of B-type and Y-type ions derived from the sequence of glycan fragmentations in the product ion spectra. The peptide moieties, however, were mainly confirmed using accurate measurement of their masses in the tandem MS data. Only identifications with a confidence level of 95% (based on the Glycopeptide Finder decoy analysis) were considered.
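For readers unfamiliar with ppm tolerances, the following minimal sketch shows how a mass error in parts per million can be checked against the 20 ppm cutoff. It is not the Glycopeptide Finder implementation; the function names and the example masses are hypothetical.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million."""
    if theoretical_mass <= 0:
        raise ValueError("theoretical mass must be positive")
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass: float, theoretical_mass: float,
                     tol_ppm: float = 20.0) -> bool:
    """True when the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


# Hypothetical masses, for illustration only
print(within_tolerance(1000.01, 1000.0))  # about 10 ppm off
print(within_tolerance(1000.03, 1000.0))  # about 30 ppm off
```

At a mass of 1,000 Da, a 20 ppm window corresponds to ±0.02 Da, which illustrates why peptide moieties of nearly identical mass can remain indistinguishable.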

# Determination of Enzymatic Parameters of Plant Recombinant BChE Protein

The kinetic parameters of hBChE and prBChE were determined using the modified Ellman assay described above. For these determinations, substrate (BTCh) concentrations ranging from 0 to 7.5 mM were used with 0.267 mM DTNB dissolved in 0.1 M phosphate buffer, pH 7.4. The progress of the reactions was monitored under the same conditions (25 °C and pH 7.4 for 5 min). Initial reaction rates were determined in triplicate at each substrate concentration and plotted, and a non-linear regression analysis was applied using GraphPad Prism ver. 6 (La Jolla, CA, USA).

#### Oligomeric State of prBChE Protein

The relative amounts of purified prBChE monomers, dimers, and tetramers were estimated by running hBChE and eqBChE (Sigma–Aldrich, St. Louis, MO, USA) as controls on a native gel. A 7.5% gel (Bio-Rad, Hercules, CA, USA) was used and electrophoresis was performed at 40 V for 7 h at 4 °C. The gel was stained either with Coomassie Brilliant Blue G-250 (Bio-Rad, Hercules, CA, USA) or according to the method of Karnovsky and Roots (1964). The western blot was developed as described earlier.

FIGURE 2 | Quantification of active prBChE protein in crude extracts using the TRBO viral system as a function of sublocalization in the plant cell and extraction methodology. A modified Ellman assay was used to determine the accumulation level of active prBChE protein species localized in different sub-compartments of plant cells in N. benthamiana leaves agroinfiltrated with the TRBO expression system. The apoplast-targeted prBChE was extracted either by leaf homogenization or by collecting the AWF. Specific activity was determined as the ratio of active protein estimated from the Ellman assay relative to total soluble protein estimated by Bradford assay. All data plotted are the average of three independent measurements ± SD.

#### RESULTS

#### Quantifying Differentially Targeted prBChE in Agroinfiltrated N. benthamiana Leaves

The amount of active and differentially targeted prBChE (i.e., with and without the KDEL sequence) produced by TRBO was determined. Total protein from leaf homogenates was extracted with TBS-1, and the level of active prBChE-ER was found to be approximately the same as that of the prBChE enzyme (**Figure 2**). In addition to isolating apoplast-targeted prBChE from agroinfiltrated N. benthamiana leaf homogenates, vacuum infiltration in TBS-2 was used to obtain prBChE in the apoplast wash fluid (prBChE-AWF). The amount of prBChE-AWF was two orders of magnitude lower than that recovered from whole leaf homogenates, which shows that this protein is poorly secreted to the apoplast (**Figure 2**). Due to the lower amount of total protein extracted from the AWF, the specific activity of the prBChE-AWF was approximately twice that obtained from leaf extracts (**Figure 2**). It is worth noting that intracellular plant proteins like RuBisCo (large subunit around 50 kDa) were not detected in the AWF (**Figure 3A**). This may explain the higher specific activity obtained for prBChE-AWF compared to total leaf homogenates (**Figure 2**). **Figure 3** compares the prBChE obtained from the total leaf extract with that recovered from the AWF. As can be seen in **Figures 3A,B**, less prBChE was recovered using the AWF method than by homogenizing whole leaf tissue.

(prBChE pH 8 extract), lane 5: (3 µg of equine BChE control) loaded under non-reduced conditions. Lane M shows the pre-stained protein molecular weight standards along with the molecular weight in kDa. (B) Western blot analysis using 1:200 mouse anti-BChE antibody and 1:2,000 goat anti-mouse HRP conjugated antibody.

# Purification, SDS-PAGE Analysis and LC-MS/MS Analysis of Purified prBChE Protein

Although the specific activity of prBChE-AWF was higher than that from leaf homogenates, it was not used for large-scale purification because of its low yield per unit mass. The two extraction buffers, TBS-3 (pH 8) and CBS (pH 4), were used to extract the leaf homogenates. As shown in **Figure 4A**, many intracellular proteins were extracted by total leaf homogenization using TBS-3 at pH 8. Extracting prBChE in a high-pH buffer and in the presence of many contaminating intracellular proteins (e.g., RuBisCo) lowered the specific activity of prBChE. A comparison of the extractability of enzymatically active prBChE using these two extraction methods is shown in **Figures 4A,B**. The SDS-PAGE analysis (**Figure 4A**) shows that RuBisCo was not extracted by the CBS buffer, in contrast to the TBS-3 buffer extraction. As can be seen in **Figure 4A**, there were no prominent proteins migrating around the 50 kDa marker when tobacco leaves were extracted with the acidic buffer.

Once the leaf homogenate extraction buffer was optimized, prBChE variants (i.e., prBChE-ER and prBChE) were extracted using the CBS buffer and purified using ANTI-FLAG affinity gel, in addition to purification of prBChE from the AWF. The purified prBChE protein variants were compared to a dilution series of the PEG-rBChE protein standard. Although only 7 µg (based on activity) were loaded in lanes 1 and 2 (**Figures 5A,B**), the Western blot shows more intense bands for prBChE than for PEG-rBChE, possibly indicating a less active form of the protein in the prBChE samples.

## Glycopeptide Analysis of Different Subcellular Localized prBChE Protein

For each sample, several glycopeptide compositions were identified. As some of the peptide moieties overlap in mass within the 20 ppm tolerance, site-specific information could not be determined. Nevertheless, as the glycan moieties were unambiguously identified, they were used to compare glycosylation patterns among the different prBChE samples. Extracted glycopeptide ion intensities were grouped and summed by the composition of the glycan moiety and normalized for comparison. The variation in subcellular localization resulted in different glycosylation patterns on the recovered prBChE (**Figure 6**). As expected, the prBChE-ER showed mainly high-mannose structures (>90%) with small percentages of paucimannosidic-type and complex N-linked glycans. The majority of the high-mannose N-linked glycans had 6–8 mannose residues. In contrast, prBChE-AWF consisted mostly of complex-type N-linked glycans. Finally, prBChE yielded a mixture of different N-linked glycan types. Nearly 25% of the N-linked glycans were high-mannose structures, which are characteristic of the ER-retained prBChE protein. For prBChE, approximately 40% of the total observed N-linked glycans were complex N-linked glycans. A significant amount (approximately 35%) of paucimannosidic-type N-linked glycans was also found in the apoplast-targeted prBChE. The difference in N-glycosylation between prBChE and the apoplast-targeted prBChE-AWF can be explained by differences in glycan processing as the protein moves along the secretory pathway from the ER to the trans-Golgi network.
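The grouping and normalization step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: it assumes that normalization means dividing each group's summed intensity by the total over all groups, and the intensities in the example are arbitrary numbers.

```python
from collections import defaultdict


def relative_abundance(glycopeptides):
    """Sum ion intensities per glycan group and express each as % of the total.

    glycopeptides: iterable of (glycan_group, ion_intensity) pairs.
    """
    totals = defaultdict(float)
    for glycan_group, intensity in glycopeptides:
        totals[glycan_group] += intensity
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total ion intensity must be positive")
    return {group: 100.0 * value / grand_total for group, value in totals.items()}


# Arbitrary intensities for illustration only (not data from this study)
example = [
    ("high-mannose", 10.0),
    ("complex", 30.0),
    ("high-mannose", 15.0),
    ("paucimannosidic", 45.0),
]
profile = relative_abundance(example)
print(profile)
```

Grouping by glycan type rather than by individual glycopeptide is what allows samples to be compared even when the peptide moiety, and hence the glycosylation site, is ambiguous.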

#### Enzymatic Properties and Oligomeric State of prBChE Protein

The prBChE protein variant was selected for further characterization since it contains the highest amount of complex N-linked glycans, which are suitable for further in vitro modification including sialylation. The kinetics of butyrylthiocholine (BTCh) hydrolysis by prBChE were compared to those of the hBChE serum control. As previously described by Radic et al. (1993), BChE enzyme kinetics conform to a substrate activation kinetic model when BTCh is used as a substrate. The rate of reaction was monitored over a wide range of substrate concentrations, from 10 µM to 7.5 mM (**Figure 7**). Non-linear regression analysis shows that the K<sub>m</sub> of prBChE for BTCh was 39 ± 19 µM, compared to 61 ± 40 µM for the hBChE control. However, the turnover number of BTCh hydrolysis (k<sub>cat</sub>) for serum-derived hBChE was higher than that of prBChE.
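The text does not write out the substrate activation model. A commonly used form, assumed here, multiplies the Michaelis-Menten term by a factor that depends on substrate binding to a second site: $v = \frac{V_{max}}{1 + K_m/[S]} \cdot \frac{1 + b[S]/K_{ss}}{1 + [S]/K_{ss}}$, where $b > 1$ corresponds to activation at high substrate concentration. The sketch below only evaluates this expression; the parameter values in the example are placeholders, not fitted values from this study.

```python
def substrate_activation_rate(s, vmax, km, kss, b):
    """Reaction rate under the assumed substrate activation model.

    s, km and kss share one concentration unit; b is dimensionless.
    With b == 1 the expression reduces to Michaelis-Menten kinetics.
    """
    if s < 0:
        raise ValueError("substrate concentration cannot be negative")
    if s == 0:
        return 0.0
    michaelis_term = vmax / (1.0 + km / s)
    activation_term = (1.0 + b * s / kss) / (1.0 + s / kss)
    return michaelis_term * activation_term


# Placeholder parameters (concentrations in µM), chosen only for illustration
for s_um in (10.0, 100.0, 1000.0, 7500.0):
    v = substrate_activation_rate(s_um, vmax=1.0, km=50.0, kss=1000.0, b=3.0)
    print(s_um, round(v, 3))
```

Fitting such a model to measured initial rates requires a non-linear least-squares routine, as was done in GraphPad Prism for the data reported here.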

Different oligomers of prBChE were observed by fractionating the protein by SDS-PAGE under reduced and non-reduced conditions and on corresponding immunoblots (**Figure 8**). Under non-reduced conditions, the prBChE was observed migrating as monomeric, dimeric, and possibly tetrameric structures. Under reducing conditions, monomers of prBChE were the predominant form due to the reduction of intermolecular disulfide bonds. To estimate the various types and proportions of prBChE oligomers, native PAGE analysis was performed under non-denaturing conditions. The recombinant protein showed an oligomeric state similar to that of the human and equine serum controls. Almost all the protein was in the tetrameric state (**Figures 9A–C**). Denatured prBChE and the equine serum control were analyzed using native PAGE and the migration patterns were found to be similar (**Figure 10**). This indicates that the native form of prBChE is a tetrameric protein. Furthermore, native gel electrophoresis was used to separate the various molecular species in the crude leaf homogenate. This revealed only tetrameric prBChE relative to the eqBChE (**Figures 11A,B**).

activity (prBChE vs. human BChE control) over a large substrate concentration (10 µM–7.5 mM) in 0.1 M phosphate buffer pH 7.4, 0.267 mM DTNB.

## DISCUSSION

In the quest for a safe, abundant and affordable source of rBChE from plants, studies by the Lockridge group (Blong et al., 1997; Ngamelue et al., 2007; Biberoglu et al., 2012; Larson et al., 2014) and others (Masson et al., 2003; Schneider et al., 2014a,b) have focused attention on the desirable pharmacokinetic (PK) properties of prBChE. For prBChE, desirable PK properties include functionality, reduced risk of immunogenicity and allergenicity, and a longer mean residence time (MRT) in the bloodstream afforded by sialylation of terminal galactose residues on N-linked glycans (Schneider et al., 2014a) and tetramerization of the BChE monomers. The production of large quantities of any plant-made biopharmaceutical lacking its key PK properties does little to provide effective therapeutic solutions to natural or man-made threats to human health.

In this report, we described the expression of rBChE in N. benthamiana in a manner that yielded enzymatically active protein that was more than 95% tetrameric (**Figures 9–11**). Enzymatic activity and electrophoretic mobility on native gels were similar to those seen with eqBChE and hBChE controls (**Figures 9–11**). In terms of yield, the TRBO expression vector produced prBChE-ER at 0.78 U/mg total soluble protein, which was comparable to that obtained by Geyer et al. (2010a; geometric mean = 0.7 U/mg protein) for the expression of ER-retained prBChE from transgenic N. benthamiana.

shows the pre-stained protein molecular weight standards along with the molecular weight in kDa.

Our findings are generally consistent with those of Geyer et al. (2010b), who showed that approximately 50% of their prBChE purified from N. benthamiana leaf homogenates was enzymatically active tetramer. As reported by Schneider et al. (2014b), the presence of a FLAG tag on the N-terminus of our prBChE did not affect its enzymatic activity or ability to be purified. Although we observed only tetrameric prBChE on native gels (**Figure 11B**) in all of our extracts (ER, AWF, and Apo), we cannot compare our findings to those of Schneider et al. (2014b) because of significant differences between our protocols. For example, Schneider et al. (2014b) used non-reducing gel electrophoresis to estimate oligomerization status, while we used native gels. In addition, different viral-based expression vectors were used. At present, there is no clear explanation for the differences in oligomeric status and yield between our findings and those of Schneider et al. (2014b). Additional experiments are underway to identify the factor(s) responsible for these differences.

FIGURE 10 | (A) Coomassie stained and (B) Western blot of Native gel analysis of denatured prBChE, compared to denatured equine BChE control. (A) Coomassie stained gel, lane 1: 1 µg equine BChE control under native conditions, lane 2: 1 µg of denatured equine BChE control under reducing conditions, lane 3: 1 µg prBChE under native conditions, lane 4: 1 µg of denatured prBChE under reducing conditions. (B) Western blot analysis developed with 0.5 µg equine BChE control and prBChE using 1:200 mouse anti-BChE antibody and 1:2,000 goat anti-mouse HRP conjugated antibody.

A more fundamental question is how monomers of an 85 kDa mammalian enzyme oligomerize into tetramers in the milieu of the plant ER. Although determining the mechanism of prBChE tetramerization in planta is beyond the scope of this report, two possible explanations should be considered. First, mammalian proteins like hBChE may tetramerize in the ER or during transport through the secretory pathway. This is supported by N-glycan profiling of the purified prBChE, which shows 25% of its N-glycans with high-mannose structures (**Figure 6**). Alternatively, prBChE dimers may form tetramers as cellular contents are released by homogenization into the extraction environment. Regardless of which possibility is correct, both require that the entropic penalty of organizing prBChE into highly enthalpic tetramers be paid by reducing ordered water molecules in and around prBChE as it proceeds along the assembly pathway (Ali and Imperiali, 2005; Fatmi and Chang, 2010). Furthermore, whether in plant or mammalian cells, the oligomerization of proteins is highly complex and sensitive to numerous intracellular and extracellular factors such as protein concentration, temperature, pH, hydrogen and ionic bonding, hydrophobic interactions, phosphorylation, domain swapping and ligand binding (Ali and Imperiali, 2005; Gotte and Libonati, 2014). While BChE dimers are formed from intermolecular disulfide bonds at Cys571, tetramer assembly involves non-covalent effects and interactions such as those described above. A priori, this makes tetramerization of BChE highly sensitive to these factors and difficult to predict and control. Interestingly, the pH along the secretory pathway has been found to form a gradient from the ER to the plasma membrane of 7.2 to 5.2 in mammals (Paroutis et al., 2004) and 6.8 to 5.2 in plants (Bassil et al., 2012). Whether a pH gradient contributes in any way to BChE tetramerization is a matter of speculation at this time.

Regarding ligand binding, it has been well documented that hBChE contains a C-terminal tetramerization domain that interacts non-covalently in vivo with small, naturally occurring proline-rich peptides (PRPs), 12 to 21 residues long, to promote tetramerization (Nicolet et al., 2003; Dvir et al., 2004; Pan et al., 2009; Biberoglu et al., 2012). It was further shown that these PRPs do not merely catalyze tetramerization but become part of the BChE tetrameric complex. In another study, synthetic PRPs ranging from 15 to 50 residues were used to promote in vitro tetramerization of CHO cell-derived rBChE in a concentration-, temperature-, and time-dependent manner (Larson et al., 2014). These results are consistent with studies on the oligomerization of multi-subunit enzymes showing that ligand binding lowers the conformational transition barrier and helps the protein shift from its inactive to its active conformation (Ali and Imperiali, 2005; Fatmi and Chang, 2010). This interaction may help explain the role of PRPs in the tetramerization of rBChE from mammalian sources.

The implications of these studies are important for those planning to use plants to express prBChE, in that they suggest the tetramerization of hBChE depends on its non-covalent interaction with a PRP, the likes of which have not been observed in plants. Although hydroxyproline-rich, O-linked glycopeptides (HRGPs) are found in both monocotyledonous and dicotyledonous plants, they are largely interspersed repeats of hydroxyproline (Albenne et al., 2009), not the tandem repeats found in the PRPs associated with native human BChE. Another interesting feature of plant HRGPs is that the hydroxyproline repeats are frequently interspersed with charged amino acids (e.g., lysine and glutamic acid), making them sensitive to changes in pH. The fact that tetrameric prBChE can be detected in N. benthamiana leaf extracts (Geyer et al., 2010a,b; Schneider et al., 2014b) raises the possibility that plant HRGPs can substitute for mammalian PRPs to promote tetramerization of prBChE dimers. We hypothesize that the plant ER can pay the entropic penalty of oligomerization by drawing upon its vast array of processes, mechanisms, signals and possibly HRGPs to tetramerize prBChE in a concentration-, temperature-, and pH-dependent manner. Studies are currently underway to investigate the effect of these independent variables on oligomerization of BChE in plant tissues and plant cell suspension cultures.

### CONCLUSION

The objective of this study was to investigate methods for expressing active, tetrameric prBChE in N. benthamiana leaf homogenates and AWF with the goal of increasing its desirable PK properties. We were able to produce prBChE variants using the TRBO viral expression vector and to purify them to near homogeneity using affinity chromatography. The pure prBChE was found to be a tetrameric protein, as shown by Coomassie-stained SDS-PAGE gels and native gel electrophoresis. We understand that there are other strategies for making prBChE a "biobetter" therapeutic using glycan engineering of plant N-linked glycans. For example, the terminal galactose residues on the N-linked glycans of prBChE can serve as substrates for in vitro (Malekan et al., 2013) or in vivo (Paccalet et al., 2007; Schneider et al., 2014a) sialylation to help extend the MRT of prBChE in human blood. Also, N. benthamiana fucosyl-transferase and xylosyl-transferase (ΔFT/XT) knock-down lines can be used to produce prBChE with reduced immunogenicity and/or allergenicity (Schneider et al., 2014b). While the 3XFLAG tag was used to facilitate purification of prBChE, it can be removed enzymatically prior to use as an injectable therapeutic to reduce the risk of immunogenicity. Lastly, the expression of biopharmaceuticals like prBChE in plant cell suspension cultures using well-established fermentation technology may be the shortest path to achieving scalability, affordability and regulatory approval of a plant-made biopharmaceutical. We believe that all expression, purification and production options should be explored to help meet the global demand for the next generation of safe, efficacious and affordable biopharmaceuticals derived from plants.

# AUTHOR CONTRIBUTIONS


Conceived and designed the experiments: KM, SA, KK, SN, RR, CL, AD, and BWF. Performed the experiments: SA, BH, AG, MSH, AMT, MLP, and LA. Analyzed the data: SA, KM, KK, SN, RR, AG, and CL. Wrote the paper: SA, KM, RR, and AG. All authors read, revised, and approved the manuscript.

#### FUNDING

The work presented was funded by the Defense Advanced Research Projects Agency (DARPA; #HR0011-12-1-0011), the Defense Threat Reduction Agency (DTRA; #HDTRA1-15-1-0054) and the National Science Foundation (NSF; #IIP-1343481). None of the funding sources had a decisive role in the design or execution of the experiments, or in the interpretation of the results.

# ACKNOWLEDGMENTS

The authors would like to acknowledge Dr. Doug Cerasoli (USAMRICD) for his kind gift of PEGylated BChE protein and Dr. Lloyd Yu (Planet Biotechnology Inc.) for his advice on the purification of recombinant proteins from N. benthamiana.

#### REFERENCES

acetyl- and butyrylcholinesterase inhibitors. Biochemistry 32, 12074–12084. doi: 10.1021/bi00096a018


**Conflict of Interest Statement:** KM is a cofounder of Inserogen, Inc., a plant-based biotechnology startup company with a focus on the development of orphan biologics for replacement therapy.

The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Alkanaimsh, Karuppanan, Guerrero, Tu, Hashimoto, Hwang, Phu, Arzola, Lebrilla, Dandekar, Falk, Nandi, Rodriguez and McDonald. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Semicontinuous Bioreactor Production of Recombinant Butyrylcholinesterase in Transgenic Rice Cell Suspension Cultures

Jasmine M. Corbin<sup>1</sup>, Bryce I. Hashimoto<sup>1</sup>, Kalimuthu Karuppanan<sup>1</sup>, Zachary R. Kyser<sup>1</sup>, Liying Wu<sup>2</sup>, Brian A. Roberts<sup>3</sup>, Amy R. Noe<sup>3</sup>, Raymond L. Rodriguez<sup>4</sup>, Karen A. McDonald<sup>1</sup> and Somen Nandi<sup>4</sup>\*

<sup>1</sup> Chemical Engineering and Materials Science, University of California, Davis, Davis, CA, USA, <sup>2</sup> Arcadia Biosciences, Davis, CA, USA, <sup>3</sup> Leidos, Inc., Frederick, MD, USA, <sup>4</sup> Global HealthShare®, Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA

An active and tetrameric form of recombinant butyrylcholinesterase (BChE), a large and complex human enzyme, was produced via semicontinuous operation in a transgenic rice cell suspension culture. After transformation of rice callus and screening of transformants, the cultures were scaled up from culture flask to a lab scale bioreactor. The bioreactor was operated through two phases each of growth and expression. The cells were able to produce BChE during both expression phases, with a maximum yield of 1.6 mg BChE/L of culture during the second expression phase. Cells successfully regrew during a 5-day growth phase. A combination of activity assays and Western blot analysis indicated production of an active and fully assembled tetramer of BChE.

Edited by: Kazuhito Fujiyama, Osaka University, Japan

#### Reviewed by:

Guotian Li, University of California, Davis, USA; Nozomu Koizumi, Osaka Prefecture University, Japan; Jochen Büchs, RWTH Aachen University, Germany

#### \*Correspondence:

Somen Nandi snandi@ucdavis.edu

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 12 December 2015 Accepted: 17 March 2016 Published: 31 March 2016

#### Citation:

Corbin JM, Hashimoto BI, Karuppanan K, Kyser ZR, Wu L, Roberts BA, Noe AR, Rodriguez RL, McDonald KA and Nandi S (2016) Semicontinuous Bioreactor Production of Recombinant Butyrylcholinesterase in Transgenic Rice Cell Suspension Cultures. Front. Plant Sci. 7:412. doi: 10.3389/fpls.2016.00412

Keywords: butyrylcholinesterase, rice amylase 3D, inducible promoter, plant cell culture, semicontinuous culture

# INTRODUCTION

Butyrylcholinesterase (BChE, EC 3.1.1.8) is a native human serine hydrolase enzyme that has been shown to function as a bioscavenger against various organophosphorus nerve agents, with both prophylactic and therapeutic applications (Lenz et al., 2005). BChE is a complex tetrameric protein comprised of four identical 85 kDa monomers, each with 9 N-linked glycosylation sites (Lockridge, 2015). BChE can be purified from expired human blood plasma, but low yields and high costs (anticipated >US\$10,000 per treatment dose) limit its use (DARPA, 2012). Therefore, there is a pressing need for a cost-effective platform for recombinant production of BChE.

Recombinant BChE has been successfully produced in several expression systems, including transgenic goats and mice (Huang et al., 2007), insect cells (Brazzolotto et al., 2012), transgenic plants (Geyer et al., 2010), transient expression in plants (Schneider et al., 2014a), and CHO cells (Ilyushin et al., 2013; Terekhov et al., 2015). However, a major limitation in the production of recombinant BChE is the need to produce tetrameric BChE, which has a significantly longer circulatory half-life than the dimeric or monomeric forms (Duysen et al., 2002). Many of these systems show incomplete tetramerization of the molecule (Huang et al., 2007; Geyer et al., 2010; Brazzolotto et al., 2012; Schneider et al., 2014b). While there has been success with production of fully tetrameric BChE in CHO cells through coexpression of the enzyme with a proline-rich peptide (Terekhov et al., 2015), mammalian systems are susceptible to contamination with mammalian pathogens and require extensive regulatory clearances for human therapeutic use. Whole plant systems can avoid this problem, but also require specialized facilities to grow and harvest transgenic material or to transiently express foreign proteins.

Plant cell suspension cultures, however, offer many advantages over alternative expression systems for the production of human therapeutics. These include (1) a simple, low cost, animal component-free, chemically-defined medium; (2) lack of susceptibility to contamination with mammalian pathogens; and (3) the ability to perform complex post-translational modifications such as glycosylation (Huang and McDonald, 2012). Unlike with whole plant systems, plant cell cultures can make use of already existing cell culture infrastructure. Thus, a plant cell suspension culture is well suited to address the difficulties associated with production of BChE.

In particular, the use of the rice alpha amylase 3D (RAmy3D) expression system in a rice cell suspension culture enables efficient, high-level expression of foreign proteins (Huang et al., 2001). The RAmy3D system contains an inducible promoter that is activated by sugar starvation and a signal peptide that tags the protein for secretion into the culture medium. This allows for a cyclical or semicontinuous operation of the culture in which the cells are subjected to multiple phases of growth and expression by alternating between a sugar-rich and sugar-free medium. In addition, use of an inducible expression system may alleviate some of the problems associated with reduced productivity over long culture times that have been observed for plant cell cultures that utilize constitutive expression systems (Raven et al., 2015). Semicontinuous operation can further reduce culture costs by eliminating the shut down and start up time between runs that would be necessary in a batch culture system. Although RAmy3D has been successfully used for the production of other recombinant proteins in rice cell culture (Huang et al., 2002; Trexler et al., 2005; Park et al., 2010), it has not been used to produce a molecule as large and complex as BChE.

To address the need for effective, scalable production of active BChE, we have designed and studied a transgenic rice cell suspension culture for the production of BChE using the RAmy3D expression system. This study demonstrates semicontinuous bioreactor production of an active form of BChE, which, with continued development, can provide a reliable system for production of this molecule as a therapeutic or prophylactic.

# MATERIALS AND METHODS

#### Expression Vector Design and Cloning

The native human BChE (NCBI NM\_000055) coding sequence without the native secretion signal peptide was modified for expression in rice and inserted into a vector (pUC57) containing the RAmy3D promoter, signal peptide, and terminator sequences using GenScript (GenScript, Piscataway, NJ). The native human BChE coding sequence was codon optimized for expression in rice and analyzed using Visual Gene Developer (http://visualgenedeveloper.net). The sequence was modified without any changes in the amino acid sequence of the mature protein. The overall GC content of the DNA sequence was increased from 40.2 to 51.4%, and the codon adaptation index was changed from 0.68 to 0.83. The final construct was then cloned in a binary vector (pCAMBIA2300) and incorporated via heat shock into Agrobacterium tumefaciens strain EHA 105 for transformation.
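As an illustration of the GC content metric reported above, the following minimal sketch computes the percentage of G and C bases in a DNA sequence. The two short sequences are hypothetical toy examples that encode the same four amino acids (Met-Lys-Phe-Ala) with different synonymous codons; they are not the BChE coding sequence, and the function name is our own.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Hypothetical toy sequences (same amino acids, different synonymous codons).
at_rich_variant = "ATGAAATTTGCA"   # ATG AAA TTT GCA
gc_rich_variant = "ATGAAGTTCGCC"   # ATG AAG TTC GCC

print(f"AT-rich variant: {gc_content(at_rich_variant):.1f}% GC")
print(f"GC-rich variant: {gc_content(gc_rich_variant):.1f}% GC")
```

Synonymous codon substitution of this kind raises the GC content without altering the encoded protein, which is the principle behind the optimization described above.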

#### Transformation and Selection of Callus

The transformation was performed as described in Huang et al. (2001) with minor modifications. Callus derived from seeds of Oryza sativa cv. Taipei 309 was co-cultivated with A. tumefaciens containing the described vector at OD<sub>600</sub> = 1.0 for 20 min. The callus was transferred to semi-solid medium and incubated in the dark at 25°C for 3 days before washing three times with a solution of 200 mg/L timentin and 250 mg/L cefotaxime. Finally, the callus was transferred to a sucrose (S) rich semi-solid selection medium ("NB") that contains N6 macronutrients (Chih-Ching et al., 1975), B5 micronutrients and vitamins (Gamborg et al., 1968), 2 mg/L 2,4-dichlorophenoxyacetic acid (2,4-D), 250 mg/L L-proline, 250 mg/L L-glutamine, 300 mg/L casein hydrolysate, 30 g/L sucrose, 1.8 g/L gelzan, and 50 mg/L geneticin as the selection antibiotic, denoted as NB+S. Previous studies show that supplementation with proline and glutamine promotes callus induction (Pawar et al., 2015).

#### Screening of Transformed Callus

Transformed callus was subjected to eight rounds of screening at increasingly larger scales in selection media containing geneticin (as the selection antibiotic). The first round of screening evaluated 310 transformation events and was performed in a 96 well plate. Callus was transferred from semi-solid NB+S to a well containing liquid NB+S (same composition as described above) and stored in an Innova 4000 incubator/shaker (Eppendorf, Inc., Hauppauge, NY) at 60 rpm and 27°C in the dark for 5 days. Induction was performed by sterilely pipetting off the spent NB+S medium and replacing it with an equal volume of NB−S medium, which has the same composition as NB+S except that the sucrose (S) is replaced with 8 g/L mannitol. Samples of both the spent medium and biomass were collected and analyzed at 120 h after induction. After the fourth round of screening, the top 10 transformation events were used to establish shake flask cultures. Flasks with nominal volumes between 125 mL and 1 L were filled to 1/5 full and maintained at 140 rpm and 27°C in the dark in the same incubator/shaker. Flask screenings were performed in the same manner as well plate screenings. After the 8th round of screening, a single, top-performing cell line was selected based on cell physiology (healthy appearance, light color, friable aggregates) and BChE expression stability, and suspension cultures of this line were maintained through weekly sub-culturing with fresh liquid NB+S medium.

#### Bioreactor Operation

Cultures were scaled up for operation in a 5 L bioreactor (BioFlo 3000, Eppendorf, Inc., Hauppauge, NY) equipped with a single pitched blade impeller. To obtain small and consistently sized aggregates, cultures were sieved at the time of inoculation by pressing the biomass through a 280 µm pore size stainless steel mesh sieve (Sigma Aldrich, St. Louis, MO).

The bioreactor was operated under conditions similar to those described in Trexler et al. (2002) with slight modifications. The bioreactor was maintained at 27°C, 75 rpm (agitation speed), and 40% (air saturation) dissolved oxygen (Mettler Toledo O<sub>2</sub> sensor). The dissolved oxygen level in the culture was controlled by altering the concentration of oxygen, nitrogen, and air in the gas sparging stream, and the overall gas flow rate was maintained between 0.2 and 0.4 vvm. The oxygen uptake rate was determined by measuring the change in dissolved oxygen in the culture in the absence of aeration. The pH of the culture was monitored but not controlled. The cultures were grown under ambient light, and the conditions were identical for both growth and expression phases.
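The oxygen uptake rate measurement described above can be sketched as a least-squares fit of the dissolved oxygen decline recorded while aeration is off. This is a minimal illustration rather than the procedure used in this study: the function name, the example readings, and the assumed oxygen concentration at 100% air saturation (about 0.25 mmol/L, an approximate value for water near 27°C) are all assumptions.

```python
def oxygen_uptake_rate(time_h, do_percent, c_sat_mmol_per_l=0.25):
    """Estimate OUR in mmol O2/(L*h) from a dissolved oxygen trace.

    time_h: sample times in hours, recorded with aeration switched off.
    do_percent: dissolved oxygen readings in % air saturation.
    c_sat_mmol_per_l: ASSUMED O2 concentration at 100% air saturation.
    """
    n = len(time_h)
    if n < 2 or n != len(do_percent):
        raise ValueError("need at least two paired readings")
    mean_t = sum(time_h) / n
    mean_do = sum(do_percent) / n
    sxx = sum((t - mean_t) ** 2 for t in time_h)
    sxy = sum((t - mean_t) * (d - mean_do) for t, d in zip(time_h, do_percent))
    slope = sxy / sxx  # % air saturation per hour (negative while O2 is consumed)
    return -slope / 100.0 * c_sat_mmol_per_l


# Hypothetical trace: DO falls from 40% to 34% over 0.03 h (about 2 min).
times = [0.00, 0.01, 0.02, 0.03]
readings = [40.0, 38.0, 36.0, 34.0]
print(f"OUR = {oxygen_uptake_rate(times, readings):.2f} mmol O2/(L*h)")
```

Dividing the result by the dry weight concentration (g DW/L) would give a specific OUR in mmol O<sub>2</sub>/(g DW·h).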

Induction of BChE expression in the bioreactor was performed using gravity sedimentation. After the biomass was allowed to settle, a peristaltic pump and sterile tube welder were used to remove the spent NB+S medium and replace it with fresh NB−S medium. For semicontinuous operation, another growth phase was initiated by following the same procedure described above, by removing the NB−S medium and replacing with NB+S medium.

#### Biomass Measurements

Four well-mixed 10 mL samples were taken every 24 h (48 h during the initial growth period) from the reactor under sterile conditions through the sampling port. One sample was set aside for protein extraction and quantification, and the remaining three were used to determine the fresh weight (FW) and dry weight (DW) of cells by washing them in 10 mL double distilled water (to remove residual sugars) on a pre-weighed 1.6 µm Binder-Free Glass Microfiber filter (Whatman GF/A 4.7 cm, GE Healthcare Life Sciences, Pittsburgh, PA). The combined filter and biomass was weighed immediately to obtain the FW, then dried in an oven at 65°C for 24 h and weighed again to obtain the DW.
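The arithmetic behind these measurements can be sketched as follows: the filter tare is subtracted from each combined weight, and the result is divided by the 10 mL sample volume to give a concentration in g/L. The function name and the example weights are hypothetical, and the sketch assumes the pre-weighed filter mass is an adequate tare for the measurement shown.

```python
from statistics import mean, stdev


def biomass_g_per_l(filter_g, filter_plus_biomass_g, sample_ml=10.0):
    """Convert a tared filter weight into a biomass concentration (g/L)."""
    biomass_g = filter_plus_biomass_g - filter_g
    return biomass_g / (sample_ml / 1000.0)


# Hypothetical triplicate dry weights: (filter tare, filter + dried biomass), in g.
replicates = [(0.0900, 0.1137), (0.0905, 0.1136), (0.0898, 0.1141)]
dw = [biomass_g_per_l(tare, total) for tare, total in replicates]
print(f"DW = {mean(dw):.2f} ± {stdev(dw):.2f} g/L")
```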

#### Sugar Analysis

The concentrations of sucrose and glucose in the culture medium were measured using a YSI 2900 Biochemistry Analyzer (Xylem, Inc., Rye Brook, NY). Cell-free medium from reactor sampling was passed through a 0.22 µm syringe filter to remove any remaining cells and stored at 4°C until analysis. The YSI 2900 enzymatically measures sucrose and glucose concentrations in a given sample and can account for any interaction between the signals from each. Samples were measured at 2X and 4X dilutions with double distilled water to ensure concentrations of each sugar were within the accurate detection range of the instrument.

#### Quantification of Active Intracellular BChE

Concentration of active BChE was measured separately in the medium (by analyzing crude culture medium) and associated with the cell mass (by analyzing the crude cell extract). The cell extract was prepared by grinding the callus in a 1:1 ratio in cold homogenization buffer (100 mM sodium phosphate, 100 mM NaCl, pH 7.4) using a Tissue Tearor (Biospec Products, Bartlesville, OK) operated at maximum speed for 30 s. The homogenized sample was then centrifuged at 14,000 g for 10 min, and the supernatant was saved in a new tube and stored at 4°C until analysis.

Active BChE concentration in each sample was measured using a modified Ellman activity assay (Ellman et al., 1961). The assay monitors the hydrolysis of butyrylthiocholine in the presence of Ellman's reagent in 100 mM sodium phosphate buffer, pH 7.4. This reaction was monitored using a spectrophotometer (SpectraMax 340PC, Molecular Devices, Sunnyvale, CA) to measure the absorbance at 405 nm over a period of 3 min at 25°C. Samples were concentrated (with a 30 kDa ultrafiltration membrane) or diluted to fall within the accurate detection range of 0.5–1.0 ng BChE/µL sample. Once the concentration of BChE from a sample was determined, it was normalized to the mass of fresh weight biomass, volume of extraction buffer, or volume of total culture.

#### Quantification of Total Soluble Protein

The total soluble protein (TSP) in a sample was measured using a Bio-Rad Protein Assay kit (Bio-Rad, Hercules, CA), which is based on the method of Bradford (Bradford, 1976). The assay was performed per manufacturer's instructions. Each sample was analyzed in triplicate, including protein standards used to generate a standard curve.
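A standard-curve calculation of the kind used in Bradford-type assays can be sketched as a linear fit of absorbance against known protein concentrations, followed by interpolation of the mean of triplicate sample readings. The standards, absorbance values, and function names below are hypothetical and are not taken from the kit instructions or from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_from_absorbance(replicate_abs, slope, intercept):
    """Interpolate protein concentration from the mean of replicate readings."""
    mean_abs = sum(replicate_abs) / len(replicate_abs)
    return (mean_abs - intercept) / slope


# Hypothetical protein standards (mg/mL) and their absorbance readings.
standards_mg_per_ml = [0.0, 0.2, 0.4, 0.6, 0.8]
standards_abs = [0.00, 0.10, 0.20, 0.30, 0.40]
slope, intercept = fit_line(standards_mg_per_ml, standards_abs)

# Hypothetical triplicate readings for one sample.
tsp = protein_from_absorbance([0.24, 0.25, 0.26], slope, intercept)
print(f"TSP = {tsp:.2f} mg/mL")
```

Readings that fall outside the range of the standards should be re-measured after dilution rather than extrapolated.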

#### Gel Electrophoresis and Immunoblotting

SDS-PAGE was performed using a 4–15% gradient gel, and native-PAGE using a 7.5% gel (Mini-PROTEAN precast gels, Bio-Rad, Hercules, CA). Western blotting was performed using a 1:200 dilution of mouse anti-BChE IgG as the primary antibody and a 1:2000 dilution of goat anti-mouse IgG-HRP as the secondary antibody (Santa Cruz Biotechnology, Dallas, TX), and developed by incubation with 3,3′,5,5′-tetramethylbenzidine (TMB) substrate. PEGylated recombinant human BChE derived from transgenic goats (PharmAthene, Inc, Annapolis, MD) and native equine serum BChE (Sigma Aldrich, St. Louis, MO) were used as controls.

# RESULTS

#### Transformation and Screening

The rice-optimized BChE gene construct was cloned into the RAmy3D (Huang et al., 2001) expression system (**Figure 1**) and subsequently into A. tumefaciens for gene delivery by cocultivation. Stable integration of incoming recombinant DNA into cellular DNA is largely a random process, and accordingly, the sites of integration are dispersed throughout the genome. The ability to backcross and screen independent transgenic events can produce more stable arrangements and expression of the transgene in whole plants (Chih-Ching et al., 1975; Park et al., 2010). For transgenic cell lines, however, the process is more complex. Although primary calli can be selected for a particular antibiotic resistance and screened for the expression of a particular recombinant protein, genetic and epigenetic changes that occur during multiple cycles of cell differentiation and dedifferentiation can produce somaclonal variation in the calli (Gamborg et al., 1968). These somaclonal variants can exhibit a wide range of morphological and physiological phenotypes, including variations in the expression of the transgene.

To meet our goal for stable protein expression over many generations of microcalli propagation, extensive screening was performed to identify the lines that provide optimal protein production. Over 1000 transformants from 310 independent transformation events were initially selected and grown on semi-solid selection medium. Of these, 105 events produced a detectable amount of BChE, with varying levels of expression (data not shown). **Figure 2** shows the combined amounts of active secreted and cell-associated BChE from the top 20 transformation events during the first screening. After 8 rounds of screening, one cell line (identified as line "9–2") was selected based on the expression stability and cell-line physiology for continued development and scale up, and was used for all remaining analyses described in this study.

#### Cell Growth, Oxygen Uptake, Sugar Consumption, and BChE Expression Kinetics

The semicontinuous bioreactor culture was operated for 31 days and underwent two cycles each of growth and expression. **Figure 3A** shows the growth of the biomass over the duration of the culture. The first growth phase (Growth 1) had an initial biomass concentration of 0.21 ± 0.07 g DW/L of culture and lasted 16 days, while the second growth phase (Growth 2) had an initial biomass concentration of 2.37 ± 0.06 g DW/L and lasted 5 days. Growth 1 exhibited a long lag phase of 4 days followed by a long exponential phase of 12 days prior to induction, while Growth 2 had a lag of <1 day and a 5 day exponential phase. This difference is likely due to the difference in the initial biomass concentration during each growth phase. During each expression phase, where NB+S was removed and replaced with NB−S, the biomass concentration immediately dropped as some of the biomass was pumped out of the reactor along with the spent medium. However, at the beginning of expression, the biomass concentration increased as the cells received fresh medium components such as nutrients and amino acids. The biomass concentration soon leveled off and began to drop as the cells starved from sugar deprivation, until a new growth phase was initiated. **Table 1** summarizes the cell growth kinetics for both growth phases. The maximum specific growth rate (µmax) increased from 0.15 ± 0.01 day<sup>−1</sup> during Growth 1 to 0.22 ± 0.01 day<sup>−1</sup> during Growth 2, which correlated to a decrease in doubling time from 4.7 ± 0.3 to 3.2 ± 0.1 days.
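The relationship between the maximum specific growth rate and the doubling time follows from exponential growth, where the doubling time equals ln(2)/µmax. The short sketch below applies this standard relation to the rounded growth rates quoted above; the function name is our own, and small differences from the reported doubling times are to be expected because the published µmax values are rounded.

```python
import math


def doubling_time_days(mu_per_day: float) -> float:
    """Doubling time for exponential growth with specific growth rate mu."""
    if mu_per_day <= 0:
        raise ValueError("specific growth rate must be positive")
    return math.log(2) / mu_per_day


for phase, mu in [("Growth 1", 0.15), ("Growth 2", 0.22)]:
    print(f"{phase}: mu = {mu} per day, doubling time = {doubling_time_days(mu):.1f} days")
```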

The oxygen uptake rate (OUR) of the culture was also measured throughout operation as an indicator of the culture's metabolic activity (**Figure 3B**). A rise in OUR correlates with cell growth, and a drop correlates with the onset of the stationary phase. Our previous studies with Nicotiana benthamiana cell culture showed that the maximum value of OUR correlated with late exponential phase, and induction at this stage can lead to higher expression of a heterologous target protein (Huang et al., 2010). Thus, each expression phase was initiated as the OUR began to level off during the growth phase. However, the OUR reached a maximum 1 day after induction in both expression phases, and this may be explained in the same way as the initial rise in biomass concentration. After this initial rise, the OUR fell as the cells starved, until a new growth phase was initiated.



Key: µmax, maximum specific growth rate; τD, doubling time; OUR, oxygen uptake rate; DW, dry weight of callus.

**Table 1** shows that maximum OUR increased from 0.52 mmol O<sub>2</sub>/(L·h) in Growth 1 to 1.25 mmol O<sub>2</sub>/(L·h) in Growth 2, which is expected because this value does not account for the increase in biomass density. The maximum specific OUR actually decreases from 0.29 mmol O<sub>2</sub>/(g DW·h) in Growth 1 to 0.18 mmol O<sub>2</sub>/(g DW·h) in Growth 2.

During growth phases, the sucrose concentration gradually decreased as it was hydrolyzed to produce equal concentrations of glucose and fructose, which can be consumed by the cells. Throughout the culture operation, the rate of hydrolysis of sucrose was faster than the rate of glucose consumption, which can be seen as a gradual decrease in sucrose concentration and increase in glucose concentration during both growth phases (**Figure 3C**). Both growth phases displayed the same trend, but Growth 2 showed the pattern on a shorter time scale, which is likely due to the increased biomass density. At the beginning of both expression phases, the sugar rich (+S) medium is removed and replaced with sugar free medium (−S), causing the concentration of both sugars to drop to and remain at 0 g/L for the duration of the expression phase.

Active BChE concentration rose gradually over time during each expression phase (**Figure 3D**). The first expression phase (Expression 1) produced a maximum active BChE concentration of 21.4 ± 2.3 µg/g FW biomass, while the second expression phase (Expression 2) produced a maximum of 25.2 ± 1.9 µg/g FW biomass. Expression 1 had a lag time between induction and expression of about 2 days, and reached its maximum expression level 4 days after induction. During Growth 2, the concentration of BChE decreased gradually until it was no longer detectable. Expression 2 had a lag of <1 day, and reached a peak around 4 days after induction. These differences may be caused by differences in the timing of induction, the culture density, and the physiological condition of the cells. During both expression phases, the majority of BChE produced was associated with the cell mass. Negligible amounts of BChE were detected in the medium.

**Table 2** summarizes the kinetic parameters of BChE expression. While the expression level on a per weight basis was similar for both expression phases (21.4 and 25.2 µg/g FW), the increased biomass density during Expression 2 led to a much higher amount of BChE produced per liter of culture (an increase from 0.72 mg/L culture during Expression 1 to 1.64 mg/L in Expression 2). Expression 2 also had a much higher maximum volumetric productivity (based on the cycle duration including growth and expression phases) due to the absence of a long growth lag phase and increased biomass density in Growth 2 as compared to Growth 1. Finally, the ratio of BChE to TSP increased from the first to second expression phase; this may be due to improved adaptation during subsequent induction cycles which resulted in increased expression of BChE.
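The link between the per-weight and per-volume values can be illustrated with a simple unit conversion: the volumetric titer (mg/L) equals the specific yield (µg/g FW) multiplied by the fresh weight density (g FW/L) and divided by 1000. The sketch below back-calculates the fresh weight density implied by the reported maxima. These implied densities are rough estimates that assume both maxima were recorded at the same time point; they are not values reported in this study, and the function names are our own.

```python
def volumetric_titer_mg_per_l(specific_ug_per_g_fw, fw_g_per_l):
    """Volumetric titer (mg/L) from specific yield and fresh weight density."""
    return specific_ug_per_g_fw * fw_g_per_l / 1000.0


def implied_fw_density_g_per_l(titer_mg_per_l, specific_ug_per_g_fw):
    """Fresh weight density (g FW/L) implied by a titer and a specific yield."""
    return titer_mg_per_l * 1000.0 / specific_ug_per_g_fw


# Reported maxima: (titer in mg/L, specific yield in ug/g FW).
phases = {"Expression 1": (0.72, 21.4), "Expression 2": (1.64, 25.2)}
for name, (titer, specific) in phases.items():
    density = implied_fw_density_g_per_l(titer, specific)
    print(f"{name}: implied biomass density of about {density:.0f} g FW/L")

print(f"Fold increase in volumetric titer: {1.64 / 0.72:.1f}")
```

Under these assumptions, most of the roughly 2.3-fold gain in volumetric titer is attributable to the higher biomass density rather than to the per-cell expression level.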

#### Electrophoresis and Immunoblotting

**Figure 4A** shows a Western blot under reducing conditions of the cell-associated BChE from samples obtained during Expression 1 and Growth 2. Equal volumes (20 µL) of crude cell extract were loaded into each lane. These extracts were obtained by grinding cells in a 1:1 ratio of biomass to buffer, so the intensity of the BChE band corresponds to the concentration of BChE associated with the cells at the time each sample was taken. Under reducing conditions, a distinct band is seen around 85 kDa, which corresponds to the predicted size of a monomer of BChE. Samples taken immediately before and after induction and 2 days after induction (lanes 1–3) show no visible band, which corresponds to the low BChE activity as seen in the activity data shown in **Figure 3D** (<5 µg/g FW). The band in lane 4 corresponds to a higher value of BChE activity (21 µg/g FW). However, while we see a decrease in BChE activity from day 20 to 21 (a drop from 21 to 18 µg/g FW), there is an increase in the intensity of the BChE band seen in the Western blot in lanes 5 and 6 (which correspond to samples taken immediately before and after initiation of Growth 2 on day 21). This may indicate that a portion of the BChE detected in the Western blot is inactive.

To determine the oligomerization status of the rice cell culture-produced BChE, we performed Western blot analysis under native conditions on a crude cell extract from Expression 1 (**Figure 4B**). Lane 2 contains 200 ng of purified equine serum BChE (as human BChE has been commercially unavailable for over a year), which is about 440 kDa in its tetrameric form. The cell extract shows two bands, one smaller and one larger than the equine control. The smaller product is near the expected size of tetrameric BChE (340 kDa), while the other may be an aggregation product.

# DISCUSSION

The size and complexity of BChE has been a major impediment in its production in any recombinant expression host. Despite high expression levels (up to 5 g/L of milk), transgenic goats are able to produce primarily dimeric BChE, which exhibits a reduced circulatory half-life (∼2 min) in vivo (Huang et al., 2007). Many attempts have been made to solve this problem, including PEGylation (Sun et al., 2013) or sialylation (Ilyushin et al., 2013) to increase this half-life, and addition of a proline-rich peptide to encourage tetramerization (Larson et al., 2014; Terekhov et al., 2015). Recent work by Schneider et al. also reports difficulty in secretion of BChE oligomers produced by transient expression in N. benthamiana, despite the molecule's ability to tetramerize in a crude cell extract (Schneider et al., 2014b). Recombinant tetrameric BChE has been produced in other cell culture types and reached up to 70 mg/L of culture (Terekhov et al., 2015), but these cultures have been limited by a comparatively high cost of culture operation and increased potential for contamination from human pathogens.

Previous work by our group has demonstrated successful semicontinuous production of a fully active alpha-1-antitrypsin under control of the RAmy3D expression system in a transgenic rice cell culture (Huang et al., 2001; Trexler et al., 2002, 2005). In the current study, we successfully produced an intact and active form of BChE, a larger and more complex glycoprotein therapeutic. A maximum of 1.6 mg/L of culture was produced during the second expression phase in a relatively non-optimized system. We expect that further refinements of the operational strategy will increase expression levels to those of other cell culture systems. For example, Park et al. have demonstrated a fed-batch operational strategy, in which the culture is supplemented with a concentrated solution of amino acids prior to sugar depletion, that can enhance product yields up to 1.8-fold compared to induction via medium exchange (Park et al., 2010).

One major advantage of our system is that we can produce predominantly active and fully assembled tetrameric BChE. This underscores the importance of this system as a safe and cost-effective method for production of human biotherapeutics. In addition to growing on an inexpensive culture medium, long-term semicontinuous operation of the culture reduces the need for long seed trains and minimizes turn-around time, clean-in-place and steam-in-place operations, chemicals, and energy. Furthermore, the regulatory pathway for plant-based recombinant biologics for human therapeutic use has now been established with the production and regulatory approval of taliglucerase alfa (ElelysoTM) produced in carrot cell suspension in batch culture for the treatment of Gaucher's disease (Maxmen, 2012).

After the initial inoculation, the culture experienced a long lag phase in cell growth. The process of sieving the callus through a stainless steel mesh to obtain small cell aggregates may be physically stressful for the cells, thus requiring a longer initial growth period to acclimatize. This may be part of the reason that Growth 2, which involved the same biomass but no sieving immediately before the growth phase, did not exhibit the same lag phase and reached a maximum OUR value within 5 days. During expression phases, the culture reached a maximum expression level of BChE around day 4 after induction. Thus, after an initial acclimatization period, the culture can be operated by alternating between 5-day growth phases and 4-day expression phases to maximize BChE productivity over longer periods.
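The proposed schedule can be made concrete with a small calculation of how many complete growth and expression cycles fit into a given operating period. The sketch assumes a 21-day start-up cycle (Growth 2 began on day 21 in this run) followed by repeating 5-day growth and 4-day expression phases; the 90-day campaign length and the function name are hypothetical.

```python
def complete_cycles(total_days, startup_days=21, growth_days=5, expression_days=4):
    """Count complete growth + expression cycles after the start-up cycle."""
    remaining = total_days - startup_days
    if remaining < 0:
        return 0
    return remaining // (growth_days + expression_days)


campaign_days = 90  # hypothetical operating period
cycles = complete_cycles(campaign_days)
print(f"{cycles} additional 9-day cycles fit into a {campaign_days}-day campaign")
```

Whether productivity remains stable over that many cycles is an open question that would need to be established experimentally.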

The negligible release of BChE in the culture medium during bioreactor operation (in comparison to the screening stage, where a significant portion of BChE was secreted into the medium) may be due to both the size of the protein (340 kDa) and the larger size of the cell aggregates in comparison to the screening stage. For BChE to be released into the medium, BChE must not only be secreted from the cell, but must also pass through the cell wall and diffuse through the cell aggregate. The tetrameric form of BChE is about 50–60 nm (Lockridge et al., 2011), while the average pore size of the plant cell wall is 35–50 nm (Carpita et al., 1979) in differentiated plant cells. If tetrameric BChE is produced inside the cell and secreted in its fully assembled form, the protein may be trapped inside the cell wall or within the aggregate and only released during homogenization. Even if the tetramer is assembled after secretion, the monomers or the fully assembled tetramer still must diffuse through the cell aggregate to reach the culture medium. Because early screening experiments were done at small scale, the aggregates could be easily broken apart before expression level studies. Here, as shown in **Figure 2**, we saw roughly 30–60% of the total BChE secreted into the medium for most cell lines. However, with each subsequent screening, we saw a reduction in the proportion of BChE secreted into the medium. This can be explained by the fact that each subsequent screening was performed at an increasingly larger scale, which made it more difficult to break apart large aggregates. These larger aggregates provide a more challenging route through which secreted BChE must diffuse into the culture medium, and can explain why the proportion of secreted BChE dropped to nearly zero in later cultures.

The issue of low secretion levels may be addressed by decreasing the size of cell aggregates through alteration of medium composition or operating parameters (particularly, agitation, aeration rates, and length of growth and expression phases). However, it is also possible that oligomeric BChE is not able to correctly pass through the secretory pathway completely, as has been seen in other plant systems (Schneider et al., 2014b), and early screenings detected a small amount of monomeric BChE that was able to secrete effectively. To address this possibility, we are developing an alternate semicontinuous operation format similar to that described by Huang et al. (2010), in which a portion of the biomass is harvested after an expression phase, while the remaining biomass remains in the reactor as inoculum for the next growth phase.

# CONCLUSIONS AND FUTURE PROSPECTS

Active recombinant BChE was produced in a transgenic rice cell suspension culture using an inducible gene expression system. Two complete phases of cell growth and BChE expression were performed, and the cells were able to express milligram quantities of active BChE during both expression phases and successfully regrow during a growth phase. Theoretically, these cultures could be operated indefinitely, and our goal is to evaluate the behavior and productivity of the culture during long-term (several months) operation, if necessary by keeping a portion of the cells as inoculum in subsequent cycles. Further development of the culture and operation modes will aim to increase the amount of BChE produced and secreted into the culture medium, and to perform a more comprehensive characterization of BChE, including determination of the serum half-life and the glycosylation profile of the product.

TABLE 2 | Maximum values of butyrylcholinesterase (BChE) expression kinetic parameters for two expression phases.

Key: FW, fresh weight of callus; TSP, total soluble protein.

FIGURE 4 | Western blot analysis of butyrylcholinesterase (BChE) produced in rice cell culture. (A) Western blot under reducing conditions of cell-associated samples from before and after medium exchanges. Each lane contains 20 µL of crude cell extract. Lane MW: molecular weight ladder; lane 1, day 16, immediately before start of Expression phase 1; lane 2, day 16, immediately after start of Expression Phase 1; lane 3, day 18; lane 4, day 20; lane 5, day 21, immediately before start of Growth phase 2; lane 6, day 21, immediately after start of Growth phase 2; lane 7, day 23; lane 8, day 25; lane 9, control, 900 ng purified equine BChE. (B) Western blot under native conditions of BChE. Lane MW: molecular weight ladder; lane 1: 52 U (∼200 ng) active BChE from an intracellular sample from day 3 of Expression phase 1; lane 2: control, 200 ng purified equine BChE.

#### AUTHOR CONTRIBUTIONS

JC, led and designed experiments and wrote and edited the manuscript. BH, led and designed experiments. KK, conceptualized and assisted with performing and designing experiments. ZK, assisted with experiments. LW, performed experiments. BR, conceptualized. AN, conceptualized. RR, conceptualized and edited the manuscript. KM, conceptualized, designed experiments, and edited the manuscript. SN, conceptualized, designed experiments, and edited the manuscript.

#### ACKNOWLEDGMENTS

Funding for this project was provided by Leidos, Inc. (Frederick, MD) and by the National Institutes of Health, National Institute of General Medical Sciences (NIGMS-NIH) (Grant Number T32-GM008799). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIGMS or NIH.

#### REFERENCES

of transgenic animals to protect against organophosphate poisoning. Proc. Natl. Acad. Sci. U.S.A. 104, 13603–13608. doi: 10.1073/pnas.0702756104


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

KM is a co-founder of Inserogen, Inc., a plant-based biotechnology company with a focus on the development of orphan drugs for replacement therapy.

Copyright © 2016 Corbin, Hashimoto, Karuppanan, Kyser, Wu, Roberts, Noe, Rodriguez, McDonald and Nandi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Fusion between Domains of the Human Bone Morphogenetic Protein-2 and Maize 27 kD γ-Zein Accumulates to High Levels in the Endoplasmic Reticulum without Forming Protein Bodies in Transgenic Tobacco

#### *Edited by:*

Eugenio Benvenuto, ENEA, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### *Reviewed by:*

Rima Menassa, Agriculture and Agri-Food Canada, Canada Eva Stoger, University of Natural Resources and Life Sciences, Austria Guo-Hao Lin, University of Michigan School of Dentistry, USA

> *\*Correspondence:* Emanuela Pedrazzini

pedrazzini@ibba.cnr.it

#### *Specialty section:*

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

*Received:* 14 December 2015 *Accepted:* 07 March 2016 *Published:* 24 March 2016

#### *Citation:*

Ceresoli V, Mainieri D, Del Fabbro M, Weinstein R and Pedrazzini E (2016) A Fusion between Domains of the Human Bone Morphogenetic Protein-2 and Maize 27 kD γ-Zein Accumulates to High Levels in the Endoplasmic Reticulum without Forming Protein Bodies in Transgenic Tobacco. Front. Plant Sci. 7:358. doi: 10.3389/fpls.2016.00358

Valentina Ceresoli<sup>1,2,3</sup>, Davide Mainieri<sup>1</sup>, Massimo Del Fabbro<sup>2,3</sup>, Roberto Weinstein<sup>2,3</sup> and Emanuela Pedrazzini<sup>1</sup>\*

<sup>1</sup> Istituto di Biologia e Biotecnologia Agraria, Consiglio Nazionale Delle Ricerche, Milano, Italy, <sup>2</sup> Dipartimento Scienze Biomediche, Chirurgiche e Odontoiatriche, Università Degli Studi di Milano, Milano, Italy, <sup>3</sup> IRCCS Istituto Ortopedico Galeazzi, Milano, Italy

Human Bone Morphogenetic Protein-2 (hBMP2) is an osteoinductive agent physiologically involved in bone remodeling processes. A commercial recombinant hBMP2 produced in mammalian cell lines is available for clinical applications in which bone regeneration is needed, but its widespread use has been hindered by an unfavorable cost-effectiveness ratio. Protein bodies are very large insoluble protein polymers that originate within the endoplasmic reticulum through prolamin accumulation during cereal seed development. The N-terminal domain of the maize prolamin 27 kD γ-zein is able to promote protein body biogenesis when fused to other proteins. To produce high yields of the recombinant hBMP2 active domain (ad) in stably transformed tobacco plants, we fused it to the γ-zein domain. We show that this zein-hBMP2ad fusion is retained in the endoplasmic reticulum without forming insoluble protein bodies. The accumulation levels are above 1% of total soluble leaf proteins, indicating that this could be a rapid and suitable strategy to produce hBMP2ad at affordable costs.

Keywords: bone morphogenetic protein 2, endoplasmic reticulum, protein accumulation, protein bodies, γ-zein, plant factories

# INTRODUCTION

Throughout adult life, as well as in the healing process following injuries, bone tissue is subjected to cycles of resorption and formation during physiological modeling and remodeling. The former shapes bone structure according to the loading situation, adapting cortical and trabecular architecture to functional needs, while the latter consists of the renewal of bone composition without changing the bone architecture. However, massive bone regeneration may be required in pathological conditions, such as extensive bony defects caused by trauma, bone cancer, infection and necrosis, or when the remodeling process is compromised, as in osteoporosis (Giannoudis et al., 2005).

The availability of Bone Morphogenetic Proteins (hBMPs), the osteoinductive agents in bone, as an adjunct to surgical procedures involving bone reconstruction might reduce or avoid the need for complex and demanding surgeries, preventing the major costs and morbidity related to autograft harvesting (Garrison et al., 2007; Lo et al., 2012). Among the 20 members of the hBMP family, which belongs to the TGF beta superfamily, only hBMP-2, -4, -6, -7, -9, and -14 have shown promising osteoinductive properties in different injuries (e.g., long bone fracture non-unions, spinal fusion, and maxillofacial bone defects; Even et al., 2012; Lo et al., 2012; Carreira et al., 2014). hBMP2 and hBMP7 are the best characterized factors and are able to strongly induce osteoblast differentiation in different tissues (Marie et al., 2002). In 2007, an extensive survey on the cost-effectiveness of recombinant hBMPs (rhBMPs) in orthopedics concluded that there was a lack of evidence about their clinical effectiveness and that their use would not be cost-effective, except in severe cases, unless the price were significantly reduced (Garrison et al., 2007). Over the past 10 years, evidence about the clinical efficacy of rhBMP2 has been provided by sponsored clinical studies (Burks and Nair, 2010; Kim et al., 2015). Based on preliminary clinical results in oral, maxillofacial and orthopedic surgeries, rhBMP2 is as effective as conventional grafting according to clinical and histomorphometric parameters, and in some cases it may accelerate bone healing (Kelly et al., 2015; Lin et al., 2015; Poon et al., 2016). Moreover, rhBMP2 may decrease morbidity and improve other patient-associated outcomes, with no signs of rejection or infection (Alt et al., 2015; Lin et al., 2015; Poon et al., 2016). On the other hand, while complications and adverse events were rarely reported when rhBMPs were used for their specific clinical indications, pitfalls have been observed with off-label use of rhBMP2, leading to a number of adverse effects (Poon et al., 2016). Finally, some authors have still expressed concern regarding the actual effectiveness and safety of rhBMPs, highlighting the risk of potentially serious complications, as reported by non-sponsored studies (Even et al., 2012).

Despite the promising clinical results and the recent improvements in production procedures, the widespread therapeutic use of rhBMPs has so far been hindered both by the FDA limitations on their applications and by the failure to obtain high amounts of pure and biologically active protein at an affordable price (Harada et al., 2012; Carreira et al., 2014).

hBMP2 is synthesized as an inactive dimer that undergoes major post-translational modifications to become biologically active. Each of the BMP-2 monomers (405 AA) contains a cystine-knot based on six cysteine residues (C296, C325, C329, C361, C393, C395) that form three intra-chain disulfide bridges (Scheufler et al., 1999). The cystine-knot scaffold confers rigidity on the structure and is necessary to stabilize the whole protein. The hBMP2 dimers are formed by a single inter-chain disulfide bond which engages the unpaired C360 of each monomer (Scheufler et al., 1999). The key post-translational modification is the endoproteolytic cleavage by a proprotein convertase downstream of R282 in the KREKR consensus site, which releases the mature homodimeric hBMP2 active domain (Heng et al., 2010).

Several attempts have been made to express recombinant active hBMPs in different heterologous systems (such as Escherichia coli, P. pastoris, Baculovirus/insect cells, and mammalian cells) (Colin et al., 2008). However, production has encountered difficulties, especially due to impaired protein folding, altered or absent post-translational modifications and low protein stability (Hazama et al., 1995; Pulkki et al., 2011; Park et al., 2014). In 2002 the first recombinant hBMP2 obtained from CHO cell lines was approved by the FDA (FDA approval number: P000054). Although mammalian cells are disadvantageous in terms of complexity and cost, the recombinant hBMP2 (rhBMP-2) and rhBMP-7 by Medtronic are so far the only ones to be marketed.

During the last two decades, plants have emerged as one of the most promising general production platforms for biologics, and nowadays plant-based expression systems are accepted as robust, scalable and cost-efficient platforms for the production of recombinant proteins of pharmaceutical interest, including antibodies, blood substitutes, vaccines, and growth factors (Ma et al., 2003; Fischer et al., 2004; Stoger et al., 2005; Melnik and Stoger, 2013; Sack et al., 2015).

A previous attempt to produce rhBMP2 in tobacco plants has been made (Suo et al., 2006). The level of accumulation of a recombinant protein in transgenic plants is protein specific and strongly influenced by the subcellular compartment of destination (Vitale and Pedrazzini, 2005); thus, search for the best subcellular compartment for the protein of interest represents a major issue in the effort to maximize production (Vitale and Pedrazzini, 2005; Hofbauer and Stoger, 2013). Several targeting strategies have been developed to improve protein accumulation in plant cells and one of the most promising is to exploit the seed storage protein determinants deputed to the formation of large oligomers that accumulate in the endoplasmic reticulum of maize endosperm, resulting in protein body (PB) biogenesis. The 27 kD γ-zein (hereafter zein) is a maize storage protein belonging to the prolamin class and is able to induce PB biogenesis even when expressed in vegetative tissues of transgenic plants (Shewry et al., 1995; Vitale and Ceriotti, 2004). Its N-terminal domain, characterized by eight repeats of the hexapeptide VHLPPP and seven cysteine residues is necessary for retention and deposition into the ER (Geli et al., 1994; Mainieri et al., 2014). The cysteine residues form inter-chain disulphide bonds, leading to assembly into the very large PBs. These polymers are therefore insoluble unless treated with reducing agents (Mainieri et al., 2004, 2014; Pompa and Vitale, 2006). This domain has been used to allow accumulation of recombinant fusion proteins both in transgenic plants and in mammalian cells (Mainieri et al., 2004; Llop-Tous et al., 2011).

In order to investigate new cost-effective approaches to express hBMP2 in transgenic tobacco plants, we produced a chimeric protein (named zein-hBMP2ad) consisting of the C-terminal active domain of hBMP2 fused to the N-terminal domain of 27 kD γ-zein. We show that the prolamin domain promotes a higher accumulation level of hBMP2ad compared to the native hBMP2 (hBMP2nat). The assembly of zein-hBMP2ad, its post-translational modifications and its ability to induce PB biogenesis were analyzed.

# MATERIALS AND METHODS

# Plasmid Construction

The constructs used in this study are shown in **Figure 1**, and were prepared according to standard molecular techniques (Sambrook et al., 1989).

A single chimeric construct including the N-Terminal of 27 kDa γ-zein, a DNA linker, a Thrombin cleavage site, the complete coding sequence of human BMP2 and a FLAG epitope was purchased from GeneCust (GeneCust Europe, Laboratoire de Biotechnologie du Luxembourg S.A.) in a pUC57 vector in order to obtain the two zein-hBMP2ad and hBMP2nat constructs with specific subcloning experiments.

The chimeric gene zein-hBMP2ad was obtained by double digestion with XbaI and PstI and insertion of the excised fragment in a pDHA vector. To reach the final sequence, this plasmid was digested with AflII in order to excise the signal peptide and the pro-peptide of hBMP2. The resulting construct was used for transient expression in tobacco protoplasts. Full-length hBMP2 (hBMP2nat) was obtained by digestion with SalI and PstI. The fragment was then subcloned in a pDHA vector and used for transient expression in tobacco protoplasts.

For stable plant transformation, the EcoRI fragments excised from the two pDHA plasmids, containing the expression cassettes, were subcloned into EcoRI-linearized pGreenII 0179 (http://www.pgreen.ac.uk, John Innes Centre, Norwich, Norfolk, UK) that carries the hygromycin selectable marker gene. These constructs, called pGreen-ER and pGreen-N, were used to transform A. tumefaciens strain GV3101 by electroporation, as described by Shen and Forde (1989).

#### Antibodies and Recombinant Protein

Rabbit polyclonal anti-FLAG (1:2000; Sigma Aldrich, St. Louis, MO, USA); rabbit polyclonal anti-tobacco BiP (1:10,000 dilution; Pedrazzini et al., 1997); rabbit polyclonal anti-endoplasmin/GRP94 (1:1000 dilution; Klein et al., 2006); goat anti-rabbit IgG-peroxidase conjugate (1:16,000; Pierce Biotechnology, Rockford, IL, USA). FLAG-Bap recombinant fusion protein (Sigma Aldrich, St. Louis, MO, USA).

# Plant Material

Nicotiana tabacum plants (cv. Petit Havana SR1) as well as transgenics were cultured in sterile conditions on Murashige and Skoog (MS) basal salt medium (Duchefa Biochemie, Haarlem, the Netherlands) containing 0.8% plant agar (Duchefa Biochemie) or in soil in a growth room at 25°C with a 16-h/8-h light/dark cycle.

# *A. tumefaciens*-Mediated Transformation

Young tobacco leaves were excised from axenically grown wild-type plants and cut into leaf discs of about 1 cm<sup>2</sup>. The discs were placed for 5 min in a culture of A. tumefaciens carrying the plasmid of interest, and then incubated, underside down, at 25°C in 16 h of light on MS salts medium (Duchefa Biochemie) containing 3% sucrose and 0.8% phyto agar. After 2 days, the leaf discs were transferred to a shoot generation medium (1/2 MS salts supplemented with 30 g/l sucrose, 0.1 mg/l α-naphthalene acetic acid, 0.1 mg/l 6-benzylaminopurine, 50 µg/ml hygromycin, 100 µg/ml carbenicillin, 250 µg/ml cefotaxime, and 0.8% phyto agar) in order to promote callus growth, select transformed plants and prevent further growth of Agrobacterium. Elongated shoots were excised from calli and transferred onto 1/2 MS agar supplemented with 0.1 mg/L indole-3-acetic acid, 50 mg/L hygromycin and 100 mg/L carbenicillin in order to promote root formation. Transformed plants were grown at 25°C in 16 h of light in axenic conditions without antibiotics and propagated every 5–6 weeks.

# Transient Expression in Tobacco Protoplasts and Analysis of Proteins

Protoplasts were isolated from tobacco leaves and subjected to polyethylene glycol-mediated transfection, as described by Pedrazzini et al. (1997). Forty micrograms of pDHA plasmid without insert (as a negative control) or with inserted recombinant coding sequences (pDHA-zein-hBMP2 and pDHA-hBMP2nat) were used in each transfection, at a protoplast concentration of 10<sup>6</sup> cells/mL. Protoplasts were then allowed to recover overnight in the dark at 25°C in K3 medium (Gamborg's B5 basal medium with minimal organics supplemented with 400 mM sucrose, 1.5 M xylose, 3 mM NH<sub>4</sub>NO<sub>3</sub>, 1 mg/l α-naphthaleneacetic acid, 1 mg/l 6-benzylaminopurine and 5 mM CaCl<sub>2</sub>) before protein extraction.

Protoplast homogenization was performed by adding to frozen samples two volumes of ice-cold homogenization buffer (150 mM Tris-Cl, 150 mM NaCl, 1.5 mM EDTA, and 1.5% Triton X-100, pH 7.5) supplemented with Complete® protease inhibitor cocktail (Roche, Basel). For protein extraction in reducing conditions, the buffer was also supplemented with 4% 2-mercaptoethanol. Equal amounts of each sample were denatured with Laemmli buffer, loaded onto 15% SDS-PAGE together with the Protein Molecular Weight Marker mixture (Fermentas, Vilnius, Lithuania) and electrotransferred to a polyvinylidene difluoride (PVDF) membrane (Protran Nitrocellulose Transfer Membranes, Perkin Elmer). Blots were probed with anti-FLAG mAbs and proteins were detected using a horseradish peroxidase-conjugated anti-rabbit secondary antibody (Pierce Biotechnology, Rockford, IL), followed by chemiluminescence with SuperSignal® West Pico Chemiluminescent Substrate (Pierce Biotechnology, Rockford, IL). Detection of bands was performed with the ChemiDoc MP Imaging System (Bio-Rad, Hercules, CA).

# Protein Extraction from Tobacco Leaves and Western Blot Analysis

Screening of tobacco transgenic lines expressing zein-hBMP2ad or hBMP2nat was performed by direct homogenization of leaves (0.1 mg, fresh weight) in Laemmli buffer at 95◦C (ratio 7:1). Twenty-five microliters of leaf homogenate from each sample (corresponding to 12.5µg of total leaf proteins) were loaded on SDS-PAGE, followed by protein blot analysis with anti-Flag antiserum.

For the extraction of total proteins, young (5–7 cm long) leaves of transgenic tobacco were homogenized in an ice-cold mortar with seven volumes of homogenization buffer (200 mM NaCl, 1 mM EDTA, 0.2% Triton X-100, 100 mM Tris-Cl, pH 7.8) supplemented with Complete® protease inhibitor cocktail (Roche, Basel). For protein extraction in reducing conditions, the buffer was also supplemented with 4% 2-mercaptoethanol. The homogenate was then centrifuged at 1500 g for 10 min at 4°C and the protein concentration of the supernatants was determined using the Bio-Rad protein assay (Bio-Rad, Hercules, CA, USA). Supernatant and, in some cases, pellet were then analyzed by protein blot. Equal amounts of total protein from each sample were denatured with Laemmli buffer and then processed as described above for protoplast protein extracts. Densitometric analysis of bands was performed with Image Lab software (Version 4.1, Bio-Rad).

#### Endoglycosidase H Treatment

Protein samples extracted either from transgenic leaves or from transfected protoplasts were incubated in Glycoprotein Denaturing Buffer (0.5% (w/v) SDS, 40 mM) for 10 min at 100°C. Samples were brought to a volume of at least 20 µl by adding 10x G5 Reaction Buffer (50 mM sodium citrate pH 5.5) and divided into two equal aliquots. One of them was treated with 2000 Units of Endo H enzyme (New England Biolabs, Beverly, MA, USA), the other with an equal volume of water. After 2 h of incubation at 37°C, the samples were supplemented with Laemmli denaturation buffer and analyzed by SDS-PAGE and protein blotting.

## Velocity Centrifugation on Sucrose Gradient

Young leaves of transgenic and control plants were homogenized using the homogenization buffer described above, without 2-mercaptoethanol. The homogenate was loaded on a linear 5–25% (w/v) sucrose gradient made in 150 mM NaCl, 1 mM EDTA, 0.1% Triton X-100, 50 mM Tris-Cl, pH 7.5. An additional gradient was loaded with a mixture of protein markers containing 200 µg each of cytochrome C (12.4 kDa), ovalbumin (43 kDa), BSA (66 kDa), aldolase (161 kDa), and catalase (232 kDa). After centrifugation at 200,000 g for 20 h at 4°C in a Beckman SW40 rotor, fractions of about 650 µl were collected. An equal aliquot of each fraction was analyzed by SDS-PAGE and protein blot.

#### Subcellular Fractionation on Isopycnic Sucrose Gradients

Young leaves of transgenic and wild type plants were homogenized in an ice-cold mortar with a 7:1 (v/w) ratio of homogenization buffer (100 mM Tris-Cl pH 7.8, 10 mM KCl, 12% sucrose (w/w), and Complete® protease inhibitor cocktail) containing either 1 mM EDTA or 10 mM MgCl<sub>2</sub>. Six hundred microliters of each sample were loaded on top of a 12 mL linear sucrose gradient [100 mM Tris-Cl pH 7.8, 10 mM KCl, 16–65% sucrose (w/w)] and centrifuged at 150,000 g for 2 h at 4°C in a Beckman SW40 rotor (Beckman, Fullerton, CA, USA). After centrifugation, 20 fractions of about 600 µL were collected and an equal amount of each fraction was denatured and analyzed by SDS-PAGE and protein blot.
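The two gradient runs above are specified as relative centrifugal force (g) rather than rotor speed. As a rough aid for reproducing them, the sketch below converts between the two using the standard relation RCF = 1.118 × 10<sup>−5</sup> × r × rpm<sup>2</sup> (r in cm). The rotor radius used here is an assumption that is not stated in the text and should be checked against the rotor manual; the function names are illustrative only.

```python
import math

# Standard relation between relative centrifugal force (x g), rotor radius
# (cm) and rotor speed (rpm): RCF = 1.118e-5 * r_cm * rpm**2
RCF_CONSTANT = 1.118e-5

# ASSUMPTION: maximum radius (cm) of an SW40-type swinging-bucket rotor.
# The paper does not report this value; verify it in the rotor manual.
ASSUMED_R_MAX_CM = 15.88


def rpm_for_rcf(rcf_g, radius_cm=ASSUMED_R_MAX_CM):
    """Rotor speed (rpm) needed to reach `rcf_g` at `radius_cm`."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


def rcf_for_rpm(rpm, radius_cm=ASSUMED_R_MAX_CM):
    """Relative centrifugal force (x g) reached at `rpm` and `radius_cm`."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    # RCF values taken from the two gradient protocols in the text.
    for label, rcf in (("velocity gradient", 200_000),
                       ("isopycnic gradient", 150_000)):
        print(f"{label}: {rcf} g is about {rpm_for_rcf(rcf):.0f} rpm")
```

Because the force varies along the tube, the speed obtained depends on whether the quoted g value refers to the maximum, average or minimum radius, which the text does not specify.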

#### Thrombin Cleavage

Twenty-five micrograms of total leaf proteins, extracted in reducing conditions from transgenic and wild type plants, were incubated with 10 units of thrombin (Amersham Biosciences, Piscataway, NJ, USA) or PBS (as control) at 22°C for 20 h, under gentle agitation. After digestion, samples were denatured and analyzed by SDS-PAGE and protein blot.

# RESULTS

The fusion construct zein-hBMP2ad (**Figure 1**) contains the first 112 amino acids of 27 kD γ-zein, including its 19 aa signal peptide (Mainieri et al., 2004; de Virgilio et al., 2008; Virgili-López et al., 2013), followed by the C-terminal active domain of hBMP2 (ad, 114 amino acids). The zein portion includes six out of the seven unpaired cysteine residues as well as the amphipathic VHLPPP hexapeptide repeats. This is the same domain used in Mainieri et al. (2004), where it leads to the assembly of zeolin into PBs. A short flexible linker and a thrombin cleavage site were inserted between the fusion partners, to favor the independent folding of the two moieties and to allow hBMP2ad purification (**Figure 1**). The full-length native hBMP2 (hBMP2nat) pre-pro-sequence (which includes the hBMP2 signal peptide and the following pre-pro peptide) was also expressed, as a control. A C-terminal FLAG epitope was added to both recombinant constructs to allow immunodetection (**Figure 1**).

# zein-hBMP2ad and hBMP2nat Are Soluble and Are Not Secreted in Tobacco Mesophyll Protoplasts

The two constructs were transiently expressed in tobacco mesophyll protoplasts under the control of an enhanced cauliflower mosaic virus (CaMV) 35S promoter. The pDHA empty vector was used as a further control (Co). The influence of disulphide bonds on solubility can be tested by homogenizing protoplasts in buffer with or without the reducing agent 2-mercaptoethanol (2-ME), followed by centrifugation to separate soluble and insoluble proteins. Solubilized proteins were then denatured in the presence of SDS and 2-ME and analyzed by SDS-PAGE in reducing conditions, followed by protein blot with anti-Flag antibodies. When homogenization was performed in the absence of 2-ME, zein-hBMP2ad was recovered mainly as a soluble form of about 55 kD that could represent dimers particularly difficult to denature (**Figure 2A**, lane 3). Higher molecular mass forms were also detected, as well as a polypeptide of about 37 kD. When homogenization was performed in the presence of 2-ME, the 37 kD form was by far the major one detected, indicating that it represents the monomeric form, which remains strongly assembled into dimers and further polymers when the reducing agent is not present at the time of homogenization (**Figure 2A**, lane 4). An 18 kD polypeptide was also detected, probably derived from proteolytic cleavage occurring in the C-terminal region.

hBMP2nat migrated as a polypeptide of 45 kD, as expected, but it accumulated at much lower levels than the zein fusion and was soluble only in the presence of 2-ME (**Figure 2A**, lanes 5, 6). Neither zein-hBMP2ad nor hBMP2nat was detectable in the protoplast incubation medium (**Figure 2A**, lanes 7–12), strongly suggesting that the two proteins were not secreted.

To verify whether a proportion of protein remains insoluble even when homogenized in the presence of 2-ME, samples homogenized in oxidizing or reducing conditions were centrifuged, and both soluble (S) and insoluble (I) fractions were denatured and analyzed by SDS-PAGE and protein blot (**Figure 2B**). Only a small amount of zein-hBMP2ad was insoluble, both in the presence and absence of 2-ME (**Figure 2B**, lanes 6, 8), confirming that this recombinant protein is almost completely soluble in aqueous buffer. A higher proportion of hBMP2nat (about 20%) remained insoluble after homogenization with the reducing agent (**Figure 2B**, compare lanes 11 and 12). When homogenization was performed without 2-ME, hBMP2nat was recovered neither in the soluble fraction nor as an insoluble precipitate (**Figure 2B**, lanes 9–10), indicating that tissue homogenization without the reducing agent could lead to the precipitation of insoluble aggregates of hBMP2nat that cannot be further denatured.

# In Transgenic Plants, zein-hBMP2ad Is Retained in the ER Mainly as Soluble Dimers and Accumulates at Higher Levels Compared to hBMP2nat

Since both zein-hBMP2ad and hBMP2nat accumulated during transient expression, albeit at different levels, we produced transgenic tobacco plants expressing the two constructs under the CaMV 35S promoter. Leaf extracts from several hygromycin-resistant putative transgenic lines grown in axenic conditions for 4–6 weeks were analyzed by protein blot with anti-Flag antiserum. Accumulation of the recombinant proteins was variable, as nearly always happens when different transgenic plants are analyzed; however, zein-hBMP2ad showed a clear tendency to accumulate at much higher levels than hBMP2nat (**Figure 3**, compare upper and bottom panels, and note that the hBMP2nat protein blot was exposed 15 times longer than the zein-hBMP2ad one).

The solubility of hBMP2nat and zein-hBMP2ad in transgenic plants was tested by homogenization in either oxidizing or reducing conditions. The banding pattern as well as the solubility of zein-hBMP2ad were very similar to those obtained upon transient expression (**Figure 4B**, lanes 3, 4, and compare with lanes 3 and 4 in **Figure 2A**), with most of the protein soluble in the absence of 2-ME. When hBMP2nat leaf homogenization was performed in the absence of reducing agent, a polypeptide of the expected 45 kD molecular mass was detected (**Figure 4A**, lane 3), as well as a larger form around 55 kD that could indicate heterogeneous N-glycosylation (Hang et al., 2014, and see below). However, in the presence of 2-ME, instead of these two main polypeptides several other forms were detected (**Figure 4A**, lane 4), which also became visible in transient expression upon longer exposure of the blot in **Figure 2A** (see Figure S1, lanes 5, 6), suggesting the presence of intrachain disulfide bonds and/or partial proteolysis. Two of these polypeptides were also observed in extracts from untransformed plants (**Figure 4A**, lane 2) and represent endogenous tobacco proteins, visible because of the relatively high amount of total protein loaded to clearly visualize the hBMP2nat polypeptides. Note that one tenth of the total amount of leaf protein was analyzed in **Figure 4B** compared to **Figure 4A**, because of the higher accumulation of zein-hBMP2ad with respect to hBMP2nat; for this reason the endogenous tobacco polypeptides recognized by the anti-Flag antiserum were not visible in **Figure 4B**.

FIGURE 2 | hBMP2nat and zein-hBMP2ad are soluble in transiently transfected protoplasts and are not secreted. (A,B) Tobacco leaf protoplasts were transiently transfected with plasmid encoding hBMP2nat (lanes 5–6, 11–12), zein-hBMP2ad (lanes 3–4, 9–10), or empty plasmid (lanes 1–2, 7–8) and incubated for 24 h. (A) Protoplasts or incubation media were homogenized in the presence (+) or absence (−) of 2-ME. Aliquots corresponding to 20,000 protoplasts (cells) or the corresponding incubation medium (medium) were analyzed by SDS-PAGE in reducing conditions, followed by protein blot with anti-Flag antiserum. (B) Aliquots corresponding to 16,500 protoplasts were homogenized in the presence (+) or absence (−) of 2-ME. After centrifugation, soluble (S) material and insoluble precipitate (I) were analyzed by SDS-PAGE in reducing conditions, followed by protein blot with anti-Flag antiserum. Asterisk: 18 kD fragment of hBMP2ad. Numbers on the left indicate the positions of molecular mass markers, in kD.

We reasoned that the zein fusion could be more promising for the production of recombinant hBMP2ad because of its higher accumulation levels compared with those of the native osteogenic factor. We therefore focussed on zein-hBMP2ad for subsequent experiments.

Previously published data showed that, in transgenic plants, full-length γ-zein or protein fusions containing its N-terminal domain were insoluble in the absence of reducing agent and were formed by very large disulfide-bonded polymers that migrated to the bottom of the tube under velocity centrifugation analysis (Bellucci et al., 2000; Mainieri et al., 2004; Torrent et al., 2009a; Virgili-López et al., 2013). Conversely, our results showed that zein-hBMP2ad was similarly soluble in oxidizing and reducing conditions. We therefore investigated its degree of polymerization. Total leaf homogenate from zein-hBMP2ad line #7 (see **Figure 3**, bottom panel) was extracted in the absence of 2-ME and fractionated by velocity sucrose (5–25% w/v) gradient ultracentrifugation. Equal proportions of each fraction and of the pellet at the bottom of the tube (solubilized in SDS-PAGE denaturation buffer) were analyzed by protein blot with anti-Flag antiserum (**Figure 5**). zein-hBMP2ad migrated both as dimers (**Figure 5**, asterisk) and as multimers that fractionated at progressively more distant positions along the gradient down to the bottom of the tube, indicating the formation of very large polymers that were nevertheless still soluble. Only a very small proportion ended up at the bottom of the tube; this pellet could contain insoluble polymers. As the size of the polymers increased and exceeded that of the 232 kD marker, they could no longer be disassembled by the SDS-PAGE denaturation conditions and migrated progressively more slowly. The larger ones actually remained at the interface between the stacking and separating gels (**Figure 5**, arrow).

The intracellular localization of zein-hBMP2ad was investigated by subcellular fractionation on an isopycnic sucrose density gradient. Because the binding of ribosomes to the ER membrane is dependent on Mg<sup>2+</sup>, the chelation of this cation with ethylenediaminetetraacetic acid (EDTA) releases the ribosomes from the membrane, causing a density shift of the ER-derived microsomes, and not of other compartments, to lighter fractions of the gradient. Leaves from zein-hBMP2ad (line #23) were homogenized in the absence of detergent and in the presence of sucrose to maintain organelle integrity. The homogenate was loaded on two 16–65% (w/w) sucrose gradients containing Mg<sup>2+</sup> or EDTA and subjected to centrifugation. Equal amounts of each fraction were analyzed by SDS-PAGE followed by protein blot with either anti-Flag serum or sera against the ER markers BiP and endoplasmin (Grp94). In the presence of Mg<sup>2+</sup>, zein-hBMP2ad was mainly detected in four fractions with a peak around a density of 1.18–1.19 g/mL, where the ER markers BiP and Grp94 also peaked (**Figure 6**, panels on the left). The presence of EDTA caused a shift of zein-hBMP2ad, BiP and Grp94 to lower density fractions (peaks around 1.16–1.17 g/mL), consistent with the density shift of the ER due to ribosome release (**Figure 6**, panels on the right). The ER marker BiP and, in minor proportion, Grp94 and zein-hBMP2ad were also present at the top of the gradient, as observed previously in similar experiments (Pedrazzini et al., 1997; Mainieri et al., 2004), possibly reflecting partial release from the ER lumen upon homogenization.
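The Mg<sup>2+</sup>/EDTA comparison above amounts to locating the peak fraction of a protein in each gradient and checking whether that peak moves to a lower sucrose density (in g/mL, as in the Figure 6 caption) when ribosomes are released. A minimal sketch of that comparison is given below. The band intensities are invented placeholder numbers, not data from this study, and the function names are hypothetical.

```python
def peak_density(densities, intensities):
    """Return the density of the fraction with the strongest band."""
    if len(densities) != len(intensities):
        raise ValueError("one intensity value is needed per fraction")
    peak_index = max(range(len(intensities)), key=intensities.__getitem__)
    return densities[peak_index]


def density_shift(densities, with_mg, with_edta):
    """Peak density with EDTA minus peak density with Mg2+ (g/mL).

    A negative value means the protein moved to lighter fractions,
    as expected for ER-derived microsomes that lost their ribosomes.
    """
    return peak_density(densities, with_edta) - peak_density(densities, with_mg)


# HYPOTHETICAL example values, for illustration only (not measured data).
DENSITIES = [1.10, 1.12, 1.14, 1.16, 1.18, 1.20, 1.22]  # g/mL per fraction
HYPOTHETICAL_MG = [2, 3, 5, 20, 80, 30, 4]               # band intensities
HYPOTHETICAL_EDTA = [3, 5, 25, 75, 20, 5, 2]

if __name__ == "__main__":
    shift = density_shift(DENSITIES, HYPOTHETICAL_MG, HYPOTHETICAL_EDTA)
    print(f"peak shift with EDTA: {shift:+.2f} g/mL")
```

Applying the same function to the marker lanes (BiP, Grp94) and to the protein of interest gives a simple check that both shift in the same direction and by a similar amount.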

Taken together, our results indicate that the 27 kD γ-zein N-terminal domain is sufficient to retain zein-hBMP2ad in the ER, but it does not allow the hyper-polymerization and insolubility that are typical of protein bodies formed by zein in maize endosperm and zeolin in transgenic tobacco.

#### zein-hBMP2ad is Glycosylated in Tobacco Transgenic Plants

Native hBMP2 contains five putative N-glycosylation sequons: four of them occur within the prosegment domain (positions N135, N163, N164, N200), and one in the active domain (position N338). It has recently been demonstrated that in both CHO and HEK293T cells positions N135, N200, and N338 are actually glycosylated (see **Figure 1**), suggesting that the N-glycosylation status of hBMP2 is independent of cell type and species (Hang et al., 2014). Moreover, the three N-glycans are sensitive to peptide-N-glycanase F (PNGase F, which removes both high-mannose and Golgi-modified, complex type N-glycans) and Endoglycosidase H (Endo H, which removes only high-mannose glycans), indicating that they are of the high-mannose type. The N-terminal region of 27 kD γ-zein does not contain N-glycosylation sites.
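The reasoning behind the Endo H experiment can be made explicit by sorting the sequon positions listed above relative to the proprotein convertase cleavage site (downstream of R282, as stated in the Introduction). The short sketch below does only that bookkeeping; the names are illustrative, and it assumes, as the text implies, that the fusion carries only residues downstream of the cleavage site.

```python
# Positions taken from the text; numbering refers to native hBMP2.
CLEAVAGE_AFTER = 282                       # active domain starts after R282
PUTATIVE_SEQUONS = (135, 163, 164, 200, 338)
GLYCOSYLATED_IN_CHO_HEK = {135, 200, 338}  # as reported by Hang et al., 2014


def region_of(position):
    """Classify a residue position relative to the cleavage site."""
    if position > CLEAVAGE_AFTER:
        return "active domain"
    return "upstream of cleavage site"


def sequons_in_fusion():
    """Sequons retained in zein-hBMP2ad.

    ASSUMPTION: the fusion contains only the hBMP2 active domain, and the
    zein portion contributes no N-glycosylation sites (both stated in the text).
    """
    return [p for p in PUTATIVE_SEQUONS if region_of(p) == "active domain"]


if __name__ == "__main__":
    for pos in PUTATIVE_SEQUONS:
        used = "glycosylated" if pos in GLYCOSYLATED_IN_CHO_HEK else "not observed"
        print(f"N{pos}: {region_of(pos)} ({used} in CHO/HEK293T)")
    print("Sequons expected in the fusion:", sequons_in_fusion())
```

Under these assumptions a single site remains in the fusion, so Endo H digestion would be expected to remove at most one glycan from zein-hBMP2ad.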

To assess whether the N-glycosylation at position N338 of the active domain was maintained in zein-hBMP2ad transgenic plants, total leaf proteins from line #7 were extracted in the presence of 2-ME and incubated with or without Endo H. Treatment with the endoglycosidase caused an electrophoretic mobility shift, consistent with the removal of one high-mannose N-linked glycan (**Figure 7**, compare lanes 1 and 2). Upon long exposure of the blot, the 18 kD fragment was detectable and was also sensitive to Endo H digestion (**Figure 7**, bottom panel, compare lanes 1 and 2). Taken together, the results indicate that in transgenic plants the N338 position of hBMP2ad is N-glycosylated and that the 18 kD fragment is most probably the active domain that has been cleaved from the γ-zein moiety.

FIGURE 6 | zein-hBMP2ad cofractionates with the ER. One hundred milligrams of fresh leaves from zein-hBMP2ad line #23 were homogenized in sucrose buffer, in the absence of detergent. The homogenates were fractionated by centrifugation on isopycnic sucrose gradient (16–65% w/w) in the presence of MgCl<sub>2</sub> (left panels) or EDTA (right panels). Collected fractions were analyzed by SDS-PAGE followed by immunoblotting with anti-Flag antiserum (upper panels) or with a mix of anti-BiP and anti-Grp94 antisera (bottom panels). T, total extract; P, pellet at the bottom of the tube. Numbers at top indicate the density (g/ml) of sucrose. Numbers at left indicate the position of molecular mass markers, in kD.

## The hBMP2 Active Domain Can Be Efficiently Recovered by Thrombin Cleavage

It has been previously observed that the accumulation levels of recombinant proteins in transgenic plants can depend on leaf age and size, with younger leaves performing better than older ones (McCabe et al., 2008; Kolotilin et al., 2013). The size and position of leaves along the stem identify their age: older and larger leaves are positioned below newer ones. To analyze zein-hBMP2ad accumulation in relation to leaf age, total proteins were extracted without 2-ME from leaves of different sizes (line #7, see **Figure 3**), taken from the bottom to the apex. Younger leaves accumulated higher amounts of zein-hBMP2ad per mg of total protein (Figure S2, lanes 2, 9). In the light of these results, 3–4 cm long leaves from young plants were used to quantify the hBMP2ad produced by tobacco plants.

A preliminary quantification of the full-length zein-hBMP2ad was performed by densitometric analysis of the bands, comparing progressive dilutions of total leaf proteins from line #23 (see **Figure 3**), extracted in the presence or absence of 2-ME (**Figure 8**, upper and lower panels, respectively), with known amounts of the commercial standard protein Flag–BAP (**Figure 8**, lanes on the left). The results indicated that full-length zein-hBMP2ad could represent about 1.1 and 1.25% of total soluble proteins extracted with or without 2-ME, respectively. Densitometric quantification of full-length zein-hBMP2ad from other independent lines showed that the accumulation levels reached 1.75 and 2.25% in plants #7 and #27, respectively (Figure S3), indicating a certain degree of variability (see Figure S2).
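The densitometric comparison against a standard can be illustrated with a small calculation. The following is only a minimal sketch, assuming a linear relationship between band intensity and the amount of Flag-tagged standard loaded; the function names, the intensity values and the amount of total protein loaded are all hypothetical and are not taken from the blots described here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_of_total_protein(band_intensity, loaded_total_ng, slope, intercept):
    """Interpolate the band on the standard curve and express it as % of loaded protein."""
    target_ng = (band_intensity - intercept) / slope
    return 100.0 * target_ng / loaded_total_ng


# Hypothetical standard curve: ng of standard protein vs. band intensity (arbitrary units).
standard_ng = [25.0, 50.0, 100.0, 200.0]
standard_intensity = [500.0, 1000.0, 2000.0, 4000.0]

slope, intercept = fit_line(standard_ng, standard_intensity)

# Hypothetical sample lane: 10 micrograms (10000 ng) of total protein, band intensity 1500.
pct = percent_of_total_protein(1500.0, 10000.0, slope, intercept)
print(f"Estimated target protein: {pct:.2f}% of total protein loaded")
```

The result is expressed in standard-equivalent mass; the sketch makes no correction for differences in molecular mass between the standard and the fusion protein, and it is only meaningful for bands whose intensity falls within the range covered by the standards.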

To investigate the accessibility of the thrombin cleavage site to the protease and to quantify the yield of hBMP2ad, total leaf proteins were extracted in reducing conditions from wild-type or two different zein-hBMP2ad plants (#7 and #27) and subjected to digestion with thrombin or incubated in the absence of the enzyme for 20 h. Samples were analyzed by SDS-PAGE and protein blot with anti-Flag antiserum (**Figure 9**, lanes 6–11). Under mock treatment, zein-hBMP2ad monomers, dimers and polymers were observed (**Figure 9**, lanes 8, 10); the 18 kD proteolytic fragment was also present (**Figure 9**, lanes 8, 10, asterisks). The fusion protein zein-hBMP2ad was efficiently cleaved by thrombin and the hBMP2ad portion was mostly released as a 20 kD fragment (**Figure 9**, lanes 9, 11, empty circles), corresponding to the expected molecular mass of the N-glycosylated active domain. The exact efficiency of cleavage is not easy to establish, because of the smeared electrophoretic pattern of the undigested protein, but clearly the vast majority of polypeptides were successfully cleaved. The 18 kD fragment did not change electrophoretic mobility upon thrombin treatment, indicating that its N-terminus is downstream of the thrombin cleavage site and upstream of the N-glycosylation site (**Figure 9**, asterisks, compare lanes 8–9 and 10–11). Quantification of the unpurified hBMP2ad fragment, performed by comparison with known amounts of the commercial standard protein Flag–BAP (**Figure 9**, lanes 1–5), indicated that hBMP2ad represents around 0.02% of total soluble proteins extracted in the presence of 2-ME.

A more precise quantification of intact zein-hBMP2ad or released hBMP2ad will be necessary to identify the best expressor among the different lines. The quantification should be performed by ELISA technique using anti-hBMP2 antibodies.

#### DISCUSSION

#### zein-hBMP2ad Provides Information on the Requirements for PB Formation

Several studies have reported that the γ-zein N-terminal domain can successfully lead to high accumulation via PB formation when fused to the N- or C-terminus of other proteins introduced into the ER, such as bean phaseolin (Mainieri et al., 2004), HIVp24 (Virgili-López et al., 2013), and xylanase (Llop-Tous et al., 2011). Moreover, this system can be used not only in plant tissues or cell cultures but also in fungal, insect, and mammalian cells (Torrent et al., 2009b). Previous results showed that zeolin, the chimeric protein obtained by attaching the γ-zein N-terminal domain at the C-terminus of phaseolin, which is a vacuolar seed storage protein, has all major features of zein: ER retention, PB formation and insolubility in the absence of reducing agents (Mainieri et al., 2004). When this γ-zein domain was fused to the cytosolic viral protein Nef, it was unable to prevent degradation by the ER quality control (de Virgilio et al., 2008). However, Nef was stabilized and formed PBs when the entire zeolin sequence was fused to it. Altogether, these data suggest that specific structural characteristics of the non-zein portion may influence the fate of chimeric proteins containing the γ-zein N-terminal domain. Phaseolin, which like other storage proteins has a tendency to form large complexes, may help promote protein packaging, accelerating the process of PB formation. Conversely, Nef is a cytosolic protein; when introduced into the oxidizing ER environment its Cys residues could negatively affect correct polymerization by the zein domain, leading to rapid entry into the degradative pathway of ER quality control before PB formation could occur. The results presented here suggest that the seven cysteine residues of hBMP2ad do not have a destabilizing effect on zein-hBMP2ad and that the chimera actually accumulates to higher amounts than native hBMP2, which is a protein naturally introduced into the ER.
However, zein-hBMP2ad remains soluble even in the absence of reducing agents and, in large proportion, does not undergo the extensive polymerization and insolubilization events that are typical of protein bodies formed by zein in maize seeds and by zeolin in transgenic plants: velocity gradient centrifugation showed that most of the chimera forms soluble dimers and oligomers below 231 kD. This is perhaps an unexpected result, and constitutes a case in which the zein domain fails to promote insolubilization but, nevertheless, leads to retention and stabilization of a protein introduced into the ER. The C-terminal region of the native 27 kD γ-zein (the 2S albumin-like domain, Mainieri et al., 2014) contains six cysteines paired in three intra-chain disulphide bonds. hBMP2ad has one more cysteine that is engaged in dimer formation. The position of this inter-chain disulphide bridge may disturb the close packing that leads to insolubility. It should also be noted that we did not detect secretion of hBMP2nat upon transient expression. This unexpected result may indicate that, at least in plant cells, this protein does not efficiently enter intracellular traffic along the secretory pathway and, in the absence of the zein stabilizing domain, may be degraded by ER quality control.

Subcellular fractionation indicates that zein-hBMP2ad accumulates in the ER. Its single N-linked glycan can be removed by endoglycosidase H, indicating that it is not processed by Golgi enzymes; in itself, this does not demonstrate a lack of trafficking along the secretory pathway, because protein folding can inhibit access of Golgi glycosyltransferases, but it is certainly consistent with an ER localization. Indeed, it has been recently demonstrated that in CHO cells all three hBMP2 N-glycans are Endo H sensitive, even though the protein traffics through the Golgi complex and is secreted (Hang et al., 2014). The N-glycan of hBMP2ad might also increase the solubility of the chimeric protein, preventing protein body formation. Phaseolin present in zeolin also has one N-linked glycan, but this clearly does not impair the formation of insoluble protein bodies (Mainieri et al., 2004). If the glycan of hBMP2ad plays a role in preventing insolubility, one possible explanation for this discrepancy could reside in the fact that the phaseolin glycan is exposed on the protein surface, as demonstrated by its accessibility to Golgi enzymes (Sturm et al., 1987), whereas that of hBMP2ad is probably masked by interactions with the polypeptide chain, since it does not become Golgi-modified in vivo. Such different interactions between N-glycans and the polypeptide chains can exert different influences on the ability of the chimeric proteins to assemble into protein bodies.

The question therefore arises of how zein-hBMP2ad is retained in the ER, in spite of not having an H/KDEL signal for ER retention and failing to form large insoluble polymers. Pioneering work on the mechanism of 27 kD γ-zein retention in the ER showed that the wild-type protein was able to form PBs in the ER of vegetative plant tissues (Shewry et al., 1995), but neither of the two domains, each corresponding approximately to half of the protein, could (Geli et al., 1994). The 27 kD γ-zein C-terminal domain, engineered to enter the ER by the addition of a signal peptide, was secreted, while the N-terminal domain, containing its own signal peptide and all seven cysteine residues, was soluble even in the absence of reducing agent and accumulated in the ER as diffuse, slightly electron-dense material, very different from the well-defined, round and highly electron-dense PBs formed by the entire 27 kD γ-zein polypeptide (Geli et al., 1994). The ER retention of the N-terminal domain was hypothesized to occur because of the repeated PPPVHL hexapeptide. It was later shown (Kogan et al., 2002) that a synthetic version of the eight PPPVHL repeats forms an amphipathic helix and interacts with liposomes in vitro. It is therefore possible that the addition of the hBMP2ad sequence to the zein domain has a sort of "neutral" effect: it does not alter folding and therefore does not lead to degradation by ER quality control, but at the same time it does not allow the extensive interactions among the Cys residues of the N-terminal zein domain that are necessary for the formation of insoluble PBs. Therefore, zein-hBMP2ad would behave like the γ-zein N-terminal domain expressed alone, consistent with the hypothesis that the structural characteristics of the C-terminal domain of this prolamin also play a role in PB formation.

## Cost-Benefit in Using γ-zein Fusion for the Production of hBMP2ad in Transgenic Plants

Currently, rhBMP2 for clinical use is produced by Medtronic in DHFR-deficient CHO cell lines via methotrexate-mediated gene amplification (Israel et al., 1992). The time required and the high costs of the purification steps, added to the low yield of the final product (ng/ml), hinder the widespread use of rhBMP2 in clinical applications (Lee et al., 2010; Luan et al., 2011). After the FDA approval, many attempts have been made to produce rhBMP2 in the less expensive bacterial systems, as an alternative to mammalian cells (Vallejo et al., 2002; Park et al., 2014). However, rhBMP2 is a secretory protein that needs the oxidizing environment of the eukaryotic ER lumen to fold and assemble correctly; moreover, rhBMP2 produced in E. coli does not undergo N-glycosylation, a protein modification that occurs cotranslationally in the ER lumen. Therefore, the biological activity of E. coli-derived rhBMP2 has to be restored by in vitro refolding after purification, adding some complexity to the production method (Kübler et al., 1998; Bessho et al., 2000). Despite a 110 mg/L production rate of E. coli-derived rhBMP2, its activity and efficacy after refolding are controversial (Bessho et al., 2000; Bessa et al., 2008; Yano et al., 2009; Lee et al., 2010). Recently, some preclinical studies showed promising results in using E. coli-produced rhBMP2 for the regeneration of experimentally induced bone defects (Harada et al., 2012; Ono et al., 2013; Chung et al., 2015; You et al., 2016), even if the most effective dose as well as the optimal carrier are still to be determined. A comparison between CHO- and E. coli-derived rhBMP2 using a dog critical-size supraalveolar peri-implant model showed that the two rhBMP2s are equally effective in inducing bone formation (Lee et al., 2013); however, no economic analysis was provided in this study. To date, no conclusive cost-benefit study comparing the different production methods of the BMP-2 currently on the market is available.

In a previous study, Suo et al. (2006) reported the production of the human BMP-2 active domain at a level of 0.02% TSP in tobacco plants. It is not easy to compare this production efficiency with our results, because the constructs used for tobacco transformation are different. In Suo et al. (2006), the maximum yield of hBMP2ad was obtained from plants transformed with a construct containing a combination of a double 35SCaMV promoter, an AMV enhancer, and two Rb7 MAR sequences that promote transgene expression. Plants transformed with the construct containing a single 35SCaMV promoter, without the MAR sequences, had about five times lower expression levels (measured by GUS activity). Using a single 35SCaMV promoter, we reached a yield of about 0.02% TSP, which could be further improved by doubling the promoter or by adding enhancer sequences. It should also be underlined that Suo et al. expressed hBMP2ad without the signal peptide; the osteogenic factor is therefore most likely accumulated in the cytosol, a non-optimal environment for a secretory protein.

A recent randomized clinical trial using E. coli-derived rhBMP2 reported the efficacy of 0.5–2 mg of 1 mg/ml rhBMP-2 in bone formation after maxillary sinus augmentation (Kim et al., 2015). Obviously, the total dose depends on the defect size (Boyne et al., 2005; Fiorellini et al., 2005; Herford and Boyne, 2008; Triplett et al., 2009). Once the initial cost of the F0 transgenic production has been overcome, plant-based systems do not require the same expensive investments as other production methods. From our best producing plant we recovered about six micrograms of hBMP2ad/g FW: a rough estimate suggests that one transgenic plant may be sufficient for a single clinical application. Only a comparative study of hBMP2 produced in plant, mammalian, or bacterial systems will allow more precise estimates of cost-effectiveness and establish whether transgenic plants could really represent a valuable alternative to the currently available products.
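The rough estimate can be made explicit with a short calculation. This is only a back-of-the-envelope sketch using the yield (about 6 µg hBMP2ad/g FW) and the dose range (0.5–2 mg) quoted in the text; the function name and the `recovery` parameter are hypothetical, and the default assumes no loss during purification, which is an optimistic simplification.

```python
def leaf_mass_needed_g(dose_mg, yield_ug_per_g_fw, recovery=1.0):
    """Fresh leaf mass (g) needed to obtain a given dose of protein.

    recovery is the assumed fraction of protein retained after purification
    (1.0 = no losses; a hypothetical parameter, not measured in this study).
    """
    if yield_ug_per_g_fw <= 0 or not 0 < recovery <= 1:
        raise ValueError("yield must be positive and recovery must be in (0, 1]")
    dose_ug = dose_mg * 1000.0  # convert mg to micrograms
    return dose_ug / (yield_ug_per_g_fw * recovery)


YIELD_UG_PER_G_FW = 6.0  # approximate yield reported for the best producing plant

for dose_mg in (0.5, 2.0):  # dose range reported by Kim et al. (2015)
    grams = leaf_mass_needed_g(dose_mg, YIELD_UG_PER_G_FW)
    print(f"{dose_mg} mg dose -> about {grams:.0f} g fresh leaf weight")
```

Under these assumptions the required leaf biomass ranges from roughly 80 g to a little over 330 g of fresh weight; any purification loss scales the requirement up proportionally (for example, 50% recovery doubles it).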

#### AUTHOR CONTRIBUTIONS

EP supervised the entire work, designed all experiments and wrote the paper. DM contributed in designing the experiments. VC performed the experiments and contributed in writing the paper. MD contributed in writing the paper. RW obtained grant support.

#### ACKNOWLEDGMENTS

We are very grateful to Alessandro Vitale for the useful discussions and suggestions. This work was supported by Programs "Risorse biologiche e tecnologie innovative per lo sviluppo sostenibile del sistema agroalimentare" and "Filagro" of CNR-Regione Lombardia. CV was supported by the Italian Ministry of Education, Universities and Research PhD Grant.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00358



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Ceresoli, Mainieri, Del Fabbro, Weinstein and Pedrazzini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Molecular Farming in Artemisia annua, a Promising Approach to Improve Anti-malarial Drug Production

Giuseppe Pulice<sup>1</sup>, Soraya Pelaz<sup>2,3</sup> and Luis Matías-Hernández<sup>1,2</sup>\*

<sup>1</sup> Sequentia Biotech, Parc Científic de Barcelona, Barcelona, Spain, <sup>2</sup> Plant Development and Signal Transduction Department, Centre for Research in Agricultural Genomics, Barcelona, Spain, <sup>3</sup> Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain

#### Edited by:

Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### Reviewed by:

Tauqeer Hussain Mallhi, Hospital University Sains Malaysia, Malaysia; Wolfgang Eisenreich, Technische Universität München, Germany

> \*Correspondence: Luis Matías-Hernández

lmatias@sequentiabiotech.com

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 14 October 2015; Accepted: 03 March 2016; Published: 18 March 2016

#### Citation:

Pulice G, Pelaz S and Matías-Hernández L (2016) Molecular Farming in Artemisia annua, a Promising Approach to Improve Anti-malarial Drug Production. Front. Plant Sci. 7:329. doi: 10.3389/fpls.2016.00329

Malaria is a parasitic infection affecting millions of people worldwide. Even though progress has been made in prevention and treatment of the disease, an estimated 214 million cases of malaria occurred in 2015, resulting in an estimated 438,000 deaths, most of them in Africa among children under the age of five. This article aims to review the epidemiology, future risk factors and current treatments of malaria, with particular focus on the promising potential of molecular farming that uses metabolic engineering in plants as an effective anti-malarial solution. Malaria represents an example of how a health problem may, on the one hand, influence the proper development of a country through the burden of the disease and, on the other hand, constitute an opportunity for lucrative business for diverse stakeholders. In contrast, plant biofarming is proposed here as a sustainable and promising alternative for the production not only of natural herbal repellents for malaria prevention but also of sustainable anti-malarial drugs, like artemisinin (AN), used for primary parasite infection treatments. AN, a sesquiterpene lactone, is a natural anti-malarial compound that can be found in Artemisia annua. However, the low concentration of AN in the plant makes this molecule relatively expensive and difficult to produce in quantities that meet the current worldwide demand for Artemisinin Combination Therapies (ACTs), especially for economically disadvantaged people in developing countries. The biosynthetic pathway of AN, a process that takes place only in glandular secretory trichomes of A. annua, is relatively well elucidated. Significant efforts have been made using plant genetic engineering to increase production of this compound. These include diverse genetic manipulation approaches, such as studies on diverse transcription factors which have been shown to regulate the AN genetic pathway and other biological processes. Results look promising; however, further efforts should be directed toward optimization of the most cost-effective biofarming approaches for synthesis and production of medicines against the malaria parasite.

Keywords: biofarming, malaria resistance, Artemisia annua, artemisinin, transcription factors, hormones, genetic engineering

# INTRODUCTION


Malaria is a parasitic infection that still affects millions of people worldwide. According to the annual World Health Organization (WHO) report (WHO, 2014b, Malaria World Report), about 90% of all malaria deaths occur in Africa, mostly among children under the age of five. Therefore, malaria has been listed among the most significant causes of death worldwide (WHO, 2014b; Malaria World Report). Malaria is a protozoan disease, transmitted by mosquitoes of the genus Anopheles. Among the four species of the Plasmodium genus that provoke malarial infections in humans, most cases relate to either Plasmodium vivax or P. falciparum, the latter being the most common and responsible for almost all of the deaths (White et al., 2014). Despite malaria being eradicated from the USA, Canada, Europe, and Russia, its incidence increased, especially in tropical countries, from the 1970s through the 1990s. Since then, new advances in prevention and treatment have been made, in an attempt to control and eliminate the infection. However, the number of affected people, and of deaths, remains high, and the disease is currently transmitted in 108 countries worldwide (Feachem et al., 2010; Alonso et al., 2011). There are three major reasons for the persistence of malaria. First, the onset of resistance to anti-malarial drugs: the Plasmodium parasite developed resistance to different treatments, especially when only a single drug, quinine, was being administered (White and Olliaro, 1996). This evidence pushed toward the search for new treatments, while WHO (2008) suggested the use of "combination therapies" to treat malaria. The discovery of AN, an anti-malarial compound found in A. annua, and its use in combination with other anti-malarial treatments (ACT) have provided a very powerful and efficacious therapy. Indeed, the 2015 Nobel Prize in Medicine was awarded to, among others, Professor Youyou Tu for her discovery of this anti-malarial compound. Despite this, two main issues related to ACT remain unresolved: production cannot cover the increasing demand in countries where the disease is endemic, and the cost of these drugs is very high for the people who need them the most. Plant molecular farming combines agriculture and metabolic pathway engineering, exploiting the plant's natural biochemical pathways. In this context, plant biofarming would constitute a more cost-effective and valuable system for producing large amounts of AN, thereby improving the socio-economic conditions of malaria-affected areas.

A second cause of malaria's persistence is climate change and migration. These phenomena interact with environmental factors, causing an increase in the distribution and impact of malaria in endemic countries and its emergence in non-endemic ones (Lindsay and Thomas, 2001; Reiter et al., 2004; Lindsay et al., 2010; Caminade et al., 2014; Roiz et al., 2014). During the early part of the 21st century, invasive mosquitoes became widely established across Europe and, for example, malaria reappeared in Greece (Danis et al., 2011, 2013). In addition, pyrimethamine-resistant parasites moved from Southeast Asia, spreading resistance alleles across Africa (Roper et al., 2004). Therefore, analyzing socio-environmental changes and monitoring migration have become critical tools for keeping the malaria alert alive, even in countries where this and other vector-borne diseases are absent or almost undetectable.

Finally, a relaxation by health authorities in terms of controlling the spread of the disease has been observed in recent years. In areas where infection is unstable or confined to a period of a few months, a burst of infection may occur – due to climatic or social changes and mixed with a lack of prevention and health care. Consequently, this symptomatic disease, which can occur at all ages, could possibly cause epidemics. As a result, untreated and improperly treated malaria cases may lead to excessive malaria mortality and morbidity.

This review will offer a panoramic view of the onset of drug resistance, which constitutes a major threat against malaria treatment and, hence, its eradication. Additionally, attention will be paid to evaluating and elucidating the contribution and impact of different A. annua biofarming approaches, for improving drug production and decreasing its price. This evaluation will include not only promising results but also weaknesses and technological gaps, in an attempt to optimise these approaches and improve the living and working conditions of the inhabitants of the affected areas.

#### MALARIA LIFE CYCLE, PREVENTION AND TREATMENT

The stages of the malaria parasites' life cycle have been pieced together over time, like a puzzle, since discoveries did not proceed in a linear way. This complex history is accurately described in a review by Cox (2010). Updated information about the malarial cycle and disease was taken from White et al. (2014) and is summarized in **Figure 1**. Female Anopheles mosquitoes are responsible for the transmission of malaria because they feed on blood (mostly at night), while males feed on plant nectar. A mosquito that is hosting Plasmodium transmits malaria by inoculating motile sporozoites into the blood of a vertebrate host, and the disease is provoked by the consequences of red-cell parasitisation and destruction. Severe malaria is provoked by massive sequestration and destruction of red blood cells, which ultimately affects vital organs and causes death. For the purpose of this review, we have summarized all the data in a schematic description (**Figure 1**).

The development of a malaria vaccine has been a very challenging task, especially because of the nature and evolution of the Plasmodium infection. In 2015, the European Medicines Agency's Committee for Medicinal Products for Human Use (EMA-CHMP) expressed, for the first time, a positive scientific opinion on a potential anti-malarial vaccine, RTS,S/AS01. Its benefits outweigh the risks in the age groups examined and this vaccine may be used in high-transmission areas in which mortality is very high. Unfortunately, the vaccine's efficacy is limited and it does not offer complete protection (WHO's Initiative for Vaccine Research; Agnandji et al., 2011; RTS,S and Clinical Trials Partnership, 2012; European Medicines Agency press release, 2015). Therefore, complementary and sustainable strategies are still of paramount importance in reducing the incidence of malaria. These approaches include a combination of physical (mosquito nets) and chemical/biological (repellent oils) measures, as well as increased access to A. annua for first-line treatments against the disease. Unfortunately, prevention does not suffice for eradication of malaria, and some efficient treatments have also been successfully used to cure severe cases of malaria in recent decades. However, two main issues still exist: the high cost of anti-malarial drugs, which the majority of the population in developing countries cannot afford, and the increasing drug resistance that the Plasmodium parasite has developed (Verdrager, 1995; White and Olliaro, 1996). Uncontrolled drug distribution and use, and phenomena such as migratory events and climate change, have contributed to the development of drug resistance. This is the reason why WHO first recommended the use of a combination of anti-malarial treatments as ACT, based on AN (WHO, 2008), in an attempt to avoid, or at least reduce, parasite resistance. Unfortunately, improper or widespread use, or incorrect prescription management, may promote the emergence of drug resistance (White, 2004; Gbotosho et al., 2009), with a devastating effect on worldwide malaria control. ACT was initially restricted to the most difficult cases of malaria but, in 2010, the WHO changed its policy and authorized the use of ACT as first-line treatment at a global level. Major concern was provoked by data indicating that AN resistance had already emerged in small areas of Cambodia, Thailand, Vietnam, and Myanmar (Denis et al., 2006; Noedl et al., 2008; Dondorp et al., 2009; Rogers et al., 2009; Carrara et al., 2013; Leang et al., 2013; Saunders et al., 2014; Tun et al., 2015). This Southeast Asian area may be a possible route for the spread of resistance to the Indian subcontinent (Gething et al., 2011), the same path as that followed by chloroquine resistance in the past (Wellems et al., 2009; Ashley et al., 2014). This evidence raised the level of alarm and the WHO quickly started a campaign in the areas hit by AN resistance, attempting to control the spread and understand how far it had reached. So far, the spread has not affected Africa (Amaratunga et al., 2012; Lopera-Mesa et al., 2013; Ashley et al., 2014; Mok et al., 2015). Fortunately, real-time detection and monitoring of the distribution of drug-resistant malaria parasites can help to prevent the spread (Miotto et al., 2014; WHO, 2014a, Status report on artemisinin resistance; Menard and Ariey, 2015; Tun et al., 2015).

#### FROM TRADITIONAL MEDICINE TOWARD BIOFARMING

Malaria inflicts a huge economic burden on individuals and entire communities in developing countries. As a consequence, the high prevalence of malaria in the poorest countries should be a global health priority for the foreseeable future (Sachs and Malaney, 2002; Chima et al., 2003; WHO, 2014b, Malaria World Report).

Malaria prevention is one of the most cost-effective interventions available (White et al., 2011). Indeed, the cost-effectiveness of the different malaria treatments has improved significantly in recent years, using measures for prevention such as the aforementioned mosquito nets and repellents. But this might also be improved upon by the introduction of biofarming and the exploitation of plant-based drugs and different traditional medicines. Indeed, the use of traditional medicine in tandem with modern medicine has been identified as one of the main factors that could explain the significant improvements of health and social indicators in several developing countries (Simpson, 1988; Waxler-Morrison, 1988; Follér, 1989; Montenegro and Stephens, 2006).

Historically, local pharmacopeia based on native medicinal plants had been adopted by human beings even before society was created (Fernandez, 2006). Since then, pharmacopeia has passed through its own evolution process, starting with ethno-botanical local and home-made medicines derived from basic herbal extract compounds, followed by chemically synthesized pharmaceutical molecules, and now reaching plant biofarming processes (**Figure 2**). Plant molecular farming combines metabolic pathway engineering and agriculture, in order to use plants as factories and produce valuable products such as recombinant proteins, vaccines or pharmacological molecules (Thomson, 2008; Rybicki, 2014). Indeed, plant biofarming represents an intriguing alternative to microbial and mammalian cell bioreactors. The use of plants instead of microbial and animal cells can significantly reduce production costs (Webster, 2004). Moreover, most of the molecules produced in plants can be safely stored for long periods without refrigeration, if they are expressed in seeds or leaves that can be stored dried (Ahmad et al., 2012). Therefore, plant biofarming represents an unprecedented opportunity to manufacture affordable modern medicines and make them available at a global scale, particularly in underdeveloped countries where access to medicines and vaccines has historically been limited (Murphy, 2007). Consequently, there is an increasing interest in the application of plant biofarming to producing indigenous plant-derived medicines, as well as in the identification of unique medicinal plants and the discovery of new pharmacologically active compounds. Traditional plant-based medicines that have been genetically improved for prevention and treatment of malaria represent an appropriate example of biofarming and will be described below.

#### MALARIA PREVENTION AND TREATMENT USING PLANT-BASED MEDICINES

Hundreds of plants have been identified around the world as potential repellents against diverse types of mosquito. Some of these natural herbal repellents may prevent the bite of Anopheles, which is the transmission vector of the malaria parasite (Gupta and Rutledge, 1994). Indeed, past research revealed that, among different plants, those most used as Anopheles mosquito repellents were Neem (Azadirachta indica), Ocimum gratissimum, Ocimum suave, Eucalyptus camaldulensis, Lantana camara, and Lippia uckambensis (Seyoum et al., 2002a,b; Dugassa et al., 2009; Kebede et al., 2010), as reported in **Table 1**.

Due to their easy processing, efficiency and reduced cost, these indigenous herbal repellents, if properly used, could become a useful and sustainable approach for reducing malaria-related infections and deaths (Mishra et al., 1995; Ansari and Razdan, 1996; Okumu et al., 2007). Despite this, molecular biofarming has not yet been applied to any of these species. It could be a useful tool for increasing the repellent content of the plant, thereby optimizing the efficacy of its anti-malarial properties. Consequently, future efforts may focus on the potential of molecular farming for improving the repellent activity of these plants.

Unfortunately, as prevention alone is not enough to eradicate malaria, treatment has become crucial for preventing deaths. Some efficient treatments have been used in recent decades; however, in most of the countries where malaria has been reported, the parasite has developed resistance to quinine (White and Olliaro, 1996). Quinine is a substance isolated from the Peruvian trees Cinchona calisaya and Cinchona succirubra and was used for almost four centuries as the main drug for malaria treatment. Plant-derived products keep making huge contributions to the fight against malaria, either as known, direct anti-malarial agents or as potential, more efficient, novel anti-malarial compounds (Kumar et al., 2009). Indeed, not only the Cinchona tree but also other plant species have been found to have pharmacological properties against the malaria parasite (**Table 1**). One example is a herbal remedy based on three plants: Cochlospermum planchonii, Phyllanthus amarus, and Cassia alata (Kaushik et al., 2015; Lamien-Meda et al., 2015). C. planchonii roots alleviate malarial symptoms, while P. amarus and C. alata leaves and aerial tissues have anti-malarial activity. Moreover, the active compounds of these three plant species are able to act synergistically as an anti-malarial phyto-medicine (Kaushik et al., 2015; Lamien-Meda et al., 2015). Similar results have been obtained within the diverse range of indigenous Amazonian plants studied so far: Aspidosperma rigidum, Ampelozizyphus amazonicus, Bertholletia excelsa, and Simaba cedron have been found to contain the most active anti-malarial extracts among the Amazonian plants tested to date (Frausin et al., 2015; Oliveira et al., 2015) (**Table 1**). Interestingly, beyond species from the Asiatic and Amazonian regions, Sub-Saharan Africa's enormous plant biodiversity is also proving to be a source of new anti-malarial phyto-remedies.
Useful chemical compounds with anti-plasmodial activity, efficacy and safety have been found in endemic plants growing in this area, including Adansonia digitata, Azadirachta indica, Ficus sur, Cassia occidentalis, Cassia siamea, Nauclea latifolia, Plumbago zeylanica, Tithonia diversifolia, Turraea robusta, Turraea nilotica, and Vernonia amygdalina (Chinsembu, 2015; Irungu et al., 2015). However, setting aside quinine, because of the emergence of resistance to it, and all the potential therapeutic plants already described, the most widely used and efficient ACTs today are those combining natural AN or chemically synthesized AN derivatives, such as artesunate and artemether (Ajayi et al., 2008; WHO, 2014b).

TABLE 1 | Medicinal plants used for preventing and treating malaria throughout history.

### ARTEMISIA ANNUA PLANT FOR TREATING MALARIA

Artemisinin is an anti-malarial compound that can only be found naturally in A. annua. Knowledge of the medicinal properties of this plant dates back to the year 168 B.C., when it was first used as a medicinal tea infusion to treat intermittent fevers (De Ridder et al., 2008). Since then, A. annua has been used in traditional Chinese medicine to treat malaria and other diseases (Heide, 2006). Due to its unique mode of action, AN is effective against the asexual stage of the malaria parasite's life cycle (Fidock, 2010). Interestingly, AN is a potential therapeutic agent not only against this parasitic disease but also against viral diseases and certain cancers, and it can reduce angiogenesis (Efferth et al., 2002; Singh and Lai, 2004; Romero et al., 2005). AN has proven cytotoxic effects against different types of cancer cells, such as breast, colon, renal, ovarian, prostate, central nervous system, leukemia and melanoma cancer cells (Efferth et al., 2002; Ho et al., 2014; Tang et al., 2015). The drug acts through diverse mechanisms, such as inducing cell cycle arrest, promoting apoptosis, inhibiting cancer invasion and metastasis, and preventing angiogenesis (Ho et al., 2014).

Chemically, AN is a sesquiterpene lactone that is produced and stored exclusively in A. annua trichomes, which are small, isolated, epidermal protuberances found on the surfaces of leaves and other aerial organs of most vascular plants (Olsson et al., 2009). Trichomes are involved in defending plants against insect herbivores, viruses, UV light and/or excessive water loss (Traw and Bergelson, 2003). There are several different kinds of trichomes, but they are mainly classified into non-glandular and glandular (Traw and Bergelson, 2003). In A. annua, non-glandular trichomes are involved in water absorption, UV-light reflection and seed dispersal. Glandular trichomes have the distinctive ability to synthesize, store, and secrete large amounts of specialized and sometimes toxic secondary metabolites, including AN, that protect the plant from predators without interfering with normal plant growth (Wu et al., 2010; Lange and Ahkami, 2013). Glandular secretory trichomes from A. annua are formed from ten cells arranged in five pairs: two basal cells, two stalk cells, four sub-apical cells and two apical cells. AN synthesis takes place in the sub-apical and apical cells, while its accumulation is localized in the sub-cuticular space of the trichomes (Duke et al., 1994; Ferreira et al., 1995; Olsson et al., 2009).

A. annua can be readily grown in many environments. However, the AN content extracted from fresh and/or dry leaves is extremely low (0.1–10 mg/g dry weight), as it is only produced in the trichomes. On the other hand, despite the modernization of different techniques, chemical synthesis of the molecule makes the price too high for a significant number of malaria victims, especially those in developing countries where the malarial burden and impact are the greatest (Abdin et al., 2003; Zeng et al., 2008; WHO, 2014a). In the last ten years, the amount of ACTs produced and provided has increased 36-fold (WHO, 2014b). Unfortunately, production still cannot cover the increasing demand for artemisinin-based therapies in endemic countries. Therefore, it is critical to improve the AN yield in planta and to develop better methods for its production. To this end, biofarming has become an essential tool for increasing the worldwide supply of AN in the last decade.

# ARTEMISIA ANNUA BIOFARMING APPROACHES USING METABOLIC ENGINEERING

A prerequisite for the success of any secondary metabolite production using metabolic engineering is a deep understanding of its synthesis at the genetic level. For this reason, biochemical and molecular biological studies have sought to elucidate the complete biosynthetic pathway of AN in A. annua, as schematically reported in **Figure 3**. Genes encoding components of this pathway are specifically expressed in the A. annua trichomes located on leaves, floral buds, and flowers (Olsson et al., 2009). Two molecules of isopentenyl diphosphate (IDP) and one molecule of dimethylallyl diphosphate (DMADP) are condensed by farnesyl diphosphate synthase (FDS) to obtain farnesyl diphosphate (FDP). FDP, which generally serves as a precursor for sesquiterpenes including AN, is then converted into amorpha-4,11-diene through the activity of amorpha-4,11-diene synthase (ADS), and this is the first step of AN biosynthesis proper (Bouwmeester et al., 1999; Mercke et al., 2000). Amorpha-4,11-diene is then oxidized in three steps to artemisinic acid through the action of amorpha-4,11-diene 12-hydroxylase (CYP71AV1), a single cytochrome P450 monooxygenase (Ro et al., 2006; Teoh et al., 2006). Recent reports have shown that a double bond reductase (DBR2) and an aldehyde dehydrogenase (ALDH1) operate in the conversion of artemisinic aldehyde to its dihydro form, and then into the direct AN precursor dihydroartemisinic acid, respectively (Zhang et al., 2008; Teoh et al., 2009). The final production step is considered to be the result of a non-enzymatic photo-oxidation reaction (Sy and Brown, 2001; Covello, 2008; Brown, 2010). However, recent results have proposed that a peroxidase enzyme or an alternative series of oxidations (occurring exclusively in planta) may in fact catalyze the crucial last reaction that converts the precursor into the valuable AN molecule (Bryant et al., 2015).
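To make the order of the reactions easier to follow, the pathway described above can be sketched as a small directed graph. This is only an illustrative encoding of the steps named in the text, not a quantitative model. The intermediate "artemisinic alcohol" is an assumption added to spell out the three CYP71AV1 oxidations, and the function names `route` and `enzymes_on_route` are hypothetical.

```python
from collections import deque

# (substrate, product, catalyst). A catalyst of None marks the last step,
# which the text describes as possibly non-enzymatic (photo-oxidation).
STEPS = [
    ("IDP/DMADP", "FDP", "FDS"),
    ("FDP", "amorpha-4,11-diene", "ADS"),
    # Three successive oxidations by the same P450; the alcohol intermediate
    # is an assumption used here to make the three steps explicit.
    ("amorpha-4,11-diene", "artemisinic alcohol", "CYP71AV1"),
    ("artemisinic alcohol", "artemisinic aldehyde", "CYP71AV1"),
    ("artemisinic aldehyde", "artemisinic acid", "CYP71AV1"),
    # Branch toward the direct AN precursor.
    ("artemisinic aldehyde", "dihydroartemisinic aldehyde", "DBR2"),
    ("dihydroartemisinic aldehyde", "dihydroartemisinic acid", "ALDH1"),
    ("dihydroartemisinic acid", "artemisinin", None),
]


def route(start, target):
    """Breadth-first search; returns the list of steps from start to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for substrate, product, catalyst in STEPS:
            if substrate == node and product not in seen:
                seen.add(product)
                queue.append((product, path + [(substrate, product, catalyst)]))
    return None  # no route encoded between the two metabolites


def enzymes_on_route(start, target):
    """Distinct enzymes, in order of first use, along the route (or None)."""
    path = route(start, target)
    if path is None:
        return None
    enzymes = []
    for _substrate, _product, catalyst in path:
        if catalyst is not None and catalyst not in enzymes:
            enzymes.append(catalyst)
    return enzymes


if __name__ == "__main__":
    print(enzymes_on_route("IDP/DMADP", "artemisinin"))
    print(route("artemisinic acid", "artemisinin"))
```

In this sketch artemisinic acid has no encoded route to artemisinin, because the text describes no such in planta step.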

Over the decades, significant efforts have been made to increase AN production and reduce costs, and diverse biofarming approaches have achieved some progress in raising the yield of this compound. This review analyzes these approaches in detail, considering which were more or less successful and identifying their strengths and weaknesses.

# Genetically Modified Fast Growing Organisms

Nowadays, AN derivatives produced through chemical synthesis provide the basis for the most efficient ACT treatments. Despite this, chemical synthesis of AN is not economically feasible because of the complexity and low yield of the process, which result in high prices for the people in need. Therefore, the use of genetically modified, fast-growing organisms, such as genetically engineered Escherichia coli and Saccharomyces cerevisiae, has arisen as a real alternative to chemical synthesis (Lindahl et al., 2006; Zeng et al., 2008). These organisms represent the most widely used heterologous hosts for the expression of enzymes and the reconstitution of natural plant product biosynthetic pathways, as has been previously demonstrated for curcumin and piceatannol production (Zhang et al., 2015a).

To further increase cost-effective AN production, metabolic engineering strategies were used to overexpress AN synthesis enzymes in these microorganisms (**Figure 3**). Cloning and transfer of the ADS enzyme into Saccharomyces cerevisiae allowed for the production of the AN precursor amorpha-4,11-diene in yeast (Lindahl et al., 2006). However, the most successful S. cerevisiae bioengineering strategy was accomplished by combining different enzymatic steps (Ro et al., 2006). These steps included engineering the farnesyl pyrophosphate (FPP, also termed FDP) biosynthetic pathway to increase production of FPP, which is the immediate precursor entering the AN pathway proper. This step was followed by reconstitution of the AN enzymes (mainly ADS and CYP450) in these FPP high-producer yeasts. Even though a large amount of artemisinic acid was produced, the final desired product, AN, was not synthesized. Although the AN biosynthetic pathway has been well investigated and great progress has been made in terms of cloning biosynthetic enzymes, the last step of this peculiar synthesis is not yet completely understood. Based on these results, it could be suggested that the last step is probably a typical plant non-enzymatic reaction that cannot be transferred into fast-growing organisms. Therefore, in yeast, AN biosynthesis only reaches the precursor, artemisinic acid, which afterward needs to be chemically converted into AN (Ro et al., 2006; Paddon et al., 2013). The large amounts of artemisinic acid produced in yeast could be transformed into AN through semi-synthesis and subsequent purification, but with a consequent increase in costs.

In addition to yeast manipulation, genetic engineering of Escherichia coli was used as an alternative for AN production in microorganisms. Indeed, the introduction of the mevalonate pathway from S. cerevisiae into E. coli led to efficient production of terpenoid precursors (Martin et al., 2003). Consequently, E. coli might be used as a reliable system for the industrial production of plant sesquiterpenes, considering that some results obtained in the last decade have confirmed the potential of this tool. When a few mevalonate pathway enzymes were heterologously expressed together with the ADS enzyme, amorpha-4,11-diene was produced at high levels in E. coli (Martin et al., 2003; Tsuruta et al., 2009). However, as was observed with S. cerevisiae, no AN was produced at all, most probably because the final step in AN biosynthesis is a naturally occurring, plant-only reaction.

Considering all the results obtained from production of AN, using microbial hosts such as S. cerevisiae and E. coli, it can be concluded that further optimization is required to achieve the optimal yield for industrial production. Despite production of AN precursors in these organisms being higher than in wild-type A. annua, expensive semi-synthesis and purification processes are needed afterward. Consequently, it was worth exploring more biofarming approaches, in order to improve the cost-effectiveness of AN synthesis and thus reduce the current price of ACT (Ro et al., 2006).

#### Nicotiana tabacum Biofarming

In recent years, special attention has been paid to the fact that the last step of AN synthesis is not yet completely understood. As stated earlier, it was hypothesized that it could be a non-enzymatic photo-oxidation reaction (Sy and Brown, 2001; Covello, 2008; Brown, 2010). However, recent results have proposed that a peroxidase enzyme, or an alternative series of oxidations that occur exclusively in planta, may catalyze this last reaction in the process that produces the valuable AN molecule (Bryant et al., 2015). As a consequence of the partial failure using microbial hosts, and considering that this last step is likely to be restricted to plants, a biofarming approach using Nicotiana tabacum was introduced to overcome this limitation (**Figure 3**).

Nicotiana tabacum, or the tobacco plant, is characterized by fast growth and high biomass production. It has therefore already been established as a model plant for molecular farming (Kumar et al., 2012). Indeed, isoprenoid metabolic engineering has already been accomplished in tobacco (Kanagarajan et al., 2012; Kumar et al., 2012), strongly suggesting that efficient AN production could be achieved in this way. When heterologous co-expression of the AN biosynthetic pathway enzymes was carried out in tobacco, a clear accumulation of dihydroartemisinic alcohol, but not of artemisinic acid, the precursor to AN, was detected (Zhang et al., 2011). This result may be explained by the cellular environment of the transformed tobacco plant which, in contrast to that of A. annua, may favor the reduction of AN intermediates to alcohols instead of their oxidation to acids (Zhang et al., 2011). Due to this unexpected limitation, further studies should take into account the nature of the target AN (a unique sesquiterpene lactone), the variability of the possible precursors, the presence of different key regulatory genes, and the cellular environment of the plant used for biofarming (Majdi et al., 2015). Current approaches in this field have unfortunately obtained AN yields in tobacco that are a thousand times lower than those obtained in A. annua (Paddon et al., 2013). Additionally, chemical conversion of the AN precursors produced in tobacco is still needed for the last reaction. Therefore, further research in this field is essential in order to achieve cost-effective AN production.

#### Endogenous and Exogenous Factors that Induce AN Production

Several factors also produced by A. annua have been found to positively affect AN synthesis (**Figure 3**). Among these endogenous factors, phytohormones play an essential role. Some plant hormones, such as abscisic acid (ABA), gibberellins (GA), salicylic acid (SA), and jasmonic acid (JA), have been described as positively affecting both trichome proliferation and AN biosynthesis in A. annua (Zhang et al., 2005, 2015b; Pu et al., 2009; Yu et al., 2011) (**Figure 4**). JA regulates secondary metabolism in several plant species (van der Fits and Memelink, 2000; De Boer et al., 2011) and, as expected, exogenous treatment with JA also stimulates AN production in A. annua, as well as the formation of glandular trichomes (Baldi and Dixit, 2008). Similarly, external application of ABA enhances AN production by stimulating the expression of several synthesis enzymes (Zhang et al., 2009, 2015b). Furthermore, SA, a phenolic plant hormone involved in plant development, transpiration, ion uptake and transport, has also been implicated in the plant's response to different abiotic/biotic stresses (Rao and Davis, 1999; Bulgakov et al., 2002; Hayat and Ahmad, 2007). Generally, the mechanisms of plant defense are related to an increase in H<sub>2</sub>O<sub>2</sub> and reactive oxygen species (ROS) levels (Lamb and Dixon, 1997; Ebel and Mithöfer, 1998). Some studies have revealed that SA applications are able to increase AN content in A. annua by 54% in two different ways: firstly by converting dihydroartemisinic acid into AN, due to the burst of ROS, and secondly by positively affecting the expression of AN-related biosynthetic enzymes (Pu et al., 2009).

According to these findings, GA is the hormone that plays the most important role in promoting AN synthesis. In A. annua, AN production may in fact increase by around 300–400% after exogenous GA treatment, which also positively affects trichome proliferation (Paniego and Giulietti, 1996). Although the AN and GA pathways take place in the cytosol and plastids, respectively, they are closely interconnected: an excess of bioactive GA has been described as diverting carbon toward efficient AN production (Zhang et al., 2005). This assumption is supported by the fact that the transcript levels of FDS, ADS and CYP71AV1 increase after GA treatment (Banyai et al., 2011; Maes et al., 2011). Consequently, these promising insights regarding exogenous GA application should be taken into consideration as an important tool for future cost-effective AN biosynthesis. Indeed, the GA biosynthetic pathway has been extensively characterized, and most of the genes encoding the biosynthetic enzymes have been well studied in other plants (Olszewski et al., 2002). Therefore, significant efforts should be directed toward the complex regulatory network that leads to the final production of the bioactive form of GA. Surprisingly, recent data have shown that GA biosynthetic inhibitors have an interesting, direct inhibitory effect on in vitro growth of the malaria parasite (Toyama et al., 2012). Indeed, treatment with GA inhibitors resulted in morphological changes in the parasite membrane permeability that, if not reversed, fatally injured the parasite (Toyama et al., 2012), revealing an interesting new role for GA in the fight against malaria.

It is also known that AN levels increase with trichome number, density, and maturation, and that hormones control these processes as well, albeit independently. On the one hand, ABA, JA, SA and mainly GA control trichome proliferation by regulating the expression of key A. annua Transcription Factors (TFs) that control trichome initiation and AN synthesis, a topic that will be discussed in detail later in this review (Smyth et al., 1990; Dill and Sun, 2001; Kautz et al., 2014; Tian et al., 2014). On the other hand, these hormones also interact with and directly control key enzymes of the AN biosynthetic pathway (**Figure 4**).

Not only endogenous hormones but also other substances, such as exogenous sugars, have a positive effect on AN biosynthesis (Arsenault et al., 2010b). However, the role of sugars in AN biosynthesis is complex and sometimes confusing. While sucrose and glucose increase transcript levels of the main AN biosynthesis enzymes, AN content is significantly reduced when fructose is added (Arsenault et al., 2010b). Additionally, chemical substances such as arsenic, chromium and NaCl also induce AN biosynthesis in A. annua (Paul and Shakya, 2013). All these substances were found to significantly increase AN biosynthesis by affecting the regulation of some of the AN biosynthetic pathway genes. Further evidence strongly suggests that this AN increase occurs because all these substances induce stress in the plant (Paul and Shakya, 2013). In this regard, several reports have shown how exposure of A. annua to different abiotic treatments can induce AN biosynthesis. Light, low temperature, salinity, drought, heavy metals and/or other abiotic compounds trigger the generation of active oxygen species (AOS), facilitating the transformation of AN precursors into AN (Wang et al., 2001; Guo et al., 2004; Qureshi et al., 2005; Qian et al., 2007; Pu et al., 2009) (**Figure 4**). The assumption that abiotic stress positively affects AN content in the plant may be explained by the correlation with trichome function. As previously mentioned, both glandular and non-glandular trichomes in A. annua defend the plant against potentially damaging factors using different mechanisms. Consequently, we suggest that the rise in AN content caused by these abiotic factors might be due, at least partially, to an increase in the number of trichomes as a defense mechanism in response to stress (Valkama et al., 2004; Wu et al., 2006; Magnan et al., 2008; Sharma et al., 2011).
This physiological response may have an indirect effect in the case of glandular trichomes, as this higher trichome density will increase the quantity of secondary metabolites produced. Supporting evidence to this end is that trichomes from many plant species, including A. annua, are involved in defending plants against UV light radiation (Traw and Bergelson, 2003), and previous studies have shown how UV light may induce AN production (Rai et al., 2011).

Plants protect themselves from UV stress by producing UV-absorbing compounds, such as flavonoids, in the leaf epidermis and trichomes (Rozema et al., 1997; Kumari et al., 2009). Pre-treating A. annua plants with UV-B and UV-C led to slight AN increases of 10.5 and 15.7%, respectively. This improvement is due not only to trichome proliferation but also to altered activity of most of the AN pathway enzymes (Rai et al., 2011). However, considering the low increase in AN production, together with the potential toxicity of UV-C, this pre-treatment is not recommended for commercial use (Rai et al., 2011).
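Because the effects of the different inducers are quoted as percentage increases, it can help to express them on the same fold-change scale used later for the transgenic approaches. The short sketch below does this conversion for the values quoted in this section. It assumes that every percentage is an increase over the untreated control, so that a 300–400% increase corresponds to 4- to 5-fold the control level. If the GA figure instead means 300–400% of the control level, the fold change would be 3 to 4. The function and variable names are illustrative only.

```python
def fold_from_percent_increase(percent):
    """Treated/control ratio for a given percent increase over the control."""
    return 1.0 + percent / 100.0


# (low, high) percent increases in AN content as quoted in the text.
REPORTED_INCREASES = {
    "SA": (54.0, 54.0),     # Pu et al., 2009
    "GA": (300.0, 400.0),   # Paniego and Giulietti, 1996
    "UV-B": (10.5, 10.5),   # Rai et al., 2011
    "UV-C": (15.7, 15.7),   # Rai et al., 2011
}


def fold_ranges(increases):
    """Convert each (low, high) percent range into a (low, high) fold range."""
    return {
        treatment: (fold_from_percent_increase(low), fold_from_percent_increase(high))
        for treatment, (low, high) in increases.items()
    }


if __name__ == "__main__":
    for treatment, (low, high) in fold_ranges(REPORTED_INCREASES).items():
        print(f"{treatment}: {low:.3f}- to {high:.3f}-fold of control")
```

On this scale the UV pre-treatments remain close to the control level, which is consistent with the conclusion that they are not commercially attractive.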

## Artemisia annua New Varieties and Vegetative Propagation

Asexual, or vegetative, in vitro propagation is a technically easy and cheap method of propagation, used in agriculture and industry for large-scale production of high-value metabolites. It has several advantages over seed propagation: it retains the genetic constitution of the plant type almost completely and is a less time-consuming process. Moreover, different plant cell cultures can be used for this purpose. So far, A. annua cell propagation has been carried out using different cell types. In vitro propagation of A. annua hairy roots has given the best results (Jaziri et al., 1995; Liu et al., 1997) (**Figure 3**). Hairy roots are genetically stable and generally show better biosynthetic potential for secondary metabolites compared with other tissues (Majdi et al., 2015). Interestingly, root hair morphology shares similar genetic regulation with that of trichomes, which may partly explain these positive results. Furthermore, in vitro production of AN can be enhanced by treating cell cultures with different elicitors, such as 2,6-di-O-methyl-cyclodextrin (DIMEB) or the aforementioned phytohormones JA, GA and SA (Paniego and Giulietti, 1996; Durante et al., 2011; Majdi et al., 2015). Unfortunately, despite the potential of these tools and the significant effort that has been made regarding A. annua cell cultures, none of these methods is commercially available.

During the last decades, A. annua crop breeding has produced new varieties (**Figure 3**). Indeed, A. annua varieties can be sorted into two chemotypes: the high artemisinin producers (HAPs) and the low artemisinin producers (LAPs) (Brown and Sy, 2004, 2006). HAPs include varieties such as Chongqing, Anamed, Artemis and 2/39, which produce more AN than arteannuin B, another derivative of artemisinic acid but without therapeutic value (Covello, 2008; Brown, 2010; Reale et al., 2011). These varieties have an average AN content that is twice that of wild-type A. annua plants. Moreover, the University of York's Centre for Novel Agricultural Products (CNAP) has recently registered a new HAP variety, Hyb8001R, which will be commercialized in China. In contrast, LAP varieties produce more arteannuin B than AN and include different Iran and Meise varieties. Some recent evidence supports the hypothesis that the chemotype is determined mainly by the activity of the DBR2 enzyme, with HAP varieties showing a much higher activity of that enzyme than LAP varieties (Yang et al., 2015). In order to reach more economically feasible AN production, further studies may focus on the creation of new A. annua varieties with an AN content that is much higher than that of the present varieties.

## Unraveling the Artemisia annua Transcription Factor Genetic Engineering Network

Significant but insufficient advances have been made in terms of metabolic engineering for cloning the AN pathway in tobacco, yeast and bacteria (**Figure 3**). However, only a few studies so far have examined the regulation of this pathway by TF-encoding genes. Since the first reports of successful Agrobacterium-mediated transformation of A. annua in the 1990s (Vergauwe et al., 1996; Banerjee et al., 1997), these transformation protocols have been optimized (Han et al., 2005). Generally, TFs regulate the expression of a certain number of genes from specific and/or related pathways (Borevitz et al., 2000); therefore, loss and gain of function of TFs has arisen as a promising biofarming approach for more efficient regulation of secondary metabolite production (Verpoorte and Memelink, 2002; Petersen, 2007) (**Figure 3**).

In recent years, a few TFs have been characterized as regulators that control different enzymatic steps of AN biosynthesis at the transcriptional level (**Figure 4**). AaWRKY1 was the first TF to be identified and characterized in A. annua (Ma et al., 2009). The constitutive and trichome-specific expression of AaWRKY1, driven by the CaMV35S and CYP71AV1 promoters, respectively, dramatically increases the transcript levels of CYP71AV1, but does not clearly affect the transcript levels of FDS, ADS, and DBR2 (Ma et al., 2009). Additionally, the AaWRKY1 protein can bind the regulatory region of the ADS promoter (Han et al., 2014). As a result, AN content in these transgenic plants is 1.8 times higher than in wild-type A. annua (Ma et al., 2009).

Other studies have revealed how other TFs, belonging to the bHLH and AP2/ERF families, regulate biosynthetic genes (Yu et al., 2011; Lu et al., 2013a; Ji et al., 2014; Tan et al., 2015). AabHLH1 was isolated from a cDNA library obtained from glandular trichomes in A. annua. Transient overexpression of AabHLH1 in leaves increases the expression levels of ADS and CYP71AV1, which encode two key enzymes, thereby positively regulating AN biosynthesis. Indeed, biochemical analyses have shown that the AabHLH1 protein is able to bind in vivo to the E-box cis-elements present in both the ADS and CYP71AV1 promoters (Ji et al., 2014). Among all the TF families, the best characterized in A. annua is the AP2/ERF family. Four members of this family, Ethylene Response Factor1 (AaERF1), Ethylene Response Factor2 (AaERF2), Trichome and Artemisinin Regulator1 (AaTAR1) and AaORA, were also found to directly affect the AN biosynthesis pathway (Yu et al., 2011; Lu et al., 2013a; Tan et al., 2015). Two JA- and ethylene-responsive AP2 family members, AaERF1 and AaERF2, which are highly expressed in A. annua inflorescences, are able to increase the accumulation of AN and artemisinic acid twofold when overexpressed (Yu et al., 2011). In contrast, RNAi lines in which AaERF1 and AaERF2 are partially silenced show a decreased content of both metabolites, consistent with direct control of ADS, CYP71AV1 and, moderately, DBR2 transcript levels (Yu et al., 2011).

AaTAR1 plays an essential role not only in regulating AN biosynthesis but also in other biological processes. Similarly to the other TFs described so far, AaTAR1 controls ADS and CYP71AV1 expression by binding to their regulatory regions. When AaTAR1 is silenced, AN content is dramatically reduced and cuticular wax distribution is altered (Tan et al., 2015). In addition, AaTAR1 controls both glandular and non-glandular trichome initiation and development in A. annua. The last AP2/ERF member studied so far, AaORA, positively regulates the transcript levels of ADS, CYP71AV1, and DBR2, as well as AaERF1. As a consequence, AN content in these plant lines is regulated as well (Lu et al., 2013a). Interestingly, overexpression of AaORA increases the expression levels of diverse genes involved in different, but still related, physiological processes, such as defense. Phenotypic analyses demonstrate that the AaORA protein is a positive regulator of resistance to Botrytis cinerea, modulating the expression of defense marker genes such as PLANT DEFENSIN1.2 (PDF1.2), HEVEIN-LIKE PROTEIN (HEL) and BASIC CHITINASE (B-CHI) (Lu et al., 2013a). Similarly, AaERF1 was found to confer resistance to Botrytis cinerea by activating some of the defense genes via the JA and ethylene signaling pathways in A. annua (Lu et al., 2013b).

Recent results obtained using A. annua indicate that the TFs identified so far that function in AN regulation may be involved in regulating other biological and developmental processes, such as trichome proliferation. TFs from other plant species that specialize in regulating the biosynthesis of protective secondary metabolites are also known to affect phyto-hormone biosynthesis and signaling (Wolucka et al., 2005; Tuteja, 2007; De Boer et al., 2011). There is a clear correlation between AN biosynthesis and ABA signaling: overexpression of AaPYL9, an ABA receptor ortholog, increases AN production (Zhang et al., 2015b). Moreover, it has recently been shown that AabZIP1, a basic leucine zipper TF, connects ABA signaling to AN accumulation (Zhang et al., 2015b). As previously mentioned, the JA-responsive AP2 family members AaERF1 and AaERF2 directly control the expression levels of ADS, CYP71AV1 and, to some extent, DBR2 (Yu et al., 2011).

Despite significant advances in elucidating the complex genetic network controlling AN biosynthesis, none of the strategies using loss and gain of TF function has resulted in an increase in AN content sufficient to be considered for commercial purposes. This limitation might be explained biochemically by several factors, such as transgene silencing, rate-limiting enzymatic steps, and the synthesis/degradation of other final products of the pathway that compete with AN for precursors, for example arteannuin B or other terpenoids (Bertea et al., 2005; Zhang et al., 2008; Teoh et al., 2009). Therefore, further studies in this field are essential for dealing with these issues and overcoming these difficulties.

## Terpenoid Enzymes Studies in Artemisia annua

Due to these limitations, new studies have been conducted to better elucidate the complex regulation of the AN biochemical pathway. For decades, it was believed that the FPP precursors (DMAPP and two units of IPP) were biosynthesized exclusively in the cytosol via mevalonate from acetyl-CoA (Covello, 2008; Arsenault et al., 2010a,b; Graham et al., 2010; Kiran et al., 2010). However, recent results have revealed that, although AN biosynthesis utilizes carbon mainly from the mevalonate pathway, some of the carbon for AN synthesis is also provided by plastids (Towler and Weathers, 2007; Schramek et al., 2010). Indeed, a non-mevalonate biosynthetic pathway that occurs in plastids was also found to be a source of terpenes in other plant species, via the 1-deoxyxylulose 5-phosphate and 2C-methylerythritol 4-phosphate precursors (Rodriguez-Concepcion, 2006; Rohmer, 2007). In addition, recent studies have provided evidence for metabolic crosstalk between both compartments (Dudareva et al., 2005; Skorupinska-Tudek et al., 2008; Schramek et al., 2010). Treating A. annua plants with specific inhibitors of the mevalonate and non-mevalonate pathways showed that AN biosynthesis decreased when either inhibitor was used (Towler and Weathers, 2007), leading to the conclusion that both pathways are involved in AN formation. Moreover, further studies suggested that DMAPP of cytosolic mevalonate origin is transferred to the plastids, where one IPP unit of non-mevalonate origin is used to form geranyl diphosphate (GPP) (Schramek et al., 2010). Once GPP is synthesized, it is exported back to the cytosol, where another IPP unit, this time of mevalonate origin, is used to convert GPP to FPP (Schramek et al., 2010).
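The compartmental model described above implies a simple carbon balance for FPP, which the following sketch makes explicit. It is an idealized bookkeeping exercise under the stated model (one DMAPP and one IPP of mevalonate origin, one IPP of non-mevalonate origin, five carbons per unit), not a prediction of measured labeling patterns. The names used are illustrative.

```python
from fractions import Fraction

CARBONS_PER_C5_UNIT = 5  # DMAPP and IPP are both five-carbon units

# C5 units assembled into FPP according to the model described in the text:
# (unit, pathway of origin)
FPP_UNITS = [
    ("DMAPP", "mevalonate"),      # made in the cytosol, imported into plastids
    ("IPP", "non-mevalonate"),    # plastidial unit, condensed to give GPP
    ("IPP", "mevalonate"),        # cytosolic unit, added to GPP to give FPP
]


def carbon_share_by_pathway(units):
    """Fraction of the product's carbon contributed by each pathway."""
    carbons = {}
    for _unit, pathway in units:
        carbons[pathway] = carbons.get(pathway, 0) + CARBONS_PER_C5_UNIT
    total = sum(carbons.values())
    return {pathway: Fraction(count, total) for pathway, count in carbons.items()}


if __name__ == "__main__":
    for pathway, share in carbon_share_by_pathway(FPP_UNITS).items():
        print(f"{pathway}: {share} of FPP carbon")
```

Under these assumptions two thirds of the FPP carbon derives from the mevalonate pathway, in line with the statement that AN biosynthesis draws its carbon mainly, but not exclusively, from that route.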

Consequently, it has been hypothesized that, if carbon is supplied by both the mevalonate and non-mevalonate pathways, some key enzymes from these pathways may also play essential roles in regulating the synthesis and degradation of the final products of the AN pathway. It is well known that, when a pathway competes with another for a common precursor (in this case FDP, i.e., FPP), accelerating the conversion of this precursor to the product of interest by genetic engineering may minimize its conversion to competing molecules (Majdi et al., 2015). In recent years, a few rate-limiting endogenous enzymes involved in terpenoid biosynthesis have been overexpressed (**Figure 3**). This was the case with the FPS or isopentenyl transferase (IPT) enzymes from A. annua and FDP synthase from Gossypium hirsutum, which were cloned and overexpressed in A. annua (Chen et al., 2000). However, as with the loss and gain of TF function, when these enzymes were constitutively overexpressed in the plant, AN content increased by at most two- to three-fold (Ma et al., 2009; Liu et al., 2011; Han et al., 2014). These results indicate, once again, that AN biosynthesis might be strongly regulated by other unknown factors. Indeed, similar evidence was obtained when the expression of squalene synthase and β-caryophyllene synthase, enzymes that compete with ADS for FDP, was repressed: A. annua plants in which these enzymes were silenced showed a 1.5- to 2-fold increase in AN content (Wang et al., 2012).

Fortunately, a much better result was obtained when the cellular mevalonate pool, and its channeling toward AN biosynthesis, were enhanced by overexpressing the enzyme 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR) (Alam and Abdin, 2011). HMGR is considered to be the rate-limiting enzyme of the mevalonate pathway; it converts HMG-CoA to mevalonic acid at the beginning of isoprenoid biosynthesis in the cytosol (Chappell, 1995; Argolo et al., 2000). Mevalonic acid serves as the common precursor for the synthesis of different secondary metabolites in different plant species, including sesquiterpenoids, triterpenoids, sterols and phytoalexins (Chappell, 1995; Argolo et al., 2000; Ayora-Talavera et al., 2002). When HMGR from Catharanthus roseus was overexpressed together with ADS in transgenic A. annua plants, AN content increased 7.65-fold (Alam and Abdin, 2011). This result strongly suggests that it is crucial to take into account other rate-limiting factors upstream in the pathway, in order to divert as much carbon as possible toward AN biosynthesis. Indeed, the same strategy was also used in tobacco and showed promising results. When different enzymes, including not only HMGR but also those involved in AN synthesis, were overexpressed in tobacco, AN was produced for the first time in another plant (Farhi et al., 2011). Interestingly, in these transgenic tobacco plants, ADS was expressed not only in the cytosol but also in the mitochondria, using a COX4 transit peptide, suggesting the potential of plastid transformation.

## BIOFARMING ARTEMISIA ANNUA FUTURE STRATEGIES

There is great concern among the international health community regarding the onset of AN resistance in the malarial parasite. Even though synthesized AN has been in use for less than 20 years, the first cases of parasite resistance have already been identified. However, A. annua tea has been used in China for the last 1000 years without any resistance development. This could be explained by the fact that artificial AN has sometimes been wrongly used as a monotherapy, while A. annua, in addition to AN, contains other anti-malarial substances, such as artemetin, casticin, cirsilineol, chrysoplenetin, sesquiterpenes, and flavonoids. These compounds work in synergy with AN, reducing the possibility of the parasite developing resistance (Willcox, 2004). Moreover, it has also been proven that an A. annua infusion has the additional effect of strengthening the immune system, which could bring extra benefits to local people in areas affected by malaria (Willcox et al., 2005). The synergistic action of the different compounds in A. annua suggests that special attention should be directed toward plant biofarming in the future.

Some of the brightest prospects for plant biofarming include plant-made viral vaccines and other desired peptides, which are among the earliest products of this new technology (Viana et al., 2012; Rybicki, 2014). The success of these approaches depends not only on increasing the production rate of the desired molecule but also on reproducing desirable post-translational modifications that reduce the risk of allergenicity (Viana et al., 2012). Unfortunately, early molecular biofarming approaches, based on genetic and metabolic engineering, to increase AN content in different plants and microorganisms have not been as effective as expected. However, recent reports have highlighted promising new insights toward a more cost-effective approach. Genetic and metabolic engineering studies indicate that, despite AN biosynthesis being strictly regulated, it is still possible to modulate it using external factors as well as genetic ones. External application of diverse phytohormones, especially GA, or of sugars, together with abiotic factors, could optimize AN production. Moreover, recent results on AN production in genetically engineered plants demonstrate the enormous potential of biofarming for economically feasible AN synthesis. At the gene level, and as with the biosynthesis of other plant secondary metabolites (Zhang et al., 2015a), overexpressing only the AN pathway itself was not sufficient for a significant increase in AN production, which rendered AN production in plants uneconomical. This limitation might be explained by rate-limiting enzymatic steps, as well as by competition for precursors with other terpenoids or AN derivatives, such as arteannuin B. Further studies addressing these issues have revealed promising results.
Indeed, recent evidence has shown that targeting the mevalonate pathway could be one of the most efficient biofarming approaches used so far to increase AN production by genetic engineering. As strong competition exists among the different pathways for the available mevalonate products, it is crucial to take the rate-limiting factors of carbon diversion into consideration, as Alam and Abdin (2011) have shown. When HMGR and ADS, the rate-limiting enzymes of the mevalonate and AN pathways respectively, were overexpressed, AN content in A. annua plants increased more than seven-fold, something that has not been achieved using any other strategy. This evidence suggests that A. annua biofarming is finally starting to optimize strategies for the effective production of bioengineered AN. Further research should therefore address key rate-limiting enzymes acting on other terpenoid precursor sources, from either the mevalonate or non-mevalonate pathways.

There is also a growing interest in applying proteomics and genomics to A. annua, but one of the biggest handicaps of these techniques is the lack of well-annotated databases. The in silico comparison of different A. annua databases, including the EST trichome library, the A. annua trichome Trinity contig database, UniProt/A. annua, and UniProt/Viridiplantae, has been a useful tool for identifying important enzymes and TFs. However, these comparisons have also revealed significant differences in the suitability of the databases for genomic and proteomic analyses. Despite these differences, the EST trichome library has allowed identification of essential proteins, enriched in the A. annua trichomes, that are involved in biosynthesis and regulation of AN, as well as other related enzymatic processes (Bryant et al., 2015). Fortunately, the imminent release of the entire genome of A. annua is expected to resolve this challenge, benefit the scientific community and offer a better understanding of the genetic machinery regulating AN production.

Finally, further biofarming efforts should address different physiological aspects, such as the plant cell systems and compartments that might be used for large-scale production of AN or other useful metabolites. As with many other metabolites of high pharmaceutical value, AN is toxic to the plant itself, as it can inhibit cell division and tissue growth (Dayan, 1998). AN is therefore produced exclusively in the glandular trichomes of A. annua, since these are independent compartments isolated from the rest of the plant. Specifically, the AN biosynthetic enzymes are expressed in both apical and subapical cells of the trichomes, while AN and its precursors accumulate in the subcuticular cavity of the glandular trichomes. Future biofarming strategies should therefore also pay particular attention to the initiation and development of trichomes. Genetic engineering might also be a useful and sustainable tool for AN production, by modifying diverse physiological aspects such as leaf area, trichome number and density, and the morphology of the different cell types that form glandular trichomes. In conclusion, it is essential to keep in mind the use of different subcellular compartments, such as plastids, that could be used as efficient tools to greatly elevate AN production in different plant species. Studies of different plant species have further revealed that redirecting the mevalonate pathway away from the cytosol, to organelles such as chloroplasts and mitochondria, is a new and potent approach that can increase sesquiterpene production by 100- to 10,000-fold (van Herpen et al., 2010; Liu et al., 2011). The expression of foreign genes in the chloroplast also benefits from almost 10,000 genome copies per cell, does not require a signal peptide and avoids gene silencing (Bendich, 1987; Hasunuma et al., 2008; Kumar et al., 2012).
Consequently, this novel biofarming approach, if properly used, may have enormous potential for economically feasible and sustainable AN production.

#### AUTHOR CONTRIBUTIONS


GP and LM-H conceived and designed the research for this review. GP, SP, and LM-H wrote the manuscript. LM-H supervised the research and the writing of the manuscript.

#### FUNDING

This work was supported by grants from MINECO/FEDER (BIO2013-50388-EXP) to the SP research group, which has been recognized as a Consolidated Research Group by the Catalan Government (2014 SGR 1406).

#### ACKNOWLEDGMENTS

We thank Michela Osnato for critical reading of the manuscript and Amanda Gillies for English revision and editing.







**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Pulice, Pelaz and Matías-Hernández. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Putting the Spotlight Back on Plant Suspension Cultures

#### Rita B. Santos<sup>1</sup>, Rita Abranches<sup>1</sup>, Rainer Fischer<sup>2,3</sup>, Markus Sack<sup>3</sup> and Tanja Holland<sup>2</sup>\*

<sup>1</sup> Plant Cell Biology Laboratory, Universidade Nova de Lisboa, Instituto de Tecnologia Química e Biológica António Xavier, Oeiras, Portugal, <sup>2</sup> Fraunhofer-Institut für Molekularbiologie und Angewandte Oekologie (IME), Integrated Production Platforms, Aachen, Germany, <sup>3</sup> Biology VII, Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany

Plant cell suspension cultures have several advantages that make them suitable for the production of recombinant proteins. They can be cultivated under aseptic conditions using classical fermentation technology, they are easy to scale-up for manufacturing, and the regulatory requirements are similar to those established for well-characterized production systems based on microbial and mammalian cells. It is therefore no surprise that taliglucerase alfa (Elelyso®)—the first licensed recombinant pharmaceutical protein derived from plants—is produced in plant cell suspension cultures. But despite this breakthrough, plant cells are still largely neglected compared to transgenic plants and the more recent plant-based transient expression systems. Here, we revisit plant cell suspension cultures and highlight recent developments in the field that show how the rise of plant cells parallels that of Chinese hamster ovary cells, currently the most widespread and successful manufacturing platform for biologics. These developments include medium optimization, process engineering, statistical experimental designs, scale-up/scale-down models, and process analytical technologies. Significant yield increases for diverse target proteins will encourage a gold rush to adopt plant cells as a platform technology, and the first indications of this breakthrough are already on the horizon.

#### Edited by:

Domenico De Martinis, ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### Reviewed by:

Heiko Rischer, VTT Technical Research Centre of Finland Ltd., Finland

Karen Ann McDonald, University of California, Davis, USA

#### \*Correspondence:

Tanja Holland tanja.holland@ime.fraunhofer.de

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 15 December 2015; Accepted: 25 February 2016; Published: 11 March 2016

#### Citation:

Santos RB, Abranches R, Fischer R, Sack M and Holland T (2016) Putting the Spotlight Back on Plant Suspension Cultures. Front. Plant Sci. 7:297. doi: 10.3389/fpls.2016.00297

Keywords: plant suspension cultures, biopharmaceuticals, BY-2, protein production, plant cell cultures

# INTRODUCTION

Protein-based drugs are big business. The market for biopharmaceuticals is growing faster than the pharmaceuticals market as a whole, and a recent projection suggested the value of this segment could reach \$US 278.2 billion by 2020 (PMR, 2015). There are more than 200 approved biopharmaceuticals on the market today and many more in the clinical pipeline (Walsh, 2014). Currently, most biologics are produced in microbes or mammalian cells growing in fermenters. Microbes are simple and inexpensive but often fail to produce complex proteins or those requiring specific post-translational modifications, whereas mammalian cells can achieve these folding and modification tasks with aplomb but only at a much higher cost. Both systems also have the potential for undesirable contaminants—endotoxins in the case of bacteria, and viruses or other pathogens in the case of mammalian cells. The extra steps required during downstream processing to remove these contaminants can increase production costs even further (**BOX 1**).

The choice of expression hosts has more recently expanded to include plants because they offer unique features compared to the current dominant production systems (Stoger et al., 2014; Ma et al., 2015). The production of recombinant proteins in plants, where the protein itself is the desired product, is often described as molecular farming. If the proteins are pharmaceuticals then a bit of wordplay offers molecular pharming as an alternative. Plants combine the advantages of higher eukaryotic cells (efficient protein folding and post-translational modification) with the use of simple and inexpensive growth media. The diversity of molecular farming technologies is much greater than that of other production platforms, which can be advantageous or disadvantageous depending on the perspective (**BOX 2**).

#### BOX 1 | COMPARISON OF MAJOR PRODUCTION PLATFORMS

Industry platforms for the production of recombinant proteins are based mainly on microbes and mammalian cells. The major microbial system is the bacterium Escherichia coli, which was the first species used to produce a recombinant human protein (somatostatin; Itakura et al., 1977) and the first to be used for the production of a commercial therapeutic protein (recombinant human insulin, approved in 1982 and marketed by Eli Lilly & Co. under license from Genentech). Many simple and unmodified proteins are produced commercially in E. coli but more complex proteins are difficult to fold unless targeted to the periplasm, and this is not a scalable process (Baneyx and Mujacic, 2004; Choi and Lee, 2004). E. coli is simple and inexpensive but problems include the accumulation of proteins as insoluble inclusion bodies and the production of endotoxins that can cause septic shock. Yeasts are sometimes preferred because they share the advantages of bacteria but they are eukaryotes and thus support protein folding and modification, although the glycan chains are often longer than in mammals. Saccharomyces cerevisiae was the first yeast used to express recombinant proteins and it is still used commercially to produce a Hepatitis B virus vaccine, but other yeasts such as Pichia pastoris and Hansenula polymorpha are now favored during process development because they are more suitable for in-process inducible expression (Gerngross, 2004). Mammalian cells have dominated the biopharmaceutical industry since the 1990s because they can produce high titers (1–5 g/L) of complex proteins with mammalian glycan structures (Chu and Robinson, 2001). They are much more expensive than microbes but most pharmaceuticals are glycoproteins and the quality of the product is superior when mammalian cells are used. CHO cells are preferred by the industry but others that are widely used include the murine myeloma cell lines NS0 and SP2/0, BHK and HEK-293, and the human retinal line PER-C6. The major disadvantages of mammalian cells remain the cost of production and purification, and the risk of contamination with human pathogens.

#### BOX 2 | DIVERSITY OF MOLECULAR FARMING TECHNOLOGIES

The immense diversity of molecular farming systems reflects the fact that recombinant proteins have been produced in many different plant species wherein there is a choice of whole plants or various cell/tissue culture formats (Twyman et al., 2003, 2005). Each of these may be suitable for stable expression (including nuclear and plastid transformation in some species) and transient expression (which can be achieved using Agrobacterium tumefaciens, plant virus vectors or combinations of both; Paul et al., 2013). Transgenic terrestrial plants are the most established platform and following a period of extensive diversification the field has now consolidated mainly to support tobacco as the primary leafy crop and the cereals maize, rice, and barley (Nandi et al., 2005; Tremblay et al., 2010; Sabalza et al., 2013). The main difference between these platforms is that leaves are watery tissues and the recombinant protein must be extracted quickly to avoid degradation, whereas cereal seeds are desiccated and the protein remains stable for long periods. Cereal seeds are also suitable for direct oral administration. Aquatic plants such as duckweed and moss are also used as platforms (Reski et al., 2015). These have properties in common with terrestrial plants (differentiated whole plants) and cell suspension cultures (grown in containment in simple medium). The technology for aquatic plants and cell suspension cultures is similar but aquatic plants require light, whereas undifferentiated cell suspension cultures are grown in the dark but require a carbon source. After transgenic whole plants and cell suspension cultures, the third major technology platform is transient expression, which involves the introduction of non-integrating (episomal) vectors into leaves. The two main transient expression strategies are agroinfiltration, where leaves are infiltrated with A. tumefaciens by injection or vacuum leading to the transfection of millions of cells and the production of large amounts of recombinant protein in a short time (Komarova et al., 2010), and the use of recombinant plant viruses that infect cells directly, replicate within them and spread by cell-to-cell movement and systemic spreading through the vascular network to produce recombinant protein in every cell (Yusibov et al., 2006). A midway strategy that achieves biocontainment is the use of deconstructed virus genomes delivered by A. tumefaciens, which results in the transfection of many cells with the virus genome followed by its cell-to-cell movement but no systemic spreading (Peyret and Lomonossoff, 2015). All three major platforms have advantages and disadvantages: transgenic plants have a slow development cycle but are the most scalable; cell suspension cultures have a quick development cycle and allow contained production but are the least scalable; and transient expression allows the rapid production of high protein yields, ideal for emergencies such as vaccines and prophylactic antibodies, as seen in the recent outbreak of Ebola virus disease in West Africa (Arntzen, 2015), but the large number of bacteria introduced into the leaves increases the endotoxin load (Arfi et al., 2015).

One niche of molecular farming technology that is now coming back into the limelight is the use of plant cells, specifically plant cell suspension cultures, rather than whole plants (Doran, 2000; Hellwig et al., 2004). Although molecular farming conjures up images of greenhouses bursting with dense green leaves containing valuable pharmaceutical proteins, much of the technical and commercial progress made in molecular farming has been based on plant cells. These combine the advantages of plants with those of traditional fermenter systems: contained, controlled and sterile production environments, chemically defined media lacking animal components, and compatibility with the toughest regulatory guidelines in existence—pharmaceutical good manufacturing practice (GMP). Recent advances in process engineering have seen plant cells leap forward toward commercial viability much faster than the established platforms achieved during their own development phases. The first molecular farming product approved for human use is manufactured in plant cells—and this is only the beginning (Zimran et al., 2011; Tekoah et al., 2015).

#### PLANT CELL SUSPENSION CULTURES—PLATFORMS AND PRODUCTS

The production of recombinant proteins in plant cell suspension cultures was first demonstrated more than 25 years ago (Sijmons et al., 1990) but progress over the subsequent decade was overshadowed by whole plants, and only a small number of studies involving cultivated plant cells as production hosts were published before the turn of the century (**Table 1**). The status of plant cells began to change after the first bubble of commercial interest in molecular farming collapsed due to the absence of a regulatory pathway, the opposition to GM crops (particularly in Europe), and the lack of support from an industry already heavily invested in fermenters. Whereas some in the molecular farming community worked toward establishing regulations for pharmaceuticals derived from whole plants (Arfi et al., 2015; Ma et al., 2015; Sack et al., 2015), others realized that plant cells were already similar in many ways to microbial and mammalian cells and could be handled under the existing regulations (Ramachandra Rao and Ravishankar, 2002; Zimran et al., 2011).

The production of recombinant proteins in plant cell suspension cultures can be achieved by transforming wild-type cells already in suspension and selecting those carrying a co-introduced marker gene, or by initiating cultures from transgenic plants. As with other fermenter-based systems, the scalability of plant cell cultures is limited by the bioreactor capacity, but the product can be recovered from the medium, allowing continuous production, or it can be directed to a specific internal compartment if this is more appropriate (Schillberg et al., 2013). Although inducible promoters in plants allow production to be divided into a growth phase and a production phase, analogous to the inducible production systems used in bacteria and yeast, there is currently no counterpart of the amplification technologies used with mammalian cells, so the product yields in plant cells are much lower. However, the assembly of an artificial system in plants is conceivable (**BOX 3**).

Several platforms have emerged as contenders for a standardized production technology including cell suspension cultures derived from tobacco (Nicotiana tabacum), rice (Oryza sativa), and carrot (Daucus carota), which are the front runners today. The most widely used tobacco cell line is derived from the cultivar Bright Yellow 2 (BY-2). Tobacco BY-2 suspension cell cultures can multiply up to 100-fold within 7 days with a doubling time of 16–24 h under ideal conditions. The BY-2 cell line was developed in 1968 at the Hatano Tobacco Experimental Station, Japan Tobacco Company (Kato et al., 1972). The transformation of BY-2 cells using A. tumefaciens is highly efficient (Nagata et al., 1992) and therefore many different products have been successfully produced using this system (**Table 1**). One of the drawbacks of molecular farming in whole tobacco plants is that the leaves contain nicotine, but BY-2 cells do not produce significant amounts of this metabolite even when induced by jasmonates, and instead produce the related compound anatabine as well as low levels of other alkaloids (Shoji and Hashimoto, 2008).

Rice cell suspension cultures are used almost as widely as tobacco BY-2 cells due to the availability of the carbohydrate-sensitive α-amylase promoter system (RAmy3D) that works in synchronization with the fermentation cycle. This promoter is induced by sugar starvation, and gene expression can therefore be optimized by timing media exchanges so that cells are exposed to consecutive growth and production phases (Lee et al., 2007). Most rice varieties can be dedifferentiated but Japonica varieties appear more amenable than Indica varieties, such that callus can easily be produced from almost every part of the plant. Rice cell suspension cultures have a doubling time of 1.5–1.7 days (Trexler et al., 2005). Many pharmaceutical products have been expressed in rice cells (**Table 1**) and at least one major company has adopted rice cells as an industrial production platform, albeit for non-pharmaceutical-grade cosmetics ingredients and research reagents (Natural Bio-Materials, Jeollabuk-do, Korea; http://www.nbms.co.kr/).
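The growth figures quoted for tobacco BY-2 and rice cells can be related to one another with a short calculation. The sketch below is a back-of-the-envelope illustration that assumes purely exponential growth with no lag or stationary phase, which real batch cultures do not strictly follow; the function names are introduced here for illustration only.

```python
import math


def implied_doubling_time_h(fold_increase, hours):
    """Average doubling time (h) implied by an overall fold increase."""
    return hours / math.log2(fold_increase)


def fold_increase(hours, doubling_time_h):
    """Fold increase after `hours` of uninterrupted exponential growth."""
    return 2 ** (hours / doubling_time_h)


WEEK_H = 7 * 24  # 7 days expressed in hours

# Tobacco BY-2: "up to 100-fold within 7 days"
by2_avg_doubling_h = implied_doubling_time_h(100, WEEK_H)

# Tobacco BY-2: upper bounds if the 16-24 h doubling time held for the whole week
by2_fold_at_24h = fold_increase(WEEK_H, 24)
by2_fold_at_16h = fold_increase(WEEK_H, 16)

# Rice: doubling time of 1.5-1.7 days
rice_fold_fast = fold_increase(WEEK_H, 1.5 * 24)
rice_fold_slow = fold_increase(WEEK_H, 1.7 * 24)

print(f"BY-2 average doubling time for 100-fold in 7 days: {by2_avg_doubling_h:.1f} h")
print(f"BY-2 fold increase in 7 days at 24 h / 16 h doubling: "
      f"{by2_fold_at_24h:.0f} / {by2_fold_at_16h:.0f}")
print(f"Rice fold increase in 7 days: {rice_fold_slow:.0f} to {rice_fold_fast:.0f}")
```

Under this simplification, a 100-fold increase in 7 days corresponds to an average doubling time of roughly 25 h, slightly slower than the 16–24 h quoted for ideal conditions. One plausible reading, offered here as an assumption, is that the shorter doubling times apply only to the exponential phase of the culture cycle.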

Carrot cell lines can be derived from hypocotyl, epicotyl, or cotyledon tissues. The transformation of carrot cells can be achieved by co-cultivation with A. tumefaciens, particle bombardment or the electroporation of protoplasts (Rosales-Mendoza and Tello-Olea, 2015). The first plant-derived biopharmaceutical protein approved by the FDA for human use was taliglucerase alfa, produced in carrot cell suspension cultures by the Israeli company Protalix Biotherapeutics (http://www.protalix.com) and licensed by Pfizer (**Table 1**).

In addition to these three commercially-relevant platforms, several other plant species have been used to produce cell suspension cultures for molecular farming. The model legume Medicago truncatula (Abranches et al., 2005) is typically used for the analysis of secondary metabolism (Cook, 1999; Broeckling et al., 2005) but this species has been developed more recently for molecular farming because suspension cells can be derived from the mature leaf, root and seedling cotyledon, and transformation is highly efficient (Araujo et al., 2004). A cell line derived from this species was shown to achieve high recombinant protein yields (Pires et al., 2008) although only two biopharmaceutical products have been reported thus far (Pires et al., 2012, 2014). Other proteins have been produced in cell lines derived from tomato (Solanum lycopersicum), soybean (Glycine max), potato (Solanum tuberosum), sunflower (Helianthus annuus), sweet potato (Ipomoea batatas), and medicinal plants such as Siberian ginseng (Eleutherococcus senticosus) and Korean ginseng (Panax ginseng). Biopharmaceuticals produced in these species are listed in **Table 1**.

TABLE 1 | Biopharmaceuticals produced in different plant cell suspension cultures.

SN, supernatant; TSP, total soluble protein; FW, fresh weight.

#### BOX 3 | THE CHO AMPLIFICATION SYSTEM AND CAN WE REPLICATE IT IN PLANTS?

The CHO system is the most widely used mammalian cell line platform in the industry because it was the first to market and is therefore backed by years of cumulative experience and process optimization, and it is compatible with serum-free medium which reduces the potential bioburden (Wurm, 2004). Most of all it has a highly effective gene amplification system, paired with an unstable genome that facilitates amplification and other genetic changes (Cacciatore et al., 2010). This was discovered accidentally when rare individual CHO cells were shown to survive toxic concentrations of the drug methotrexate, which inhibits the enzyme dihydrofolate reductase (DHFR). The analysis of surviving cells showed that some carried point mutations conferring resistance but others contained multiple copies of the dhfr locus and produced enough of the enzyme to outcompete the inhibitor.

Stepwise selection at higher concentrations isolated cells with massively amplified dhfr gene arrays allowing survival at 10,000 times the normal toxic dose of methotrexate. The amplified genes were present as homogeneously staining regions within chromosomes or as small extra chromosomes called double minutes. Importantly, these arrays contain flanking regions as well as the dhfr gene itself, so adjacent genes can also be amplified even though they do not contribute to the methotrexate-resistant phenotype (Cacciatore et al., 2010). The current industry CHO platform is based on the mutant cell line DG44, which lacks an endogenous dhfr gene. This cell line is transfected with a tandem dhfr-X construct, where X encodes the desired recombinant protein. Both genes are amplified under selection and the yield of the recombinant protein is boosted substantially. Many different amplifiable markers have been identified but only dhfr-methotrexate and glutamine synthetase-methionine sulfoximine are used for commercial pharmaceutical production.

There is no equivalent amplifiable marker system in plants although many of the markers which work in mammalian cells as amplifiable markers can be used for simple one-step selection in plants, including dhfr (Eichholtz et al., 1987). The failure of amplifiable selection therefore suggests that plants lack an intrinsic ability to generate massive arrays of small regions of the genome under selection, which indicates a difference in the capacity for homologous recombination. Such differences have been observed before, and explain the difference in gene targeting efficiency between mammals and plants (Puchta and Fauser, 2013). One potential solution to this issue is the use of extrachromosomal replicating vectors for amplification in plants, as reported by Regnard et al. (2010). Even without amplification, plant cells are moving toward parity with mammalian cells. For example, cell-specific production rates of 8 pg/cell/day have been reported for the monoclonal antibody M12 produced in tobacco BY-2 cells (Havenith et al., 2014) compared to typical production rates of 20–40 pg/cell/day for CHO cells carrying thousands of gene copies, showing that the difference between these systems is less than an order of magnitude.
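The productivity comparison made in Box 3 (8 pg/cell/day for antibody M12 in tobacco BY-2 cells versus 20–40 pg/cell/day for CHO cells) can be checked with simple arithmetic. The sketch below is illustrative only: the rates are those quoted in the text, whereas the cell density used to convert a cell-specific rate into a volumetric rate is a hypothetical round number chosen for the example, not a value from the cited studies.

```python
import math

BY2_QP = 8.0                 # pg/cell/day, M12 antibody in tobacco BY-2 (from the text)
CHO_QP_RANGE = (20.0, 40.0)  # pg/cell/day, typical CHO range (from the text)


def fold_gap(reference_qp, other_qp):
    """How many times higher `other_qp` is than `reference_qp`."""
    return other_qp / reference_qp


def volumetric_rate_mg_per_l_day(qp_pg_per_cell_day, cells_per_ml):
    """Convert a cell-specific rate into a volumetric rate.

    pg/cell/day * cells/mL = pg/mL/day, and 1 pg/mL equals 1e-6 mg/L.
    """
    return qp_pg_per_cell_day * cells_per_ml * 1e-6


gaps = [fold_gap(BY2_QP, qp) for qp in CHO_QP_RANGE]
log10_gaps = [math.log10(gap) for gap in gaps]
less_than_one_order = all(gap < 10 for gap in gaps)

# Hypothetical cell density, for illustration only
ASSUMED_CELLS_PER_ML = 1e6
by2_volumetric = volumetric_rate_mg_per_l_day(BY2_QP, ASSUMED_CELLS_PER_ML)

print(f"CHO/BY-2 fold gap: {gaps[0]:.1f}x to {gaps[1]:.1f}x")
print(f"log10 of the gap: {log10_gaps[0]:.2f} to {log10_gaps[1]:.2f}")
print(f"Less than one order of magnitude: {less_than_one_order}")
print(f"BY-2 volumetric rate at the assumed density: {by2_volumetric:.1f} mg/L/day")
```

The gap of 2.5- to 5-fold corresponds to about 0.4–0.7 log units, in line with the statement that the difference is less than an order of magnitude. Actual volumetric yields would also depend on the cell densities and culture durations each system can sustain, which this sketch does not attempt to model.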

#### PROGRESS AND CHALLENGES

Although plant cells are relative newcomers in the commercial environment and the yields they achieve still lag behind those of microbes and mammalian cells, it is important to remember that the yields produced by microbes and mammalian cells have increased substantially during their 30 years as industrial platform leaders. These increases have been achieved incrementally by several routes, including strain optimization, genetic modification to improve production characteristics, medium optimization, and process engineering (e.g., bioreactor design and fermentation conditions). In contrast, plant cells have been used commercially for less than 10 years and already the improvements have been striking, mainly because the lessons learned during the development of microbial and mammalian cell platforms have been applied to plant cells comparatively much earlier in their history as a platform technology, and novel approaches adopted by the industry more recently have been used with plant cells immediately, and implemented during early process development. These include strategies such as high-throughput clone selection, medium and process optimization using statistical experimental designs (typically design-of-experiments approaches) and the application of in-process monitoring systems, known as process analytical technology (PAT). Given the lead time, it is clear that current industry-standard mammalian cell lines such as Chinese hamster ovary (CHO) cells will remain superior in terms of overall yields for some time to come, but plant cells are gaining ground due to the many advantageous properties they offer (**Table 2**). The main challenges that plant cells still face are the absence of a gene amplification system comparable to the systems used with CHO cells (**BOX 3**) and the convenience of handling issues that are important for GMP compliance, such as cryopreservation and cell banking (Eck and Keen, 2009; Mustafa et al., 2011).

# SPECIFIC CHALLENGES—CELL CLUSTERS, GROWTH CHARACTERISTICS, AND CULTURE HETEROGENEITY

Almost all plant cell suspension cultures share one key property that sets them aside from microbial and mammalian cells—they do not grow as single cells but instead form clusters (Mavituna and Park, 1987; Nagata et al., 2013). Moreover, plant cells can grow significantly by elongation, increasing the volume and wet biomass without increasing the cell number. Both issues must be addressed by adopting specific methodologies. Although cell clusters can be advantageous, e.g., aggregation can be used as the basis for self-immobilization methods (Kieran et al., 1997; Kolewe et al., 2011), large cell clusters are generally undesirable because cells in the center may have limited oxygen and nutrient availability.

Cell clusters are also challenging during the generation of transgenic cell lines because monoclonal cultures cannot be generated by plating or limiting dilution. Cell suspension cultures generated de novo by the transformation of wild-type cells are always polyclonal because transformation is not 100% efficient and different cells can be transformed at different loci (Muller et al., 1996; Nocarova and Fischer, 2009). Even cell lines derived from transgenic callus are rarely monoclonal because the callus tissue may be chimeric. In both cases, the resulting transgenic cell lines can also undergo somaclonal variation, generating cell populations with heterogeneous expression levels (James and Lee, 2006). Therefore, even if an advanced technology such as the CRISPR/Cas9 system is used to specify a targeted integration site, one or more rounds of screening and selection are still necessary to identify and isolate the most productive cells to seed monoclonal production lines. Screening can be carried out at the callus stage and the use of fluorescent marker proteins facilitates the identification of chimeric callus tissue, allowing the selection of cell material for sequential rounds of subculturing (**Figure 1**). An alternative is the preparation of protoplasts with subsequent selection by flow cytometry, although single protoplasts are fragile and plating on feeder cells is often required. Recently, flow sorting has been used to separate the most productive cells from a heterogeneous tobacco BY-2 cell culture producing a full-length human antibody, by selecting the co-expressed fluorescent marker protein DsRed located on the same T-DNA (Kirchhoff et al., 2012). Using a feeder strategy, single protoplasts selected by flow cytometry were regenerated into stable monoclonal cell lines with homogeneous DsRed fluorescence and antibody yields up to 13-fold higher than the parent culture.

Overall cost: Low, \$20–100/g; Medium, \$50–1000/g; High, \$1000–10,000/g.

normal white light, (B,D) Images were taken under green light with a red filter for the macroscopic visualization of DsRed fluorescence.

The productivity and growth characteristics of cell lines at the callus stage and in suspension are often unrelated, which raises additional challenges. Although fluorescent marker proteins can be used to screen callus tissue, it is good practice to continue screening the suspension cells under realistic production conditions to ensure a compromise between protein production and cell growth rates. CHO cell lines also display idiosyncratic behavior with respect to stability, media requirements and other process performance parameters. The current industry solution is to screen a sufficiently large number of clones under rigorous selection criteria to ensure that high-performance clones are identified, and this strategy is equally applicable to plant cell suspension cultures.

Because plant cells are large and tend to grow in clusters, it can be difficult to determine accurate cell densities. Plant cells are 50–200 µm in length and range in morphology from spherical to cylindrical depending on the growth phase. Cells in the exponential growth phase undergoing rapid division are spherical or elliptical, with a length of 50–100 µm, whereas those at the end of the exponential growth phase grow mainly by elongation and tend to be more cylindrical, with a length of up to 200 µm (Mavituna and Park, 1987; Holland et al., 2013). Aggregation occurs when daughter cells fail to separate after cell division, and is promoted by extracellular polysaccharides. The tendency to form clumps varies between cell lines and depends on the age of the cells and the growth conditions. Cell counting is the most precise method to establish cell density but it becomes more difficult when the cells clump together. Alternative methods such as the measurement of turbidity or light scattering are also unsuitable due to the size of the clumps. Therefore, the density of plant cell suspension cultures is often determined by measuring the packed cell volume or wet cell weight after gentle centrifugation and aspiration of the supernatant. Alternatively, the pellet can be dried and cell density can be extrapolated from the dry weight. However, these are invasive and destructive off-line procedures. The use of noninvasive radio frequency impedance spectroscopy (RFIS) offers significant benefits because it can achieve continuous in-line real-time measurement suitable for PAT. Although RFIS measures the volume of viable cells, this parameter correlates well with the packed cell volume, wet cell weight, and dry biomass weight. Continuous measurement can also pinpoint the transition from cell division to cell elongation (Holland et al., 2013).

# SPECIFIC CHALLENGES—MEDIUM OPTIMIZATION

The productivity of cell suspension cultures can be improved by optimizing the expression construct and by selecting highly productive monoclonal cultures, but it is also necessary to optimize the culture conditions starting with the growth medium (Schillberg et al., 2013). Unlike whole plants, cell suspension cultures are not phototrophic so they require a carbon source. Plant cell media therefore usually contain sucrose, inorganic salts, vitamins, plant hormones and water, and a wide range of different media are commercially available depending on the species, growth characteristics, and purpose of the cultivation (Fawcett, 1954; Murashige and Skoog, 1962; Gamborg et al., 1968). In many cases, the growth medium is a variant of the MS recipe developed by Murashige and Skoog (1962), which provides nitrogen as a mixture of nitrate and ammonium salts. However, the addition of more nitrogen to MS medium can improve the productivity of BY-2 cells by up to 150-fold in the stationary phase, ultimately improving yields of recombinant proteins by up to 20-fold (Holland et al., 2010; Ullisch et al., 2012). In contrast to CHO cells, where medium optimization needs to be done for each product and cell line on a case-by-case basis (Wurm, 2004), for plant cell cultures it appears to similarly benefit all products (Holland et al., 2010; Ullisch et al., 2012). This was one catalyst for the introduction of statistical experimental designs that can simultaneously test the impact of varying several different medium components, as well as other conditions such as pH, temperature and aeration rate. Accordingly, product-specific medium optimization achieved a five-fold increase in the yield of a recombinant antibody produced by tobacco BY-2 cells following the application of a statistical experimental design (Vasilev et al., 2013).
The impact of changes in medium composition during fermentation, and the introduction of compensatory in-line adjustments, has also boosted product yields substantially. For example, a respiration activity monitoring system (RAMOS) revealed metabolic changes in cultivated BY-2 cells caused by ammonia depletion, and the replacement of this missing ammonia resulted in a 100% increase in product yields (Ullisch et al., 2012).
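
As a minimal sketch of how a statistical experimental design enumerates combinations of medium components and process conditions, the following Python snippet builds a two-level full factorial design. The factor names and levels are hypothetical placeholders and are not taken from the cited studies, which may have used fractional or response-surface designs generated with dedicated software.

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts.

    `factors` maps a factor name to the sequence of levels to be tested.
    """
    names = list(factors)
    level_sets = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_sets)]


# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "nitrate_mM": (20, 40),
    "sucrose_g_per_L": (20, 30),
    "pH": (5.4, 5.8),
}

design = full_factorial(factors)
for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

Three factors at two levels give eight runs, so each factor is tested at both levels across every combination of the others. With more factors, the number of runs doubles for each addition, which is one reason fractional designs are often preferred in practice.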

## SPECIFIC CHALLENGES—PROTEIN DEGRADATION

Degradation caused by intracellular and extracellular proteases reduces the yield and quality of biopharmaceuticals produced in plant cells, and extra purification steps are required to remove degradation products. Extracellular degradation can be avoided by targeting the protein to accumulate within an intracellular compartment, and the endoplasmic reticulum (ER) is often used for this purpose because complex proteins fold efficiently and accumulate to higher levels than those secreted to the apoplast, i.e., the space under the cell wall (Twyman et al., 2013). However, the benefits of intracellular accumulation must be balanced against two drawbacks—the need to extract the protein by breaking the cell, which releases more contaminants (including proteases) during downstream processing (Buyel et al., 2015), and the impact on glycosylation, which is discussed in the next section. A better approach is to allow secretion but to counter the effect of proteases directly. Plants produce hundreds of proteases and it is not always possible to identify which is responsible for degrading a recombinant protein, particularly because different products are susceptible to different protease classes (Mandal et al., 2010, 2014; Navarre et al., 2012; Niemer et al., 2014). If a particular protease can be identified then it may be possible to knock out the corresponding gene or co-express a protease inhibitor to prevent product degradation (Kim et al., 2008b; Benchabane et al., 2009). Decoy proteins such as gelatin or bovine serum albumin can also be added to the medium but proteinaceous additives from animal sources must be evaluated carefully because they pose a risk of contamination with prions, thus compromising the economic and regulatory advantages of plant cells (James et al., 2000; Baur et al., 2005). 
Non-protein additives such as polyvinylpyrrolidone (Magnuson et al., 1996; LaCount et al., 1997), Pluronic F-68 and polyethylene glycol (Lee and Kim, 2002) can also reduce the damage caused by proteases but may be difficult to remove in subsequent processing steps (Baur et al., 2005). Osmotic stress has also been proposed to increase product accumulation, although this may inhibit cell growth so the timing of application must be optimized carefully (Tsoi and Doran, 2002; Soderquist and Lee, 2005). Medium optimization and process control are promising tools to avoid protein degradation during production—for example, a balanced supply of nitrogen not only dramatically increases the amount of a secreted antibody but also stabilizes the secreted product toward the end of the cultivation process (Holland et al., 2010).

#### SPECIFIC CHALLENGES—PLANT GLYCANS

The early steps of protein glycosylation in plants and mammals are identical, but once a nascent protein moves from the ER to the Golgi apparatus subtle differences in the oligosaccharide structures begin to appear. Plant glycoproteins tend to contain core α1,3-fucose (rather than core α1,6-fucose which is present in mammals) and core β1,2-xylose, whereas mammalian glycoproteins contain β1,4-galactose and terminal sialic acid residues that are not present in plants (Gomord et al., 2010). Initially there was concern that plant glycans could be immunogenic in humans and much effort was expended to ensure that plant glycans were avoided. This involved either targeting the proteins to be retained in the ER resulting in generic high-mannose glycans, or engineering plant lines in which the glycosylation pathway was modified to abolish the enzymes responsible for plant glycans and, in some cases, introduce enzymes that produced human-like glycans instead (Castilho and Steinkellner, 2012; Bosch et al., 2013). The glycan panic has since abated given the lack of evidence that plant glycans are harmful in humans (Gomord et al., 2010). The first-in-class Protalix drug taliglucerase alfa (**Table 1**) contains the aforementioned core α1,3-fucose and β1,2-xylose residues but no adverse effects have been reported in clinical trials or post-market use (Tekoah et al., 2015).

Although plant glycans can affect the properties of recombinant proteins, including stability and functionality, in some cases the plant-derived version is superior: not biosimilar but bio-better. In the case of taliglucerase alfa, targeting the protein to the vacuole of carrot cells exposes terminal mannose residues that are required for the efficient uptake of the enzyme into macrophages by mannose receptors. The equivalent protein produced in CHO cells (imiglucerase, marketed as Cerezyme®) has terminal sialic acid residues that prevent uptake, and these must be enzymatically removed in vitro during processing, which increases the production costs dramatically. The comparison between taliglucerase alfa and imiglucerase also highlights the safety advantages of plant cells. The production of imiglucerase in CHO cells by Genzyme was shut down for a significant time due to viral contamination in the production plant (European Medicines Agency, 2009). This resulted in an acute shortage of the product because the plant was responsible for ∼20% of the global supply at that time (Hollak et al., 2010).

#### SPECIFIC CHALLENGES—UPSTREAM PROCESSING STRATEGIES

Plant cell cultures are often successful in the laboratory because they can be grown in well-aerated shake flasks and the products can be extracted in small volumes of buffer, allowing the use of protease inhibitors and other expensive additives that cannot be used at the process scale. Cell line selection and optimization also tend to be carried out using small flasks or even microtiter plates, so scaling up production is a significant challenge (Fischer et al., 2015). Plant cell cultures have been cultivated in many different bioreactors, including stirred-tank reactors (STRs; Hooker et al., 1990; Doran, 1999; Trexler et al., 2002), wave reactors (Eibl and Eibl, 2006), wave and undertow reactors (Terrier et al., 2007), bubble column reactors (Terrier et al., 2007), single-use bubble column reactors (Shaaltiel et al., 2007), airlift reactors (Wen Su et al., 1996), membrane reactors (McDonald et al., 2005), and rotating drum reactors (Tanaka et al., 1983). The homogeneous nature of plant cell suspension cultures requires a fermentation broth similar to that used for microbes and mammalian cells, which is fed with nutrients and oxygen and mixed to achieve an even distribution (Hellwig et al., 2004). In many reports, stirred-tank bioreactors with large impellers and a ring sparger have been used to reduce shear stress, delivering maximum biomass values of 60–70% packed cell volume. In the context of cell suspension cultures the bioreactor must be matched to the production line to synergize with its biological characteristics, e.g., growth rate, morphology, aggregation tendency, shear sensitivity, oxygen demand, and rheological properties. The relative merits of the different reactor designs have been intensively discussed and reviewed elsewhere (Xu et al., 2010; Huang and McDonald, 2012).

Cultivation in these reactor designs covers a range of volumes and can follow three main fermentation strategies: batch, fed-batch, and continuous processes. Batch fermentation is the simplest and most commonly used process. The reactor is filled with medium and inoculated, after which it is a closed system except for a few additions such as oxygen and the base or acid used to control the pH. The cell culture passes through lag, exponential, and stationary growth phases, and cell growth generally occurs under varying and sometimes unfavorable conditions. A more advanced mode is fed-batch fermentation, which starts with a classical batch phase, and once certain conditions are reached the feed is started, i.e., additional nutrients are provided. In fed-batch fermentation, several culture parameters also change and the cells pass through the classical growth phases. Furthermore, batch and fed-batch fermentations suffer from a low ratio of running time to set-up time (preparation and sterilization before the cultivation, and cleaning afterwards), which can be compensated by investment in personnel and infrastructure. To overcome this low ratio, different continuous fermentation strategies for plant cell cultures have been developed. The classical continuous fermentation strategies are perfusion and chemostat processes. In perfusion processes, the cells in the reactor are supplied with a continuous feed of fresh medium and the same volume of cell-free fermentation broth is constantly removed. Perfusion fermentation has been successfully realized with plant cell cultures (Su and Arias, 2003; De Dobbeleer et al., 2006). In a chemostat fermentation, the culture is also supplied with a continuous feed of fresh medium, but in this case the equal volume of fermentation broth that is removed also contains cells. Thus, in a perfusion process the cell density increases, whereas in a chemostat the cell density stays constant.
A chemostat cultivation can run over a long time period and the cells are maintained in the exponential growth phase, which makes this strategy attractive for large-scale production (Miller et al., 1968; van Gulik et al., 2001). Despite the ideal characteristics of the continuous bioreactor, the process itself is sensitive and subject to the influence of various factors such as the risk of contamination, genetic instability, and changes in the biotic phase of the bioreactor. To avoid these problems, semi-continuous fermentation strategies have been developed, in which a fraction of the fermentation broth is removed in one step and replaced by fresh medium (Hogue et al., 1990; Huang et al., 2010). The merits of the different cultivation strategies are summarized and discussed in detail by Xu et al. (2011). Recent examples of large-scale cultivation include the manufacture of antibodies in tobacco BY-2 cells, which has been scaled up from shake flasks to 200-L disposable bioreactors without loss of yield (Raven et al., 2015). Scaling up from 50-ml shake flasks to 600-L bioreactors (a factor of 12,000) has also been achieved without any impact on growth characteristics (Reuter et al., 2014). Taliglucerase alfa and other products in the Protalix pipeline are produced in carrot cells cultivated in bubble column-type bioreactors fitted with disposable polyethylene bags (Shaaltiel et al., 2007; Tekoah et al., 2015).
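
The contrast between chemostat and perfusion operation can be illustrated with a simple mass balance. The following Python sketch assumes Monod-type growth on a single limiting substrate and uses a hypothetical `retention` parameter (0 for a chemostat, 1 for an idealized perfusion with complete cell retention). All kinetic parameters are illustrative assumptions, not values measured for plant cells, and the explicit Euler integration is chosen for brevity rather than accuracy.

```python
def simulate(retention, days=30.0, dt=0.005,
             mu_max=0.5, ks=1.0, yield_xs=0.5,
             s_in=30.0, dilution=0.2, x0=1.0, s0=30.0):
    """Euler integration of a Monod-type continuous culture.

    retention = 0.0 models a chemostat (cells leave with the removed broth);
    retention = 1.0 models an idealized perfusion (all cells are retained).
    Rates are per day, concentrations in g/L. Returns the biomass trajectory.
    """
    x, s = x0, s0
    biomass = [x]
    for _ in range(int(round(days / dt))):
        mu = mu_max * s / (ks + s)                      # specific growth rate
        dx = (mu - dilution * (1.0 - retention)) * x    # growth minus washout
        ds = dilution * (s_in - s) - mu * x / yield_xs  # feed minus consumption
        x += dx * dt
        s = max(s + ds * dt, 0.0)                       # guard against negative substrate
        biomass.append(x)
    return biomass


chemostat = simulate(retention=0.0)
perfusion = simulate(retention=1.0)

# Analytical chemostat steady state for the same assumed parameters:
# growth rate equals dilution rate, which fixes the residual substrate.
s_star = 1.0 * 0.2 / (0.5 - 0.2)
x_star = 0.5 * (30.0 - s_star)

print(f"chemostat final biomass: {chemostat[-1]:.2f} g/L (steady state {x_star:.2f})")
print(f"perfusion final biomass: {perfusion[-1]:.2f} g/L and still rising")
```

Under these assumptions the chemostat settles where the specific growth rate equals the dilution rate, whereas the perfusion culture keeps accumulating biomass because cells are not removed with the spent medium.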

#### SPECIFIC CHALLENGES—CELL BANKING

To meet regulatory requirements, it will be necessary to ensure cell line stability over the entire duration of the production process. Many cell cultures are maintained by a weekly subculturing routine and are stable over long time periods, but only a few early studies have addressed long-term production stability (Gao et al., 1991; Sierra et al., 1992; Kirchhoff et al., 2012). Cell banking for the supply of well-defined starting material, and a routine procedure for the cryopreservation of plant cells, are key features required to enable plant suspension cultures as a biopharmaceutical production platform. The first successful cryopreservations of plant cell cultures were reported in the late 1960s and 1970s (Quatrano, 1968; Nag and Street, 1973). Since then, several protocols have been developed for particular cell lines, e.g., BY-2 and other tobacco cell lines (Menges and Murray, 2004; Schmale et al., 2006) or Arabidopsis cells (Menges and Murray, 2004; Ogawa et al., 2008). Although different techniques have been published, including desiccation (Nitzsche, 1980), vitrification (Uragami et al., 1989), and encapsulation-dehydration cryopreservation (Bonnart and Volk, 2010), there is no protocol that can be applied generally to cell suspensions of all plant species, and all protocols need to be optimized on a case-by-case basis. Only a few protocols have been verified for different species (Ogawa et al., 2012).

#### CONCLUSIONS

In the near future, plant cell suspension cultures will most certainly become the preferred choice among plant-based systems for the production of high-value recombinant proteins, because they combine the advantages of all other systems. Although plant cells have been overshadowed by whole-plant platforms, this trend has been inverted following the approval of taliglucerase alfa for use in human adults in 2012 and then for pediatric use in 2014 (Tekoah et al., 2015). This has opened the way to the full acceptance of this technology, and several other products are now undergoing clinical trials and are expected to reach the market in the near future. The significant number of drugs that are now coming off patent will contribute to this market expansion.

Plant-based systems still face one major bottleneck that needs to be overcome: their lower yields compared to mammalian cell cultures. This partly reflects the much more recent emergence of plant cells as a competitive platform, so there has been less investment thus far in strain, medium and process optimization compared to mammalian cells. However, there are many researchers currently working to address this challenge, and several recent reports discussed herein have made breakthroughs in the development of robust upstream production and downstream processing strategies. These developments include medium optimization, process engineering, statistical experimental designs, scale-up/scale-down models, and process analytical technologies. Overall, these optimization procedures will lead to higher yields and will put plant cell cultures back into the spotlight. Other factors that will also contribute to the success of plant cells include the straightforward compliance with GMP compared with whole plants, and the better public acceptance of biopharmaceuticals produced in cultivated cells than GM plants.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication. All authors contributed equally to the Introduction, Progress and Challenges, and Conclusions sections. RS and RA contributed to the Specific Challenges sections on cell clusters, growth characteristics, and culture heterogeneity, and on protein degradation, and designed and prepared the tables. TH, MS, and RF contributed to the Specific Challenges sections on medium optimization, plant glycans, upstream processing, and cell banking.

#### FUNDING

This work was funded by Fundação para a Ciência e Tecnologia (FCT, Portugal) through grants ERA-IB/0001/2012, PTDC/BIA-PLA/2411/2012, and UID/Multi/04551/2013 and by the Fraunhofer Future Foundation project "Innovative technologies to manufacture ground-breaking biopharmaceutical products in microbes and plants" (125-300004).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Santos, Abranches, Fischer, Sack and Holland. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

#### *Naomichi Fujiuchi<sup>1</sup>, Nobuyuki Matoba<sup>2,3</sup> and Ryo Matsuda<sup>1</sup>\**

*<sup>1</sup>Department of Biological and Environmental Engineering, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan, <sup>2</sup>Owensboro Cancer Research Program, James Graham Brown Cancer Center, University of Louisville School of Medicine, Owensboro, KY, USA, <sup>3</sup>Department of Pharmacology and Toxicology, University of Louisville School of Medicine, Louisville, KY, USA*

#### *Edited by:*

*Edward Rybicki, University of Cape Town, South Africa*

#### *Reviewed by:*

*Surinder Singh, Commonwealth Scientific and Industrial Research Organisation, Australia Markus Sack, RWTH Aachen University, Germany*

*\*Correspondence: Ryo Matsuda amatsuda@mail.ecc.u-tokyo.ac.jp*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Bioengineering and Biotechnology*

*Received: 15 November 2015 Accepted: 22 February 2016 Published: 08 March 2016*

#### *Citation:*

*Fujiuchi N, Matoba N and Matsuda R (2016) Environment Control to Improve Recombinant Protein Yields in Plants Based on Agrobacterium-Mediated Transient Gene Expression. Front. Bioeng. Biotechnol. 4:23. doi: 10.3389/fbioe.2016.00023*

*Agrobacterium*-mediated transient expression systems enable plants to produce a wide range of recombinant proteins on a rapid timescale. To achieve economically feasible upstream production and downstream processing, two yield parameters should be considered: (1) recombinant protein content per unit biomass and (2) recombinant protein productivity per unit area–time at the end of the upstream production. Because environmental factors in the upstream production have impacts on these parameters, environment control is important to maximize the recombinant protein yield. In this review, we summarize the effects of pre- and postinoculation environmental factors in the upstream production on the yield parameters and discuss the basic concept of environment control for plant-based transient expression systems. Preinoculation environmental factors associated with planting density, light quality, and nutrient supply affect plant characteristics, such as biomass and morphology, which in turn affect recombinant protein content and productivity. Accordingly, environment control for such plant characteristics has significant implications to achieve a high yield. On the other hand, postinoculation environmental factors, such as temperature, light intensity, and humidity, have been shown to affect recombinant protein content. Considering that recombinant protein production in *Agrobacterium*-mediated transient expression systems is a result of a series of complex biological events starting from T-DNA transfer from *Agrobacterium tumefaciens* to protein biosynthesis and accumulation in leaf tissue, we propose that dynamic environment control during the postinoculation process, i.e., changing environmental conditions at an appropriate timing for each event, may be a promising approach to obtain a high yield. 
Detailed descriptions of plant growth conditions and careful examination of environmental effects will significantly contribute to our knowledge to stably obtain high recombinant protein content and productivity, thus enhancing the utility of plant-based transient expression systems as recombinant protein factories.

Keywords: plant molecular farming, upstream production, plant biomass, plant morphology, T-DNA transfer, plant host stress responses

# INTRODUCTION

Plants offer several benefits for recombinant protein production, including the potential for low-cost and large-scale biomass production, the low risk of contamination with human pathogens or toxins, and the ability to perform posttranslational protein modifications (Fischer et al., 2013). Transient expression systems using *Agrobacterium tumefaciens* binary vectors enable rapid production of milligram to gram quantities of recombinant proteins in whole leafy plants, such as *Nicotiana* species (D'Aoust et al., 2010; Matoba et al., 2011; Whaley et al., 2011; Gleba et al., 2014). Pharmaceutical-grade plant-based recombinant proteins can be produced at a commercial scale in closed facilities enabling good manufacturing practice-compliant processes (Fischer et al., 2012; Warzecha, 2012). Given these benefits, transient expression systems have opened a new avenue for the production of niche biopharmaceuticals and low-cost enzymes, such as individualized vaccines (McCormick et al., 2008), emerging disease vaccines (Shoji et al., 2009; D'Aoust et al., 2010), industrial enzymes (Hwang et al., 2012), and research reagents (Pogue et al., 2010). However, transient expression systems have various challenges that are inherent to the use of whole plants. These include batch-to-batch variation of recombinant protein yields (Fischer et al., 2012; Wilken and Nikolov, 2012; Twyman et al., 2013) and unique processes to ensure effective protein extraction, recovery, and purification (Matoba et al., 2011; Buyel et al., 2015).

It is also important to achieve cost-effective upstream production and downstream processing for economic feasibility. In this context, two important yield parameters at the end of upstream production should be considered: (1) recombinant protein content per unit harvested biomass (unit: g g<sup>−1</sup>) and (2) recombinant protein productivity per unit area–time (unit: g m<sup>−2</sup>/week or month). Higher recombinant protein content is expected to reduce the cost of the downstream processing because a smaller amount of raw biomass would reduce the use of downstream consumables to obtain a given recombinant protein amount (Buyel and Fischer, 2012; Tusé et al., 2014). Meanwhile, recombinant protein productivity per unit area–time is calculated by multiplying recombinant protein content per unit harvested biomass and biomass productivity per unit area–time. Higher recombinant protein productivity can save plant growth area required to obtain a given recombinant protein amount in a given time period. This is beneficial, particularly in closed facilities with a limited batch production scale available compared to open-field production. Collectively, factors that affect recombinant protein content and productivity need to be controlled and optimized. In practice, such factors include genetic (e.g., promoter selection and codon optimization), epigenetic [e.g., posttranscriptional gene silencing suppressor, such as tomato bushy stunt virus p19 (Voinnet et al., 2003)], and environmental factors (e.g., temperature), as well as factors controlling protein stability (e.g., physicochemical/biochemical characteristics and subcellular localization) (Twyman et al., 2013). Compared to the genetic and epigenetic factors and the protein stability, less attention has been paid to the importance of environmental factors on recombinant protein content and productivity.
In this review, therefore, we specifically focus on the effects of environmental factors during the upstream production on recombinant protein content and productivity in *Agrobacterium*-mediated transient expression systems.
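
The relationship between the two yield parameters described above can be written as a short calculation. The following Python sketch is illustrative only: the function names and all numerical inputs are hypothetical and are not taken from any of the cited studies.

```python
def protein_productivity(content_g_per_g, biomass_g_per_m2_week):
    """Recombinant protein productivity per unit area-time (g m^-2 week^-1).

    Product of protein content per unit harvested biomass (g g^-1) and
    biomass productivity per unit area-time (g m^-2 week^-1).
    """
    return content_g_per_g * biomass_g_per_m2_week


def required_growth_area(target_g_per_week, content_g_per_g, biomass_g_per_m2_week):
    """Growth area (m^2) needed to obtain a target protein amount per week."""
    return target_g_per_week / protein_productivity(
        content_g_per_g, biomass_g_per_m2_week
    )


# Hypothetical inputs, chosen only to show the arithmetic.
content = 0.001        # g protein per g harvested biomass
biomass_rate = 500.0   # g biomass per m^2 per week
print(protein_productivity(content, biomass_rate))        # g m^-2 week^-1
print(required_growth_area(10.0, content, biomass_rate))  # m^2
```

Under these assumptions, doubling either the content or the biomass productivity halves the growth area needed for the same weekly output, whereas only a higher content also reduces the biomass entering downstream processing.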

The upstream production process may further be divided temporally and spatially into two segments with vector inoculation in between (Wirz et al., 2012; Klimyuk et al., 2014; Holtz et al., 2015): a preinoculation process for 4–5 weeks and a postinoculation process for 1–2 weeks. Following the preinoculation growth process, plants are inoculated with transgene vectors, which lead to recombinant protein accumulation during the postinoculation process. Although environmental conditions should be controlled throughout the upstream production process to obtain high recombinant protein content and productivity, optimal conditions are likely different between pre- and postinoculation processes, which may also be different from well-studied environment control for high biomass productivity per unit area–time. Environmental factors in the preinoculation process affect plant characteristics such as biomass and morphology. These plant characteristics can affect recombinant protein content and productivity. On the other hand, environmental conditions in the postinoculation process, where transgene expression takes place in plants, should be controlled for recombinant protein accumulation rather than plant growth. Accordingly, we first discuss preinoculation environmental effects on recombinant protein content and productivity in the context of plant characteristics. Then, we discuss postinoculation environmental effects on recombinant protein content and productivity from biological aspects involved in protein biosynthesis and metabolism.

# EFFECTS OF PREINOCULATION ENVIRONMENTAL FACTORS

Plant morphology, such as the ratio of leaf weight to stem weight (leaf:stem ratio), at the time of vector inoculation can potentially affect recombinant protein content per unit harvested biomass. Stems and petioles remain uninfected when plants are infiltrated with *A. tumefaciens* harboring a deconstructed viral vector (Gleba et al., 2014). In such cases, stems are expected to contain less recombinant protein per unit biomass than leaves. Increasing the leaf:stem ratio of plants can thus be beneficial for higher recombinant protein content per unit biomass when harvesting whole plant shoots (leaves and stems).

A lower planting density, for example, can lead to higher recombinant protein content per unit harvested biomass than a high planting density. This is because a lower planting density leads to a higher leaf:stem ratio (Poorter et al., 2012b), although it also leads to lower biomass productivity per unit area–time. Therefore, an optimal planting density should be carefully examined, considering its effect on the leaf:stem ratio and the resultant recombinant protein content as well as productivity. Light environment control using artificial light sources is also important for plant morphology. Using LEDs, Norikane (2015) demonstrated that *Nicotiana benthamiana* plants grown at a ratio of red-light photon flux density (PFD) to far-red-light PFD (R:FR ratio) of 0.7 had a lower leaf:stem ratio than those grown at an R:FR ratio of 1.2. Hence, a light source for the preinoculation plant growth should be carefully selected, considering its effects on the leaf:stem ratio and on recombinant protein content and productivity.

We have demonstrated that optimal environmental conditions for high recombinant protein content can be different from those for high biomass productivity. When *N. benthamiana* plants were grown with nutrient solution at a higher nitrate concentration of 60 mmol L<sup>−</sup><sup>1</sup> for 2 weeks before vector inoculation, recombinant hemagglutinin content per unit leaf biomass was 1.4 times higher than that obtained from plants grown with 12 mmol L<sup>−</sup><sup>1</sup> nitrate (Fujiuchi et al., 2014). However, the higher nitrate concentration also led to a lower leaf weight per plant, which offset the higher recombinant hemagglutinin content per unit leaf biomass and thus resulted in comparable recombinant hemagglutinin accumulation per plant (and hence comparable productivity per unit area–time as well) between the two nitrate concentrations. This result illustrates how a single growth factor can distinctively affect the two aforementioned yield parameters, and also points to the possibility that the effects of multiple environmental factors need to be evaluated to achieve high recombinant protein content and productivity.
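
The offsetting effect in the nitrate example can be made explicit with a one-line calculation. In the Python sketch below, the 1.4-fold content ratio comes from the text; the leaf weight ratio is a hypothetical value chosen only to illustrate how a lower leaf weight can cancel a higher content, and the function name is likewise an assumption.

```python
def relative_accumulation_per_plant(content_ratio, leaf_weight_ratio):
    """Ratio of protein accumulated per plant (treatment / control).

    Accumulation per plant = content per unit leaf biomass x leaf weight
    per plant, so the two treatment/control ratios multiply.
    """
    return content_ratio * leaf_weight_ratio


content_ratio = 1.4      # content at 60 vs 12 mmol/L nitrate, as reported in the text
leaf_weight_ratio = 0.7  # hypothetical leaf weight ratio, not reported in the text
print(round(relative_accumulation_per_plant(content_ratio, leaf_weight_ratio), 2))
```

A result close to 1 corresponds to the comparable accumulation per plant described above.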

Despite the likely impacts of preinoculation environmental conditions on recombinant protein content and productivity, few studies have addressed the issue in detail, leaving an inconclusive assessment on this subject. More studies on preinoculation environmental effects are therefore warranted.

# EFFECTS OF POSTINOCULATION ENVIRONMENTAL FACTORS

Recombinant protein content per unit leaf biomass can be significantly altered by postinoculation environmental factors, including temperature, light intensity, and humidity. Controlling these environmental factors and maintaining appropriate environmental conditions during the postinoculation process would therefore aid in obtaining high recombinant protein content. In the subsequent sections, we summarize our current knowledge on the impacts of temperature, light intensity, and humidity during the postinoculation process on recombinant protein content.

In *Agrobacterium*-mediated transient expression systems, recombinant proteins are accumulated as a result of a series of biological events in a plant. These include T-DNA transfer from *A. tumefaciens* to the plant nuclei, transcription, translation, recombinant protein folding, assembly, maturation, and subcellular localization. In addition, plant host stress responses and protein degradation are also involved in the overall yield. Among these, we shed light on T-DNA transfer and stress responses while discussing the effects of postinoculation environmental factors, since evidence exists for the contribution of these two events to the variation of recombinant protein content at different temperatures. Given that each event has its own specific timing of effect, dynamic environment control, i.e., changing environmental conditions at an appropriate timing for each event, might help increase recombinant protein content. An example of the effectiveness of such dynamic environment control is also presented.

#### Temperature

Several studies have examined the temperature dependence of recombinant protein accumulation by exposing a whole plant, a detached leaf, or calli to a constant temperature for several days after infiltration or coculture with *A. tumefaciens* (**Table 1**). Buyel and Fischer (2012) and Buyel (2013) demonstrated the temperature dependence of the content of 2G12 (a human monoclonal anti-HIV immunoglobulin G antibody) and DsRed (a red fluorescent protein) transiently expressed in young leaf tissue of *Nicotiana tabacum*. At 6 days post inoculation (dpi), the 2G12 content at an air temperature of 21°C was 5- to 10-fold higher than those at 15 and 30°C, while the DsRed content at 25°C was considerably higher than those at 15 and 30°C. Dillen et al. (1997) demonstrated the temperature dependence of transiently expressed β-glucuronidase (GUS) in *N. tabacum* leaves, in which GUS staining of leaves at 19 and 22°C was greater than that at 15 and 25°C, while no staining was observed at 29°C, at 3 dpi. These studies, along with several others on the temperature effect on recombinant protein accumulation in leaves or callus (Kondo et al., 2000; De Clercq et al., 2002; Joh et al., 2005; Cazzonelli and Velten, 2006; Yasmin and Debener, 2010; Matsuda et al., 2012; Moon et al., 2014), suggest that recombinant protein content upon *Agrobacterium*-mediated transient expression generally has its peak at a temperature between 15 and 25°C. Although such an effect may depend on protein and/or plant species, the recombinant protein content appears to decrease sharply when the temperature becomes approximately 5°C higher or lower than the optimal temperature, indicating that temperature is an important environmental factor for high recombinant protein content.
Also, these studies point to an important engineering control principle for plant-based protein factories, whereby temperature monitoring and control for a high degree of spatiotemporal uniformity in the plant growth space are necessary during the postinoculation process to avoid unexpected decreases in recombinant protein content.
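
As a minimal sketch of what such monitoring for spatial uniformity might look like in practice, the following Python function flags sensor positions whose air temperature deviates from the setpoint by more than a chosen tolerance. The function name, sensor labels, readings, and tolerance are all hypothetical; an appropriate tolerance would have to be determined for each protein and host species.

```python
def out_of_range_sensors(readings_c, setpoint_c, tolerance_c):
    """Return the sensor positions whose air temperature deviates from the
    setpoint by more than the allowed tolerance (all values in deg C)."""
    return sorted(
        position
        for position, temperature in readings_c.items()
        if abs(temperature - setpoint_c) > tolerance_c
    )


# Hypothetical readings (deg C) keyed by hypothetical shelf positions.
readings = {"shelf_1_front": 20.3, "shelf_1_back": 21.1, "shelf_3_back": 23.4}
print(out_of_range_sensors(readings, setpoint_c=20.0, tolerance_c=2.0))
```

Running the same check at regular intervals would address the temporal side of uniformity as well.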

The temperature dependence of recombinant protein content may be related to the efficiency of T-DNA transfer and plant host stress responses. The optimal temperature for T-DNA transfer from *A. tumefaciens* is reportedly 18–23°C (Atmakuri and Christie, 2008). Fullner and Nester (1996) assessed the temperature effect on T-DNA transfer. The transfer efficiency at 19°C was higher than that at 15°C, while those at 28 and 31°C were almost undetectable. This indicates that the efficiency of T-DNA transfer has a temperature dependence similar to that of recombinant protein content. On the other hand, plant host stress responses can be induced at high temperatures, possibly causing leaf necrosis. The transient overexpression of the anti-HIV lectin actinohivin in the cytosol using an *Agrobacterium*-mediated viral vector induced severe necrosis in *N. benthamiana* leaves at 27°C, but not at 22°C, at 4–5 dpi (Matoba et al., 2010). Similarly, we have observed necrosis in *N. benthamiana* leaves upon exposure to a postinoculation air temperature of 25°C, which was associated with low recombinant endoplasmic reticulum (ER)-targeted hemagglutinin content, while no necrosis with high hemagglutinin content was observed at 20°C (Matsuda et al., 2012). Necrosis during transient expression may be caused by a hypersensitive response and ER stress, which are likely induced by delivery of an *Agrobacterium*-mediated viral vector and an excessive abundance of misfolded proteins, respectively (Hamorsky et al., 2015). Collectively, host stress responses and necrosis induced at the higher temperature appear to be responsible, at least partly, for the lower recombinant protein content at the higher temperature.

In the postinoculation process in large-scale protein production facilities, static environmental conditions, where environmental variables are maintained almost constant, seem to be employed (Wirz et al., 2012; Holtz et al., 2015). However, air temperature may not necessarily have to be maintained constant. Jung et al. (2015) recently explored the effects of two-phase air temperature control. They observed a fivefold higher content of recombinant xylanase in detached sunflower leaves treated at 20°C for the first 4 days postinoculation followed by 27°C for 2 days than in leaves exposed to a constant temperature of 20°C. Considering that T-DNA transfer is generally completed within a single day after *A. tumefaciens* attachment to plant cells (Virts and Gelvin, 1985; Sykes and Matthysse, 1986), it may be appropriate to treat plants at a relatively low temperature, which is suitable for T-DNA transfer, during the early period of the postinoculation process. It may then be better to shift the temperature to a higher level that is optimal for subsequent recombinant protein accumulation and to maintain it at that level for the remaining period. The result of Jung et al. (2015) illustrates the usefulness of the dynamic control of air temperature as a strategy to improve recombinant protein accumulation.
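
A two-phase regime of this kind can be expressed as a simple setpoint schedule. The Python sketch below uses as defaults the temperatures and shift time described above for Jung et al. (2015); the function name, the instantaneous step change, and the treatment of the shift time as a tunable parameter are assumptions made for illustration.

```python
def two_phase_setpoint_c(days_post_inoculation, shift_day=4.0,
                         early_temp_c=20.0, late_temp_c=27.0):
    """Air temperature setpoint (deg C) for a two-phase postinoculation regime.

    The early temperature applies before shift_day and the late temperature
    from shift_day onward. Negative times are rejected.
    """
    if days_post_inoculation < 0:
        raise ValueError("days_post_inoculation must be non-negative")
    if days_post_inoculation < shift_day:
        return early_temp_c
    return late_temp_c


for day in (0, 2, 4, 5):
    print(day, two_phase_setpoint_c(day))
```

The optimal shift time and temperatures are likely to differ among proteins and host species, so the defaults should be regarded as a starting point rather than a recommendation.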

#### Light Intensity

Several studies have examined the effect of light intensity during the postinoculation process on recombinant protein content (**Table 1**). There are reports indicating that exposure to light is essential. For example, Cazzonelli and Velten (2006) observed that the recombinant luciferase content in *N. tabacum* leaves treated in the dark was 20% of that in leaves treated at a photosynthetic PFD (PPFD) of 80–100 μmol m<sup>−</sup><sup>2</sup> s<sup>−</sup><sup>1</sup>. Similarly, De Clercq et al. (2002) demonstrated that recombinant GUS was highly expressed in tepary bean callus at a PPFD of 20 μmol m<sup>−</sup><sup>2</sup> s<sup>−</sup><sup>1</sup>, while almost no expression was detected in calli treated in the dark. Zambre et al. (2003) supported the findings of De Clercq et al. (2002) and concluded that the negative effect of the dark treatment on transient expression was attributable to neither transcription nor posttranscriptional events but was associated with lower T-DNA transfer efficiency.

In contrast, some studies have shown that exposure to light has no positive effect on recombinant protein accumulation. McDonald et al. (2014) demonstrated that the abundance of recombinant endoglucanase in detached *N. benthamiana* leaves treated in the dark was comparable to that in leaves treated in the light. Similar results were reported for detached lettuce leaves (Joh et al., 2005), *N. tabacum* seedlings after wounding and spraying with *A. tumefaciens* suspension (Escudero and Hohn, 1997), and *N. benthamiana* cells cocultured with *A. tumefaciens* (Larsen, 2011).

Some researchers have investigated the effect of light intensity on recombinant protein content. We reported that there was no significant difference in recombinant hemagglutinin content between *N. benthamiana* plants treated at PPFDs of 100 and 300 μmol m<sup>−</sup><sup>2</sup> s<sup>−</sup><sup>1</sup> (Matsuda et al., 2012). Patil and Fauquet (2015) also reported no significant effect of PPFD on transiently expressed recombinant green fluorescent protein content in *N. benthamiana*.

Taken together, although there still is room for investigation on the necessity of lighting during the postinoculation process, one conclusion at present may be that exposure to light does not negatively affect recombinant protein accumulation. Even if exposure to light is essential, the intensity may not have to be high during the postinoculation process for recombinant protein accumulation.

#### Humidity

There is limited information currently available about the effects of humidity on transient protein production in plants, as few studies have focused on this subject thus far (**Table 1**). Nevertheless, we have recently shown that controlling postinoculation humidity is particularly important for detached leaf-based transient expression in *N. benthamiana*. When detached leaves are used as an expression platform, the leaves must be treated at low relative humidity for a few hours immediately after infiltration with *A. tumefaciens* suspension to remove the water occupying the intercellular space. The leaves are then incubated until harvest at high relative humidity to prevent wilting (Plesha, 2008). In this detached leaf system, we found that the removal of residual bacterial suspension water immediately after infiltration is the key factor for recombinant protein content (Fujiuchi et al., 2016). Recombinant hemagglutinin content was dramatically increased upon almost complete removal of suspension water occupying the leaf intercellular space by placing leaves at low relative humidity (approximately 25%). A similar effect might be observed when whole *N. benthamiana* plants are incubated at a high relative humidity during the postinoculation process, which should be examined in future studies. Thus, the effectiveness of humidity control during the postinoculation process is worth evaluating in whole plant-based systems.

# CONCLUSION AND FUTURE DIRECTIONS

There is growing evidence that optimization of environmental conditions is crucial for efficient recombinant protein production in *Agrobacterium*-mediated transient expression systems. Given the significant effects of the environment on recombinant protein content, we encourage researchers in this research field to provide detailed documentation of upstream environmental conditions in their publications. For example, given the significance of postinoculation environmental factors, at least air temperature, PPFD, and relative humidity should be measured and described. Information on plant age and position of harvested leaves for protein quantification is also useful. In addition, documentation of planting density, light quality, and nutrient supply during the preinoculation process is recommended. To optimize preinoculation environmental conditions, further investigations of their effects on plant morphology, i.e., biomass distribution between leaves and stems, and on the resultant recombinant protein content and productivity are necessary. Other factors to be considered in general plant experiments are described by Poorter et al. (2012a). In addition to assessing pre- and postinoculation environmental effects individually, further studies should investigate interactive effects of multiple environmental factors to identify optimal environmental conditions for each protein of interest. Given the inherent variability of transient expression systems using whole plants, a careful experimental design encompassing a large sample size with a rigorous statistical analysis is critical to reveal the impacts of environmental factors. For production using a large number of plants in a growth facility, attention should be paid to the possible variability of each environmental factor within the plant growth space, since it may cause inconsistent recombinant protein content in individual plants and compromise batch-to-batch reproducibility. Ultimately, close monitoring and tight control of spatiotemporal environment variation will be critical to stably obtain high recombinant protein content and productivity and enhance the utility of plant-based transient expression systems as recombinant protein factories.
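
One way to make such documentation routine is to keep a structured record for each experiment. The following Python sketch lists the items recommended above as fields of a record and reports which of them have not been filled in; the class name, field names, units, and example values are hypothetical and do not represent an established reporting standard.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class UpstreamConditions:
    """Environmental metadata for one transient expression experiment.

    Field names are hypothetical; None means "not measured or not reported".
    """
    # Postinoculation conditions
    air_temperature_c: Optional[float] = None
    ppfd_umol_m2_s: Optional[float] = None
    relative_humidity_pct: Optional[float] = None
    # Plant material
    plant_age_days: Optional[int] = None
    leaf_position: Optional[str] = None
    # Preinoculation conditions
    planting_density_per_m2: Optional[float] = None
    light_quality: Optional[str] = None
    nutrient_supply: Optional[str] = None

    def missing_fields(self):
        """Names of the fields that have not been reported."""
        return [name for name, value in asdict(self).items() if value is None]


# Hypothetical example values.
record = UpstreamConditions(air_temperature_c=20.0, ppfd_umol_m2_s=100.0,
                            relative_humidity_pct=70.0, plant_age_days=35)
print(record.missing_fields())
```

A record of this kind could accompany each batch and be checked for completeness before results are reported.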

## AUTHOR CONTRIBUTIONS

All authors conceived and designed the review. NF collated the literature. NF and RM wrote the manuscript. NM and RM critically revised the manuscript. All authors approved the final manuscript and agreed to be accountable for all aspects of the manuscript.

#### ACKNOWLEDGMENTS

We thank Professors Kazuhiro Fujiwara (University of Tokyo) and Chieri Kubota (University of Arizona) for valuable discussion and encouragement. We also thank Adam Husk (University of Louisville Owensboro Cancer Research Program) for proofreading the manuscript. This work was financially supported in part by the Science and Technology Research Promotion Program for Agriculture, Forestry, Fisheries, and Food Industry (No. 25025A) of the Ministry of Agriculture, Forestry, and Fisheries, Japan to RM, and by JSPS KAKENHI Grant Number 26712021 to RM.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Fujiuchi, Matoba and Matsuda. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Tackling Unwanted Proteolysis in Plant Production Hosts Used for Molecular Farming

#### Manoj K. Mandal, Houtan Ahvari, Stefan Schillberg and Andreas Schiermeyer\*

*Department of Plant Biotechnology, Fraunhofer Institute for Molecular Biology and Applied Ecology, Aachen, Germany*

Although the field of molecular farming has significantly matured over the last years, some obstacles still need to be resolved. A major limiting factor for a broader application of plant hosts for the production of valuable recombinant proteins is the low yield of intact recombinant proteins. These low yields are at least in part due to the action of endogenous plant proteases on the foreign recombinant proteins. This mini review will present the current knowledge of the proteolytic enzymes involved in the degradation of different target proteins and strategies that are applied to suppress undesirable proteolytic activities in order to safeguard recombinant proteins during the production process.

#### Edited by:

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### Reviewed by:

*Frank Sainsbury, The University of Queensland, Australia Philippe V. Jutras, Laval University, Canada Mathew Paul, St George's University of London, UK*

\*Correspondence:

*Andreas Schiermeyer andreas.schiermeyer@ime.fraunhofer.de*

#### Specialty section:

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

Received: *14 December 2015* Accepted: *19 February 2016* Published: *08 March 2016*

#### Citation:

*Mandal MK, Ahvari H, Schillberg S and Schiermeyer A (2016) Tackling Unwanted Proteolysis in Plant Production Hosts Used for Molecular Farming. Front. Plant Sci. 7:267. doi: 10.3389/fpls.2016.00267*

Keywords: antibodies, biopharmaceuticals, degradation, protease inhibitors, proteases, recombinant proteins, tobacco

# INTRODUCTION

As the field of plant molecular farming has evolved over the last two decades, many obstacles have been overcome, leading to the first approval of a biopharmaceutical protein for human therapy in 2012 and additional candidates being evaluated in clinical trials (Paul and Ma, 2011). Although plant cells have been successfully engineered to humanize the N-glycan modification of recombinant proteins (Castilho and Steinkellner, 2012) and different protocols have been developed for cGMP-compliant production (Fischer et al., 2012), an important issue limiting the broader adoption of plant molecular farming remains: the relatively low yield of recombinant proteins. Plant cells, especially the lytic vacuole and the apoplast, are rich in proteolytic enzymes of diverse classes (Goulet et al., 2012). Interestingly, the first approved biopharmaceutical protein made from plant cells is a lysosomal acidic beta-glucocerebrosidase, taliglucerase alfa, a human enzyme that is used in enzyme replacement therapy for Gaucher patients (Shaaltiel et al., 2007) and has evolved to withstand the harsh hydrolytic environment of the lysosome. Other target molecules like full-size IgG antibodies have frequently been reported to suffer from proteolytic degradation (Donini et al., 2015) irrespective of the plant system that has been used for production. In addition to reduced yields of the target protein, proteolytic processing might also lead to the formation of degradation products that have physicochemical properties very similar to those of the intact target protein and are therefore difficult to remove during downstream processing. Several strategies are currently being evaluated to safeguard the target protein against degradation, which should ultimately lead to the development of improved plant host systems.

**Abbreviations:** ACT, antichymotrypsin; CDI, cathepsin D inhibitor; DFP, diisopropylfluorophosphate; DSPA, Desmodus rotundus plasminogen activator; ERT, enzyme replacement therapy; Fab, fragment antigen-binding; Fc, fragment crystallizable; GM-CSF, granulocyte-macrophage colony-stimulating factor; HC, heavy chain; Ig, immunoglobulin; IL-10, interleukin-10; mAb, monoclonal antibody; PLCP, papain-like cysteine protease; PMSF, phenylmethanesulfonyl fluoride; PVP, polyvinylpyrrolidone.

#### PROTEOLYTIC DEGRADATION OF RECOMBINANT PROTEINS

Many candidate biopharmaceutical proteins such as plasminogen activators (Schiermeyer et al., 2005), cytokines (Sirko et al., 2011), human serum albumin (Sun et al., 2011), and monoclonal antibodies (Stevens et al., 2000; Sharp and Doran, 2001; Muynck et al., 2009) have been shown to undergo proteolytic processing to different degrees when they are produced in plant cells. The following section presents examples of the proteolytic degradation of recombinant proteins produced in plant cells and the proteolytic enzymes involved.

Monoclonal antibodies currently represent the largest class of biopharmaceuticals and thus also represent attractive target molecules for plant production platforms. However, plant-produced full-length antibodies often show degradation of their heavy chains, whereas the light chains usually remain intact. It has been known for a long time that plant cysteine proteases of the papain family are able to cleave immunoglobulins within the hinge region of their heavy chains to yield Fab and Fc fragments (Porter, 1959). Other proteases cleave immunoglobulins in the same region but at slightly different sites (Gorevic et al., 1985), which indicates that the cleavage depends not only on specific sequence recognition sites but also on the open and accessible conformation of the hinge region and, in some cases, other solvent-exposed loops. It is therefore not surprising that several research groups have reported the processing of plant-produced recombinant IgG antibodies into Fc, Fab, F(ab')<sub>2</sub> and other cleavage products (Hehle et al., 2014). Approximately 90% of the heavy chain of the human H10 IgG1λ monoclonal antibody was cleaved inside the cells of tobacco plants (Villani et al., 2009). The murine IgG1 monoclonal antibody MGR48 was cleaved in the hinge region under acidic conditions when it was spiked into crude leaf extracts from Nicotiana tabacum (Stevens et al., 2000). The authors of that study also noted that the proteolytic activity was higher in older leaves than in younger leaves. A systematic analysis of the murine antibody (IgG1) Guy's 13 produced in the tobacco production systems hairy roots, shooty teratoma, and suspension cells indicated that similar degradation products could be identified in all systems (Sharp and Doran, 2001). That study also established that the proteolytic processing occurs along the secretory pathway of the cell and in the apoplast.
Similarly, degradation products of the chimeric human/rat IgG1κ LO-BM2 antibody heavy chain were identified in the intercellular wash fluid of transgenic N. tabacum plants and the spent cell culture medium of transgenic tobacco BY-2 suspension cells (Muynck et al., 2009). Most reports on the production of immunoglobulins in plants have been focused on the IgG1 isotype. However, for certain applications, other isotypes might also be of interest (Salfeld, 2007). A recent publication therefore compared the stability of human IgG1, IgG2, and IgG4 monoclonal antibodies in the spent culture medium of tobacco BY-2 suspension cells (Magy et al., 2014). This analysis revealed a significantly higher accumulation of the IgG1 isotype in the culture medium (10 mg/L) compared with the IgG2 (5.4 mg/L) and IgG4 (0.9 mg/L) isotypes. However, when the same set of antibodies was expressed in Arabidopsis thaliana suspension cells, no significant differences in accumulation were recognized. The accumulation of all isotypes was approximately 3 mg/L in the culture medium.

Because plant genomes encode several hundred proteolytic enzymes (van der Hoorn, 2008), it is challenging to identify the protease(s) that are responsible for the degradation of a given recombinant protein. It has been demonstrated that the proteolytic processing of the heavy chain of the human (IgG1κ) anti-HIV antibody 2F5 was effectively inhibited by phenylmethanesulfonyl fluoride (PMSF) or diisopropylfluorophosphate (DFP), two irreversible inhibitors of serine proteases (Mandal et al., 2014; Niemer et al., 2014). Similarly, it has been shown that the degradation of human IgG3 antibodies spiked into spent culture medium from tobacco BY-2 cells and other recombinant proteins, such as human α1-antitrypsin or BSA, spiked into the intercellular washing fluid of tobacco plants was partially inhibited by the addition of PMSF (Delannoy et al., 2008; Navarre et al., 2012; Castilho et al., 2014).

Because most pharmaceutical proteins are glycoproteins, their recombinant counterparts are targeted to the secretory route to obtain the desired glycan modification in the ER, Golgi apparatus and downstream compartments. Therefore, knowledge of secreted proteases and those residing in cell compartments along the secretory pathway is of critical importance to develop suitable strategies for the stabilization of recombinant proteins. Mass spectrometry-based secretome analysis of tobacco BY-2 spent culture medium (Navarre et al., 2012), hydroponic culture medium of tobacco plants (Madeira et al., 2016; Wendlandt et al., 2016) and intercellular washing fluid of N. benthamiana leaves (Goulet et al., 2010a) revealed the presence of subtilisin-like proteases, serine carboxypeptidases, papain-like cysteine proteases (PLCPs) and homologs of the CND41 aspartic protease, belonging to the S8, S10, C1 and A1 families of proteases, respectively, according to the MEROPS classification (Rawlings et al., 2012). A proteomic survey of the spent culture medium from rice cells revealed the secretion of PLCPs, EP3A, and Rep-1 into the culture medium (Kim et al., 2008a). A specific member of the PLCP family, CysP6 from N. tabacum, has been implicated in the degradation of recombinant human interleukin-10 (IL-10) within the endoplasmic reticulum of the cell (Duwadi et al., 2015). A legumain-like cysteine protease belonging to the C13 family according to the MEROPS classification was most likely responsible for the degradation of recombinant equistatin produced in the leaf tissue of Solanum tuberosum (Outchkourov et al., 2003). The degradation of a recombinant plasminogen activator (DSPAα1) produced in tobacco cells has been shown to be reduced in the presence of EDTA, indicating the involvement of a matrix-metalloprotease in the degradation of DSPAα1 (Schiermeyer et al., 2005; Mandal et al., 2010).
In vitro studies using recombinant proteolytic enzymes confirmed that two serine proteases, subtilisin (S8 family) and chymotrypsin (S1 family), and two PLCPs (C1 family), cathepsin B and cathepsin L, were able to cleave the 2F5 antibody HC within its CDR-H3 domain (Niemer et al., 2014).

#### STRATEGIES TO COMBAT PROTEOLYSIS

During the past two decades various strategies have been developed and tested to reduce the proteolytic activity in a variety of plant expression systems to increase accumulation levels of recombinant biopharmaceuticals. The following sections describe these efforts in more detail and an overview of the different approaches to reduce the proteolytic activity in plant tissue and cell cultures is provided in **Table 1**.

#### SUPPLEMENTATION OF STABILIZING AGENTS

The use of suspension cultures for plant cells and organs to produce recombinant proteins enables the addition of protein-stabilizing agents to the culture medium. It has been demonstrated that the heavy chain of a murine IgG1 antibody could be stabilized in the spent culture medium of tobacco hairy roots by the addition of gelatin or polyvinylpyrrolidone (PVP), thereby increasing the production level up to nine-fold (Wongsamuth and Doran, 1997). Similarly, the accumulation level of the anti-vitronectin human IgG1λ mAb M12 in the culture medium of hairy roots increased two-fold when PVP was added to the culture medium (Häkkinen et al., 2014). The addition of PVP to the culture medium of transgenic tobacco NT-1 cells expressing a murine IgG1 antibody even led to a 35-fold increase in antibody heavy chain accumulation in the culture medium (LaCount et al., 1997). By supplementing the culture medium of transgenic tobacco NT-1 cells with bovine serum albumin, a two-fold increase in the accumulation of extracellular human GM-CSF was achieved (James et al., 2000). Likewise, the addition of human serum albumin to the culture medium of transgenic moss (Physcomitrella patens) cells expressing the human vascular endothelial growth factor enhanced its production levels three-fold (Baur et al., 2005). Whether the above-mentioned substances exert an inhibitory effect on proteolytic enzymes has not yet been analyzed, but the proteinaceous substances might act as an alternative substrate for extracellular proteases, thereby stabilizing the protein of interest.

#### CO-EXPRESSION OF PROTEASE INHIBITORS

Different strategies have been tested to reduce the unwanted proteolysis of recombinant proteins in plant cells, such as the co-expression of protease inhibitors together with the protein of interest. In particular, protease inhibitors with specificity for cysteine, serine or aspartic proteases have been deployed for this purpose. It has been reported that co-expression of the Kunitz-type (I3 family according to the MEROPS classification) cathepsin D inhibitor (SlCDI) from tomato with human α1-antichymotrypsin (α1-ACT) stabilizes the latter and leads to a 2.5-fold increase in its accumulation in potato leaves (Goulet et al., 2010b). In a follow-up study, the transient expression of SlCDI or tomato cystatin SlCYS9 (I25 family) was investigated for the potential to stabilize the murine C5-1 IgG monoclonal antibody in N. benthamiana. Whereas the expression of both inhibitors led to increased accumulation of the antibody light chain, higher production of the heavy chain was documented only upon co-expression of SlCDI (Goulet et al., 2012). In a recent report, another member of the tomato cystatin family, SlCYS8, was used for transient co-expression with the C5-1 antibody in N. benthamiana. The accumulation of the C5-1 antibody increased by approximately 40% at the whole-plant scale. However, the stabilizing effect of this cystatin was confined to the younger leaves of the plant. In older leaves, the SlCYS8 levels were considerably lower and increased PLCP activity was documented (Robert et al., 2013). Constitutive expression of the rice cysteine protease inhibitor oryzacystatin-I resulted in an increased accumulation and higher activity of the model protein glutathione reductase in tobacco plants compared with non-transgenic controls (Pillay et al., 2012). The co-expression of a synthetic construct containing trypsin and chymotrypsin inhibitor domains from the proteinase inhibitor II (I20 family) gene from N. alata with recombinant human granulocyte-macrophage colony-stimulating factor (hGM-CSF) led to a two-fold increase in the accumulation of secreted hGM-CSF in rice suspension cultures (Kim et al., 2008b). Co-secretion of the soybean Bowman-Birk serine protease inhibitor (I12 family) together with recombinant human single-chain IgG1 or full-size IgG4 antibodies from the roots of transgenic tobacco plants increased the accumulation of the antibodies 2- to 2.5-fold (Komarnytsky et al., 2006).

#### GENE KNOCKDOWN

Another promising strategy to reduce the proteolytic degradation of recombinant proteins is to knock down the expression of protease-encoding genes. This strategy was introduced by Kim et al. (2008a) to suppress the expression of the cysteine protease gene Rep-1 by RNAi in rice suspension cells to improve the production of hGM-CSF. In that report, the authors used the rice amylase 3D promoter to drive the expression of hGM-CSF upon induction by sugar starvation. However, sugar starvation resulted in an accumulation of a cysteine protease from the C1A family encoded by the Rep-1 gene. The expression of Rep-1 was suppressed by post-transcriptional gene silencing using an intron-containing self-complementary hairpin RNA (ihpRNA) construct specific for Rep-1. This strategy resulted in a two-fold higher accumulation of hGM-CSF in the rice cell culture medium compared with its expression in a non-silenced cell line. A similar approach has been used for intact plants, where the RNAi-mediated silencing of another cysteine protease-encoding gene, CysP6, improved the accumulation of recombinant human IL-10 in tobacco leaves by approximately 1.6-fold (Duwadi et al., 2015). A somewhat broader protease silencing approach was followed by simultaneous silencing of four protease genes (NtAP, NtCP, NtMMP1, and NtSP) coding for proteases from four catalytic classes (aspartic, cysteine, metallo- and serine proteases) through the expression of the corresponding antisense sequences in tobacco BY-2 cells (Mandal et al., 2014). The study showed that the culture medium of the antisense RNA-expressing BY-2 cells had a lower level of total proteolytic activity than that of wild-type BY-2 cells. When this transgenic BY-2 cell line was used to produce a recombinant full-length IgG1κ antibody, 2F5, it resulted in a four-fold higher accumulation of the intact antibody heavy chain compared with wild-type cells expressing the same antibody.

#### TABLE 1 | Strategies to reduce proteolysis in plant tissues and cell cultures.

With the introduction of different gene targeting strategies based on sequence-specific nucleases, it is now possible to disrupt any protease gene to completely knock out its activity (Fichtner et al., 2014). Although this technology has not yet been applied to protease genes, targeting has been used to knock out two α(1,3)-fucosyltransferases and two β(1,2)-xylosyltransferases in N. benthamiana to engineer plants that are devoid of plant-specific N-glycosylation patterns (Li et al., 2016). It is therefore only a matter of time before this technology is applied to generate plants in which specific protease genes are disrupted.

#### SUBCELLULAR TARGETING AND FUSION PROTEINS

To protect recombinant proteins from degradation in the apoplast or vacuole, targeting strategies have been developed to sequester the target protein from these hydrolytic cellular compartments. The retention of recombinant proteins in the ER or ER-derived structures has been proven to be particularly beneficial (Conrad and Fiedler, 1998). As plant seeds have evolved to store proteins in large quantities, a fusion strategy using the maize seed storage protein, γ-zein, has been developed. When the N-terminal Zera (γ-zein ER-accumulating) domain was fused to target proteins, the proteins accumulated in ER-derived protein bodies. The fusion of the Zera domain with the subunit vaccine F1-V from Yersinia pestis led to the formation of protein bodies and a three-fold higher accumulation of the fusion protein in tobacco NT-1 suspension cells compared to F1-V alone (Alvarez et al., 2010). Similar results were obtained in transiently transformed N. benthamiana and stably transformed alfalfa (Medicago sativa) plants. Likewise, the fusion of target proteins with hydrophobin I from Trichoderma reesei facilitated the formation of protein bodies in N. tabacum and N. benthamiana and increased the product yields (Joensuu et al., 2010; Gutiérrez et al., 2013).

In addition to the co-expression strategy with protease inhibitors described above, the tomato cystatin SlCYS8 has been used to produce a fusion protein with human α1-ACT. This fusion protein accumulated at up to 25-fold higher levels compared with free α1-ACT (Sainsbury et al., 2013). However, in this case, the stabilizing effect was shown to be independent of the inhibition of cysteine proteases, as a fusion with a mutant, inactive, SlCYS8 protein also displayed a similar stabilizing activity. The authors therefore speculated that the fusion with SlCYS8 stabilizes the tertiary structure of α1-ACT and thereby prevents its attack by hydrolytic enzymes.

#### CONCLUSION


Based on the above-discussed strategies that can be used to prevent the proteolytic degradation of recombinant proteins, it is clear that there is no "magic bullet" that can stabilize all target proteins. Instead, an individual strategy has to be devised for each recombinant protein of interest. However, the tools described above should provide a suitable selection of procedures to tackle this important issue.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

We gratefully acknowledge the financial support provided by the Federal Ministry of Education and Research (BMBF) within the ERA-IB 3 program (PRODuCE project, 031A219A).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Mandal, Ahvari, Schillberg and Schiermeyer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# 5′ and 3′ Untranslated Regions Strongly Enhance Performance of Geminiviral Replicons in *Nicotiana benthamiana* Leaves

#### *Andrew G. Diamos, Sun H. Rosenthal and Hugh S. Mason\**

*Center for Infectious Diseases and Vaccinology, Biodesign Institute, and School of Life Sciences, Arizona State University, Tempe, AZ, USA*

#### *Edited by:*

*Edward Rybicki, University of Cape Town, South Africa*

#### *Reviewed by:*

*F. Murilo Zerbini, Universidade Federal de Viçosa, Brazil Guy L. Regnard, University of Cape Town, South Africa*

> *\*Correspondence: Hugh S. Mason hugh.mason@asu.edu*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 15 November 2015 Accepted: 05 February 2016 Published: 24 February 2016*

#### *Citation:*

*Diamos AG, Rosenthal SH and Mason HS (2016) 5′ and 3′ Untranslated Regions Strongly Enhance Performance of Geminiviral Replicons in Nicotiana benthamiana Leaves. Front. Plant Sci. 7:200. doi: 10.3389/fpls.2016.00200*

We previously reported a recombinant protein production system based on a geminivirus replicon that yields high levels of vaccine antigens and monoclonal antibodies in plants. The bean yellow dwarf virus (BeYDV) replicon generates massive amounts of DNA copies, which engage the plant transcription machinery. However, we noticed a disparity between transcript level and protein production, suggesting that mRNAs could be more efficiently utilized. In this study, we systematically evaluated genetic elements from human, viral, and plant sources for their potential to improve the BeYDV system. The tobacco extensin terminator enhanced transcript accumulation and protein production compared to other commonly used terminators, indicating that efficient transcript processing plays an important role in recombinant protein production. Evaluation of human-derived 5′ untranslated regions (UTRs) indicated that many provided high levels of protein production, supporting their cross-kingdom function. Among the viral 5′ UTRs tested, we found the greatest enhancement with the tobacco mosaic virus omega leader. An analysis of the 5′ UTRs from the *Arabidopsis thaliana* and *Nicotiana benthamiana* photosystem I K genes found that they were highly active when truncated to include only the near upstream region, providing a dramatic enhancement of transgene production that exceeded that of the tobacco mosaic virus omega leader. The tobacco Rb7 matrix attachment region inserted downstream from the gene of interest provided significant enhancement, which was correlated with a reduction in plant cell death. Evaluation of *Agrobacterium* strains found that EHA105 enhanced protein production and reduced cell death compared to LBA4301 and GV3101. We used these improvements to produce Norwalk virus capsid protein at >20% total soluble protein, corresponding to 1.8 mg/g leaf fresh weight, more than twice the highest level ever reported in a plant system. We also produced the monoclonal antibody rituximab at 1 mg/g leaf fresh weight.

Keywords: geminivirus, *Nicotiana benthamiana*, 5′ untranslated regions, 3′ untranslated regions, monoclonal antibody, transient expression, virus-like particle

# INTRODUCTION

Recombinant protein production systems have become an integral part of medicine, industry, and research. Biopharmaceutical proteins, including monoclonal antibodies, enzymes, growth factors, and other biologics, are the largest and fastest growing sector of all pharmaceuticals (Butler and Meneses-Acosta, 2012). Nearly all of these recombinant proteins are made with traditional bioreactors using mammalian, insect, or microbial cell cultures. In recent years, plant systems have been extensively explored as alternative expression systems that offer safety, cost-effectiveness, and scalability (Huang et al., 2009; Thuenemann et al., 2013a; Klimyuk et al., 2014; Mortimer et al., 2015). The potential of plant-based systems has been demonstrated by the approval of the first plant-derived therapeutic for Gaucher's disease and by the advancement of many plant-made biologics to late-stage clinical development (Gleba et al., 2014). However, the economic feasibility of plant-based systems is strongly yield dependent, and thus, methods of increasing transgene expression are crucial for the success of plants as a recombinant protein production platform.

We and others have previously reported the potential for viral vectors based on bean yellow dwarf virus (BeYDV) to be used for biopharmaceutical production. In this system, the BeYDV replication elements are used to amplify the genes of interest to high copy number in the plant cell nucleus in the form of circular DNA replicons. These replicons utilize the nuclear transcription machinery, leading to the production of large amounts of recombinant protein that is dependent on replication (Huang et al., 2009, 2010; Regnard et al., 2010). Due to the noncompeting nature of BeYDV replicons, multiple proteins can be produced in the same cell from the same vector. This contrasts with many RNA virus systems, where the coinfiltration of two different vectors based on the same virus backbone results in one vector being preferentially amplified in a single cell, thus inhibiting the coproduction of multiple proteins in the same cell. This problem has been partially addressed by the identification of TMV and PVX as non-competing viruses for the production of proteins with two heterosubunits (Giritch et al., 2006); however, this system is incapable of producing proteins with more than two heterosubunits. In the BeYDV system, there is presently no known limit to the size or number of proteins that are capable of being efficiently produced (Chen et al., 2011). Additionally, the host range of BeYDV allows the use of these vectors in many dicot plant species, such as tobacco and lettuce (Lai et al., 2012). *Nicotiana* species are the most widely used plants for recombinant protein production due to their susceptibility to virus infection, ease of vacuum infiltration, and high biomass (Gleba et al., 2014).

While BeYDV vectors strongly enhance gene expression at the level of transcription, replicon amplification greatly exceeds the enhancement of protein accumulation (Huang et al., 2009, 2010; Regnard et al., 2010). Moreover, for mRNA transcripts to be efficiently utilized, the interplay of multiple post-transcriptional cellular processes is required, many of which are controlled by the regions upstream and downstream of the gene coding sequence. The 5′ untranslated region (UTR) plays an important role in optimizing transgene production by competing with cellular transcripts for translation initiation factors and ribosomes, increasing mRNA half-life by minimizing mRNA decay or post-transcriptional gene silencing, and avoiding deleterious interactions with regulatory proteins or inhibitory RNA secondary structures (Chiba and Green, 2009; Moore and Proudfoot, 2009; Jackson et al., 2010).

The 5′ UTR from the genomic RNA of tobacco mosaic virus, known as the omega leader, is one of the most well-studied enhancers of translation (Gallie and Walbot, 1992). Several other viral 5′ UTRs have been found to greatly enhance transgene production in many plant systems, including those from alfalfa mosaic virus (AMV; Gehrke et al., 1983), tobacco etch virus (Carrington and Freed, 1990), and pea seed-borne mosaic virus (Nicolaisen et al., 1992). Many RNA viruses, such as barley yellow dwarf virus (BYDV), also have 3′ UTRs that contain 3′ cap-independent translation enhancers, which enhance reporter production in tobacco and oat protoplasts (Fan et al., 2012). The combination of the 5′ and 3′ UTRs from cowpea mosaic virus improved protein production in transient expression assays using *Nicotiana benthamiana*, likely due to translational enhancement (Sainsbury et al., 2009). Two native plant 5′ UTRs were identified that improved transgene production at levels comparable to viral 5′ UTRs in transgenic cotton and tobacco (Agarwal et al., 2014). Additionally, a synthetic 5′ UTR was reported to enhance protein production at a level similar to the TMV 5′ UTR in transgenic tobacco and cotton (Kanoria and Burma, 2012).

Genetic elements downstream from the gene of interest also play a crucial role in optimizing protein production. Proper transcript termination and polyadenylation are necessary for nuclear export, mRNA stability, efficient translation, and prevention of gene silencing (Luo and Chen, 2007; Moore and Proudfoot, 2009). Several terminators have been investigated for their potential to enhance protein production in plants. The 3′ UTR from the potato *pinII* gene was found to enhance hepatitis B virus surface antigen 10–50-fold in transgenic potato compared to the *Agrobacterium*-derived nopaline synthase terminator (Richter et al., 2000). Combining the nopaline synthase terminator with the 35S terminator from cauliflower mosaic virus resulted in a 5–65-fold enhancement of yellow fluorescent protein production compared to the 35S terminator alone (Beyene et al., 2011).

Additionally, chromatin scaffold/matrix attachment regions (MARs) have been explored as genetic elements capable of enhancing transgene production in plant systems. MARs are AT-rich regions that are thought to be involved in higher-order chromatin structure and that preferentially associate with the nuclear matrix, a complex cellular structure with many proposed roles (Liebich et al., 2002; Calikowski et al., 2003; Halweg et al., 2005). Experiments in whole plants and plant cell cultures have shown that the presence of MARs can enhance transcription of flanking genes. The tobacco Rb7 MAR also increased the proportion of plant transformants expressing a transgene (Halweg et al., 2005). Furthermore, MARs have been implicated in the reduction of transgene silencing (Mlynarova et al., 2003). The tobacco TM6 MAR was shown to reduce repressive DNA methylation in flanking promoter regions and enhance recombinant protein production in transgenic tobacco (Ji et al., 2013). An expression vector based on a mild strain of BeYDV that contained the Rb7 MAR has been previously reported, though a comparison to a vector without the MAR was not made (Regnard et al., 2010). The ability of MARs to enhance protein production in transient expression systems has not been thoroughly investigated.

*Agrobacterium*-mediated T-DNA transfer (reviewed in McCullen and Binns, 2006) is the preferred method of gene delivery in plant transient expression systems (Chen and Lai, 2015). However, *Agrobacterium* is a plant pathogen that has complex effects on infiltrated leaf tissues and often elicits a cell death response (Ditt et al., 2001; Veena et al., 2003). Many studies have found variable effects of different *Agrobacterium* strains, depending on the plant species and system used. One study found that strain GV3101 provided higher transgene expression in *N. benthamiana* and *N. excelsiana* than strains LBA4404, C58C1, at6, at10, at77 and A4 (Shamloul et al., 2014). Additionally, *Agrobacterium* strains vary greatly in their T-DNA transfer efficiency. Supervirulent strains based on strain A281, such as EHA105, were shown to overexpress *virG*, a transcriptional activator that regulates *vir* gene expression (Jin et al., 1987). Constitutively activated *virG* mutants (Gao et al., 2006) have been used to increase T-DNA transfer efficiency, even when supplied on a separate plasmid (van der Fits et al., 2000). A mutant form of *virD2* was found to enhance gene delivery to tobacco cells (Reavy et al., 2007). These studies suggest there is potential to improve *Agrobacterium* T-DNA transfer and minimize deleterious plant cell interactions.

In the present study, we investigated the potential for diverse genetic elements to enhance protein production using BeYDV vectors. We show that optimizing the 5′ UTR and 3′ transcription terminator region substantially enhances the production of GFP, Norwalk virus capsid protein (NVCP), and the monoclonal antibody rituximab. Further, we demonstrate the potential for a MAR to reduce cell death and enhance protein production in a transient expression system. We also show that the choice of *Agrobacterium* strain can play an important role in plant cell death and recombinant protein yield. Using these optimizations, we have achieved yields of vaccine antigens and monoclonal antibodies equal to or greater than the highest levels ever reported in plant systems.

# MATERIALS AND METHODS

#### Vector Construction

#### Geminiviral Replicon with colE1 Origin of Replication

We constructed a T-DNA backbone vector containing the colE1 origin to enable high-copy replication of plasmids in *Escherichia coli*. The T-DNA vector pGPTV-Kan (Becker et al., 1992) was digested with *Bgl*II and the vector fragment ligated to produce pGPTVKbb containing the pRK2 oriV, *trfA*, and *nptIII* (kanamycin resistance) genes. The colE1 origin from pUC19 was amplified by PCR with primers oriE-Pst-F and oriE-Mlu-R (**Table 1**), digested with *Pst*I-*Mlu*I and ligated with pGPTVKbb digested likewise, to yield pVEKtrf, which was digested with *BspE*I and religated to produce pEKtrf (thus lacking oriV). The oriV segment was amplified by PCR from pGPTV-Kan with primers oriV-Bgl-F and oriV-R1-R, digested with *Bgl*II-*EcoR*I and ligated with pEKtrf digested *Bgl*II-*Mfe*I to give pEKtrfV. A DNA segment containing the *A. tumefaciens* T-DNA left border was inserted by ligation of the 2631 bp *Bgl*II-*BspE*I fragment from pHB114 (Richter et al., 2000) with pEKtrfV digested likewise, yielding pEKtrfVa. The backbone from pEKtrfVa was incorporated into a geminiviral replicon T-DNA vector by a 3-fragment ligation: pEKtrfVa digested *Pvu*I-*BspE*I, pBYR2p19 (Chen et al., 2011) digested *Pvu*I-*Xba*I (2747 bp), and pBYR2p19-GFP digested *Xba*I-*BspE*I (3677 bp), yielding pBYR2e-GFP. The GFP cds in pBYR2e-GFP was replaced with the pUC19 polylinker (*Xba*I to *Sac*I) by digestion/ligation of both plasmids with *Xba*I-*Sac*I, to make pBYR2eFa.

#### TABLE 1 | Oligonucleotides used in this study.

#### 3′ Terminator Constructs

We constructed geminiviral replicons with different 3′ terminator regions downstream of reporter genes. pBYGFP.R (Huang et al., 2009) contains the tobacco etch virus (TEV) 5′ UTR and the soybean *vspB* 3′ region flanking the GFP cds. The tobacco (*Nicotiana tabacum*) extensin gene 3′ flanking region, 732 bp including an intron of 226 bp, was amplified by PCR using primers Ext1 and Ext2, which introduced a *Sac*I site at the 5′ end and an *EcoR*I site at the 3′ end. After digestion with *Sac*I and *EcoR*I, the extensin 3′ region was substituted for the *vspB* 3′ region in pBYGFP.R to make pBYGFP.REF. Constructs pBYNVCP.R and pBYNVCP.REF were generated by replacing the GFP coding sequence of pBYGFP.R and pBYGFP.REF with the NVCP cds from psNV210 (Zhang and Mason, 2005) using *Xho*I and *Sac*I sites.

#### 5′ UTR Constructs

We constructed expression vectors having different 5′ UTRs linked to reporter genes (**Table 2**). The shuttle cloning vector pBY-GFP212 was constructed by 4-fragment ligation: pBY027 (Mor et al., 2003) digested *Pst*I-*EcoR*I (vector), pBTI210.3 (Judge et al., 2004) digested *Pst*I-*Nco*I (820 bp 35S promoter + TMV 5′ UTR), pGFPi210 (Huang et al., 2009) digested *Nco*I-*Sac*I (726 bp GFP cds), and pBYR2p19 digested *Sac*I-*EcoR*I (482 bp tobacco extensin 3′ region). Oligonucleotides (**Table 1**) encoding different 5′ UTR segments were designed to anneal with 5′ ends compatible with a cut *Xho*I site (5′ protruding TCGA) and 3′ ends compatible with a cut *Nco*I site (5′ protruding CATG). The annealed oligonucleotides were phosphorylated with polynucleotide kinase + ATP, and ligated with pBY-GFP212 digested *Xho*I-*Nco*I to produce the various 5′ UTR constructs, pBY-GFP212-XX. The constructs were ligated into pBYR2eFa on *Mfe*I-*Sac*I fragments, to give various P35S-5′ UTR-GFP-Ext3′ constructs named pBYR2eXX-GFP (**Figure 1**).
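The annealed-oligonucleotide design described above can be illustrated with a minimal Python sketch. It assumes only what the text states: one strand carries a 5′ protruding TCGA (compatible with a cut *Xho*I site) and the complementary strand carries a 5′ protruding CATG (compatible with a cut *Nco*I site). The function names and the example sequence are hypothetical and are not taken from **Table 1** or **Table 2**.

```python
from typing import Tuple

# Hypothetical helper names; the example sequence below is made up.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_annealed_oligos(duplex: str,
                           top_overhang: str = "TCGA",
                           bottom_overhang: str = "CATG") -> Tuple[str, str]:
    """Return (top, bottom) oligonucleotides, both written 5'->3'.

    `duplex` is the region that is double-stranded after annealing.
    Each strand carries its own 5' protruding overhang, so the annealed
    pair can be ligated into an XhoI-NcoI digested vector.
    """
    duplex = duplex.upper()
    top = top_overhang + duplex
    bottom = bottom_overhang + reverse_complement(duplex)
    return top, bottom


if __name__ == "__main__":
    top, bottom = design_annealed_oligos("ACAATTACC")  # made-up sequence
    print("top    5'-" + top + "-3'")
    print("bottom 5'-" + bottom + "-3'")
```

Both overhangs are supplied as parameters, so the same sketch could describe other enzyme pairs that leave 5′ overhangs. It does not check the sequence for internal restriction sites.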

Selected constructs were converted to non-replicating vectors by deletion of the BeYDV Rep genes and the downstream LIR, accomplished by digesting with *Bam*HI-*Avr*II, filling the recessed 3′ ends with Klenow fragment DNA polymerase, and ligating the vector fragment, to give plasmids named pBYL2eXX-GFP. A non-replicating construct with TMV 5′ UTR was constructed by ligation of pBYL2e20-GFP digested *Mfe*I-*Sac*I (vector) and pBYR2e-GFP digested *Mfe*I-*Sac*I (1150 bp) to yield pBYL2eFc-GFP. Truncations of the *A. thaliana* psaK (PSI) 5′ UTR in non-replicating vectors were made. PCR amplification of pBYL2ePSIa-GFP with primers PSI3′-Xho-F and Ext3i-R, digestion of the product with *Xho*I-*Sac*I, and insertion into pBYL2eFc-GFP digested *Xho*I-*Sac*I yielded pBYL2ePSI3′-GFP, containing the 3′ 41 nt of the 5′ UTR. A similar deletion of the 3′ end was produced by PCR amplification of pBYL2ePSIa-GFP with primers 35S-Bsa-F and PSI5′-Xba-R, digestion of the product with *Mfe*I-*Xba*I, and insertion into pBYL2eFc-GFP digested *Mfe*I-*Xba*I to make pBYL2ePSI5′-GFP. Replicating vectors containing the 3′ 41 nt of the AtPsaK 5′ UTR were generated by digesting pBYL2ePSI3′-GFP with *Xba*I-*Fse*I (vector fragment) and inserting the *Xba*I-*Fse*I fragment from either pBYR2e-GFP or pBYR2eFa-sNV to generate pBYR2eP3-GFP and pBYR2eP3-sNV, respectively.

Homologs of *A. thaliana* psaK were identified using the Sol Genomics *N. benthamiana* draft genome (https://solgenomics.net/organism/Nicotiana\_benthamiana/genome). pBYR2e-GFP was digested *Xho*I-*Sac*I and the GFP fragment was inserted into psNV120e (a non-replicating T-DNA vector; details available upon request) digested *Xho*I-*Sac*I, yielding pGFPe-TMV. The upstream region from the first psaK homolog, referred to as NbPsaK1, was PCR amplified from *N. benthamiana* genomic DNA using primers PsaK1-Xho-F and PsaK1-Xba-R, and the second homolog, referred to as NbPsaK2, was amplified similarly using primers PsaK2-Xho-F and PsaK2-Xba-R. The PCR fragments were digested with *Xho*I-*Xba*I and ligated into pGFPe-TMV digested *Xho*I-*Xba*I, yielding pGFPe-NbPsaK1 and pGFPe-NbPsaK2. A truncation of the NbPsaK1 5′ UTR was generated by PCR amplifying pGFPe-NbPsaK1 with primers PsaK1T-Xho-F and Ext3i-R, digestion of the product with *Xho*I-*Sac*I, and insertion into pGFPe-TMV digested *Xho*I-*Sac*I to yield pGFPe-NbPsaK1T. pGFPe-NbPsaK2T was created similarly using PCR primers PsaK2T-Xho-F and Ext3i-R, followed by *Xho*I-*Sac*I digestion and insertion into pGFPe-TMV digested *Xho*I-*Sac*I.

#### TABLE 2 | List of the 5′ UTR DNA sequences used in this study.

*Sequences shown here do not include the nucleotides "TCGA" that were added at 5′ to produce a XhoI overhang. At 3′, some constructs used "ACC" to accommodate a NcoI site, and others used "TCTAGAACA" to accommodate a XbaI site.*

Selected 5′ UTRs were modified to contain *Xba*I sites at the 3′ end for fusion with the NVCP cds. The shuttle vectors (pBY-GFP212-XX) were amplified by PCR with reverse primers containing an *Xba*I site and M13-R, and the resulting products were digested *Pst*I-*Xba*I and ligated with pBYR2eFa-sNV digested likewise, thus yielding the various vectors named *pBYR2eXX-sNV*.

FIGURE 1 | Vector Map. Generalized schematic representation of the T-DNA region of the BeYDV vectors used in this study. RB and LB, the right and left borders of the T-DNA region; NOS3′, *Agrobacterium* nopaline synthase 3′ element; P19, tomato bushy stunt virus P19 silencing suppressor; PNOS, *Agrobacterium* nopaline synthase promoter; LIR, long intergenic region of the BeYDV genome; 5′/3′ MAR, tobacco Rb7 matrix attachment region; P35S, 35S promoter from cauliflower mosaic virus; Rep/RepA, C1/C2 ORFs from BeYDV encoding the viral replication proteins. The 5′ UTR, terminator, and 5′/3′ MAR elements are as described in each subsequent section.

The rituximab heavy chain was obtained by PCR amplifying pMAP-RitX-G1-B (a kind gift from Mapp Biopharmaceuticals, San Diego, CA, USA) with primers BAA-Xba-F and RituxG-Sac-R. The resulting PCR fragment was digested with *Xba*I-*Sac*I and ligated into a derivative of pBY027 (Mor et al., 2003), digested likewise, yielding pBYR0-LRtxGT. The rituximab light chain was similarly cloned by amplifying pMap-RitX-K-b (Mapp Biopharmaceuticals) with BAA-Xba-F and RituxK-Sac-R, digested *Xba*I-*Sac*I, and ligated into a derivative of pBY027 digested likewise to yield pBYR0-LRtxKF. To generate T-DNA vectors, the rituximab heavy chain was obtained by *Xho*I-*Sac*I digestion of pBYR0-LRtxGT and inserted into pBYR2e-GFP (vector) digested *Xho*I-*Sac*I to yield pBYR2e-MRtxG. The rituximab light chain was obtained by *Xho*I-*Sac*I digestion of pBYR0-LRtxKF and inserted into pBYR2e-GFP digested *Xho*I-*Sac*I to yield pBYR2e-MRtxK. The AtPsaK 5′ UTR fused to the rituximab heavy and light chains was obtained by digesting pBYR0-LRtxGT *Xba*I-*Sac*I (heavy chain) or pBYR0-LRtxKF *Xba*I-*Sac*I (light chain) and ligating into pBYR2ePSI-GFP (vector) digested *Xba*I-*Sac*I to yield pBYR2ePSI-MRtxG and pBYR2ePSI-MRtxK, respectively.

#### MAR Constructs

The tobacco Rb7 MAR was PCR amplified from genomic DNA using primers Mar-1 and Mar-2 designed to create *EcoR*I sites on either end. The amplified fragment was digested with *EcoR*I and ligated into pBY027 digested likewise to yield pBY027-MAR. pBY027-MAR was PCR amplified with primers Mar-1 and Mar-Kpn-2 to create a *Kpn*I site on the 3′ end. To generate a *Kpn*I site in the BeYDV vector, primers LIR-R and Kpn-F-SIR were used to amplify the LIR-C1/C2-SIR segment of pBYGFP.REF. A 3-fragment ligation consisting of pBYGFP.REF (vector) digested *Hind*III-*EcoR*I, the *Hind*III-*Kpn*I digested segment of the LIR-C1/C2-SIR PCR product, and the *Kpn*I-*EcoR*I digested MAR fragment was used to make pBYR-GEM. To create the 5′ MAR, pBY027-MAR was PCR amplified with primers Mar-Pst1 and Mar-Pst2. The product was digested with *Pst*I and ligated into pBYR-GEM digested with *Sbf*I to make pBYR-MGEM. The *Sac*I-*Fse*I fragment containing the 3′ MAR from pBYR-MGEM was ligated into vectors containing the rituximab heavy chain (pBYR2e-MRtxG) or light chain (pBYR2e-MRtxK) to yield pBYR2e-MRtxGM and pBYR2e-MRtxKM, respectively. The 5′ + 3′ MAR rituximab constructs were created by digesting pBYR-MGEM with *Xho*I-*Asc*I to obtain the 5′ MAR fragment, and ligating it into pBYR2e-MRtxG or pBYR2e-MRtxK, yielding pBYR2e-MMGM and pBYR2e-MMKM, respectively.

#### Agroinfiltration of *Nicotiana benthamiana* Leaves

Binary vectors were separately introduced into *Agrobacterium tumefaciens* LBA4404, LBA4301, GV3101, or EHA105 by electroporation. The resulting strains were verified by restriction digestion or PCR of plasmid DNA, grown overnight at 30°C, and used to infiltrate leaves of 5- to 6-week-old *N. benthamiana* maintained at 23–25°C. Briefly, the bacteria were pelleted by centrifugation for 5 min at 5,000 *g* and then resuspended in infiltration buffer [10 mM 2-(*N*-morpholino)ethanesulfonic acid (MES), pH 5.5 and 10 mM MgSO4] to OD600 = 0.2. The resulting bacterial suspensions were injected into leaves through a small puncture using a syringe without a needle (Huang and Mason, 2004). For antibody coinfiltrations, *Agrobacterium* suspensions were mixed such that the final concentration of each corresponded to OD600 = 0.2. Plant tissue was harvested at 4 DPI unless otherwise noted.
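The mixing step for coinfiltrations can be sketched as a simple dilution calculation. The following Python example is only an illustration: it assumes that OD600 scales linearly with dilution and that each strain is available as a concentrated stock already resuspended in infiltration buffer. The function name and the example readings are hypothetical.

```python
# Sketch: volumes needed so that every strain ends up at the same final OD600.
# Assumption (not from the protocol): OD600 is linear with dilution.

def mixing_volumes(stock_od600, final_volume_ml, target_od600=0.2):
    """Return (list of stock volumes in mL, buffer volume in mL).

    Applies C1 * V1 = C2 * V2 to each stock suspension.
    """
    volumes = []
    for od in stock_od600:
        if od < target_od600:
            raise ValueError("a stock is more dilute than the target OD600")
        volumes.append(target_od600 * final_volume_ml / od)

    buffer_ml = final_volume_ml - sum(volumes)
    if buffer_ml < -1e-9:
        raise ValueError("stocks are too dilute to be combined at the target OD600")
    return volumes, max(buffer_ml, 0.0)


if __name__ == "__main__":
    # Hypothetical stock readings for two strains (e.g., heavy and light chain).
    vols, buf = mixing_volumes([1.0, 2.0], final_volume_ml=10.0)
    print("stock volumes (mL):", vols)
    print("buffer volume (mL):", buf)
```

With more strains in the mix, each stock has to be more concentrated, because the combined stock volumes cannot exceed the final volume.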

#### Protein Extraction

Total protein extract was obtained by homogenizing agroinfiltrated leaf samples with 1:5 (*w:v*) ice cold extraction buffer (25 mM sodium phosphate, pH 7.4, 100 mM NaCl, 1 mM EDTA, 0.2% Triton X-100, 10 mg/mL sodium ascorbate, 10 mg/mL leupeptin, 0.3 mg/mL phenylmethylsulfonyl fluoride) using a Bullet Blender machine (Next Advance, Averill Park, NY, USA) following the manufacturer's instructions. To enhance solubility, homogenized tissue was rotated at room temperature for 30 min. The crude plant extract was clarified by centrifugation at 10,000 *g* for 10 min at 4°C. Protein concentration of clarified leaf extracts was measured using a Bradford protein assay kit (Bio-Rad) with bovine serum albumin as standard.

#### SDS-PAGE

For SDS-PAGE, clarified plant protein extract was mixed with sample buffer (50 mM Tris-HCl, pH 6.8, 2% SDS, 10% glycerol, 200 mM dithiothreitol, 0.02% bromophenol blue), boiled for 10 min, and separated on 4–15% polyacrylamide gels (Bio-Rad). For GFP fluorescence, PAGE gels were visualized under UV illumination (365 nm). PAGE gels were stained with PageBlue protein staining solution (Thermo Fisher) or Coomassie stain (Bio-Rad) following the manufacturer's instructions. Following protein staining, the 26 kDa band corresponding to GFP or the 58 kDa band corresponding to NVCP was analyzed using ImageJ software to quantify the band intensity.
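As an illustration of how band intensities of this kind can be compared between lanes, the sketch below normalizes each target band to a loading-control band and then expresses it relative to a reference construct run on the same gel. The function name and all intensity values are hypothetical; the exact normalization used with ImageJ is not specified beyond the description in the text.

```python
def relative_band_yield(sample_band, sample_loading, reference_band, reference_loading):
    """Target band intensity, corrected for loading, relative to a reference lane.

    All arguments are integrated band intensities in arbitrary densitometry units.
    """
    if min(sample_loading, reference_band, reference_loading) <= 0:
        raise ValueError("loading and reference intensities must be positive")
    sample_ratio = sample_band / sample_loading
    reference_ratio = reference_band / reference_loading
    return sample_ratio / reference_ratio


# Hypothetical intensities: test construct versus a reference construct
# infiltrated on the same leaf and run on the same gel.
ratio = relative_band_yield(sample_band=1200.0, sample_loading=1000.0,
                            reference_band=1000.0, reference_loading=1000.0)
percent_change = (ratio - 1.0) * 100.0
```

With these made-up numbers the test construct would be scored as roughly 20% above the reference.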

#### ELISA

NVCP concentration was analyzed by sandwich ELISA as described (Mason et al., 1996). Briefly, a rabbit polyclonal anti-NVCP antibody was bound to 96-well high-binding polystyrene plates (Corning), and the plates were blocked with 5% nonfat dry milk in PBS. After washing the wells with PBST (PBS with 0.05% Tween 20), the plant extracts were added and incubated. Bound NVCP was detected by incubation with guinea pig polyclonal anti-NVCP antibody followed by goat anti-guinea pig IgG-horseradish peroxidase conjugate. The plate was developed with TMB substrate (Pierce) and the absorbance was read at 450 nm. Plant-produced NVCP was used as the reference standard (Kentucky BioProcessing).

For rituximab quantification, plant protein extracts were analyzed by ELISA designed to detect the assembled form of mAb (with both light and heavy chains) as described previously (Giritch et al., 2006). Briefly, plates were coated with a goat anti-human IgG specific to gamma heavy chain (Southern Biotech, Birmingham, AL, USA). After incubation with plant protein extract, the plate was blocked with 5% non-fat dry milk in PBS, then incubated with an HRP-conjugated anti-human-kappa chain antibody (Southern Biotech) as the detection antibody. Human IgG was used as a reference standard (Southern Biotech).
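For both ELISAs, sample absorbances are converted to concentrations against the reference standard and then expressed relative to total soluble protein (TSP). A minimal sketch of one way to do this is shown below, assuming linear interpolation between bracketing standards; the interpolation scheme, function names, and all numeric values are illustrative assumptions rather than the procedure actually used.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Linear interpolation of a concentration (ng/mL) from a standard curve.

    standards: (concentration_ng_per_ml, absorbance) pairs with strictly
    increasing absorbance across the range.
    """
    points = sorted(standards, key=lambda p: p[1])
    abs_values = [a for _, a in points]
    if not abs_values[0] <= absorbance <= abs_values[-1]:
        raise ValueError("absorbance outside the standard range; re-dilute the sample")
    i = bisect_left(abs_values, absorbance)
    if abs_values[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (absorbance - a0) * (c1 - c0) / (a1 - a0)


def percent_tsp(target_ng_per_ml, dilution_factor, tsp_mg_per_ml):
    """Target protein as a percentage of total soluble protein in the extract."""
    target_mg_per_ml = target_ng_per_ml * dilution_factor * 1e-6  # ng -> mg
    return 100.0 * target_mg_per_ml / tsp_mg_per_ml


# Hypothetical standard curve and sample reading.
standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
conc = interpolate_concentration(0.35, standards)
pct = percent_tsp(conc, dilution_factor=10_000, tsp_mg_per_ml=1.0)
```

Here the made-up sample reads about 15 ng/mL at a 1:10,000 dilution, which would correspond to about 15% TSP for an extract at 1.0 mg/mL total protein.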

#### GFP Fluorescence

Leaves producing GFP were photographed under UV illumination generated by a B-100AP lamp (UVP, Upland, CA, USA). The GFP fluorescence intensity was examined on a microplate reader (Molecular Device Co, Spectra Max M2). GFP samples were prepared by serial twofold dilution with phosphate buffered saline (PBS, 137 mM NaCl, 2.6 mM KCl, 10 mM Na2HPO4, and 1.8 mM KH2PO4, pH 7.4) and 50 μl of each sample was added to black-wall 96-well plates (Corning), in duplicate. The excitation and emission wavelengths were 485 and 538 nm, respectively. All measurements were performed at room temperature and the reading of an extract from an uninfiltrated plant leaf was subtracted before graphing. *E. coli*-expressed GFP was used to generate the standard curve. The GFP gene was cloned into the pET28 expression vector (Invitrogen) and IPTG-induced GFP was purified using TALON His-Tag purification resin (Clontech).
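The fluorescence workflow above (duplicate wells, subtraction of an uninfiltrated-leaf blank, and a purified-GFP standard curve) can be sketched as follows. The least-squares line fit, the function names, and every number in the example are assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def gfp_concentration(duplicate_readings, blank_reading, slope, intercept, dilution_factor):
    """Back-calculate GFP in the undiluted extract from duplicate well readings."""
    signal = mean(duplicate_readings) - blank_reading  # subtract uninfiltrated-leaf blank
    return (signal - intercept) / slope * dilution_factor


# Hypothetical purified-GFP standards (ug/mL) and blank-corrected fluorescence units.
std_conc = [0.0, 1.0, 2.0, 4.0]
std_signal = [0.0, 100.0, 200.0, 400.0]
slope, intercept = fit_line(std_conc, std_signal)

# Hypothetical sample measured at the 1:8 step of a twofold dilution series.
gfp_ug_per_ml = gfp_concentration([260.0, 240.0], blank_reading=50.0,
                                  slope=slope, intercept=intercept, dilution_factor=8)
```

Only dilutions whose readings fall within the range of the standards should be used for the back-calculation.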

#### cDNA Synthesis and Quantitative RT-PCR

Total RNA was prepared using Plant RNA Reagent (Invitrogen) according to the manufacturer's protocol and residual DNA was removed using the DNA-free system (Ambion). Aliquots of 1 μg of total RNA were subjected to first-strand cDNA synthesis with oligo dT20 primer using a Superscript III First-Strand Synthesis System (Invitrogen) according to the manufacturer's instructions: a 10 μl reaction was prepared using 1 μg total RNA, 100 U Superscript III Reverse Transcriptase (Invitrogen) and 50 pmol oligo dT20 primers. The reaction was carried out for 50 min at 50°C, and then 5 min at 85°C to deactivate the enzyme. The cDNA was stored at −20°C. Quantitative RT-PCR was performed on an IQ5 Real-Time PCR Detection System (Bio-Rad). For GFP transcripts, gene-specific primers (GFP-f and GFP-r) and a custom-made Taqman probe (GFPpro, Integrated DNA Technologies) were used. As an internal control, *N. benthamiana* translation elongation factor 1 alpha (EF1a, accession number AY206004) was used (primers EF1f, EF1r and EF1p). Each sample was measured in triplicate for GFP transcripts and an internal reference gene and compared to a standard curve using purified plasmid DNA. Reactions contained in 25 μl: 2 μl of 1:10 diluted cDNA, 2.5 μl of 10X Ex Taq buffer (20 mM MgCl2), 1 μl of 10 μM each gene-specific primer, 0.5 μl of 10 μM gene-specific probe labeled with FAM (Integrated DNA Technologies), 0.2 μl of Ex Taq polymerase (5 U/μl) and 17.8 μl of distilled water. PCR conditions were: 50°C for 2 min, 95°C for 10 min, followed by 40 cycles at 95°C for 15 s and 55°C for 30 s. Relative quantification of target gene transcript was estimated using a standard curve of Ct values generated using a 10-fold serial dilution of plasmid DNA.
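Standard-curve quantification of this kind fits Ct against log10 of the plasmid copy number and inverts the fit for each sample. The sketch below shows that calculation together with a check that the listed reaction components add up to 25 μl, assuming that "1 μl of each gene-specific primer" means 1 μl forward plus 1 μl reverse. The Ct values, copy numbers, and function names are hypothetical.

```python
from math import isclose, log10
from statistics import mean


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [log10(c) for c in copies]
    mx, my = mean(xs), mean(ct_values)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ct_values))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Reaction components in microliters, as listed in the text (primers assumed 1 + 1).
REACTION_UL = {"cDNA": 2.0, "10X buffer": 2.5, "forward primer": 1.0,
               "reverse primer": 1.0, "probe": 0.5, "polymerase": 0.2, "water": 17.8}
assert isclose(sum(REACTION_UL.values()), 25.0)

# Hypothetical 10-fold plasmid dilution series and its Ct values.
std_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
std_ct = [30.04, 26.72, 23.40, 20.08, 16.76]
slope, intercept = fit_standard_curve(std_copies, std_ct)

# Hypothetical sample: GFP signal normalized to the EF1a reference
# (the same curve is reused for both targets purely to keep the sketch short).
gfp_copies = copies_from_ct(21.0, slope, intercept)
ef1a_copies = copies_from_ct(24.0, slope, intercept)
normalized_gfp = gfp_copies / ef1a_copies
```

A slope near −3.32 corresponds to an efficiency near 100%; in practice each target would normally have its own standard curve.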

#### Statistical Analysis

For each experiment, plants of the same age were used to minimize developmental differences. Additionally, experiments were designed to compare each construct directly on the same leaf to minimize leaf-to-leaf variation. Comparisons between leaves were made using leaves of similar developmental stage. Data are presented as mean ± SE. Three or more independent infiltrations were made for each experiment and compared using Student's *t*-test (two-tailed). *P <* 0.05 was represented with two stars (∗∗) and *P <* 0.01 was represented with three stars (∗∗∗).
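The summary statistics described above can be sketched with the standard library alone. The version below computes mean ± SE and an unpaired, equal-variance Student's *t* statistic, then assigns stars by comparing |t| with tabulated two-tailed critical values. The choice of an unpaired test, the function names, and the data are assumptions for illustration; the critical values are standard table entries.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed critical t values (alpha = 0.05, alpha = 0.01) by degrees of freedom.
T_CRIT = {2: (4.303, 9.925), 3: (3.182, 5.841), 4: (2.776, 4.604),
          5: (2.571, 4.032), 6: (2.447, 3.707)}


def standard_error(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Unpaired equal-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def significance_stars(t, df):
    """Two stars for p < 0.05 and three for p < 0.01, following the paper's convention."""
    crit_05, crit_01 = T_CRIT[df]
    if abs(t) > crit_01:
        return "***"
    if abs(t) > crit_05:
        return "**"
    return ""


# Hypothetical % TSP values from three independent infiltrations per construct.
test_construct = [14.0, 16.0, 15.0]
control_construct = [10.0, 12.0, 11.0]
t_stat, dof = student_t(test_construct, control_construct)
stars = significance_stars(t_stat, dof)
```

With only three or four replicates per group the degrees of freedom are small, so the critical values are correspondingly large.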

# RESULTS

#### Tobacco Extensin Terminator Enhances Transgene mRNA Accumulation and Protein Production

We previously reported BeYDV vectors that contained the soybean *vspB* terminator following the gene of interest (Huang et al., 2009). We tested several 3′ elements and identified the tobacco extensin terminator as an efficient transcription terminator and a potent enhancer of transgene expression, as compared to the nopaline synthase, 35S, and other commonly used terminators (data to be presented elsewhere). Extensin is a hydroxyproline-rich glycoprotein that constitutes the major protein component of cell walls.

To test the potential of the tobacco extensin terminator to enhance protein production in the BeYDV system, *N. benthamiana* leaves were agroinfiltrated with replicating GFP vectors containing either the soybean *vspB* terminator or the tobacco extensin terminator (**Figure 1**). Protein extracts from agroinfiltrated leaf samples were analyzed from 3 to 5 DPI by spectrofluorimetry. Vectors containing the tobacco extensin terminator provided an approximately 2.5-fold increase in GFP production compared to *vspB* (**Figure 2A**). Quantitative real-time RT-PCR showed that the increase in GFP was associated with a similar 2.5-fold increase in GFP transcripts (**Figure 2B**).

Next, we wanted to determine whether the enhancing effect of the tobacco extensin terminator was gene-specific. Vectors producing NVCP were agroinfiltrated and analyzed by ELISA. NVCP concentration was normalized by total soluble protein (TSP) to eliminate differences in the extraction efficiency or leaf water weight. An approximately sixfold increase in NVCP production was observed with the tobacco extensin terminator compared to *vspB* (**Figure 2C**). We have also shown that the tobacco extensin terminator improves the production of monoclonal antibodies by a similar level (data not shown). These data indicate that the tobacco extensin terminator has strong potential to enhance transgene expression, likely by stabilization of the mRNA, and its enhancing effect is not gene-specific. The tobacco extensin terminator was used for all further studies.

#### Diverse 5′ UTRs Greatly Impact Transgene Production

Previously, we reported BeYDV vectors that contained the 5′ UTR from either tobacco etch virus (TEV) or tobacco mosaic virus (TMV). In order to systematically evaluate the role of the 5′ UTR on transgene production, we created a series of BeYDV transient expression vectors containing diverse 5′ UTRs from viral, plant, and human sources upstream from the green fluorescent protein (GFP) gene (**Figure 1**; **Table 2**). As the nucleotides directly surrounding the start codon are known to play a role in translation initiation, we standardized all vectors to contain the nucleotides ACC (to accommodate the *Nco*I site) or ACA preceding the ATG, which has been reported to be optimal for dicot plants (Sugio et al., 2010). We found no difference in the performance of vectors with ACC or ACA. These vectors were delivered to *N. benthamiana* leaves by agroinfiltration and monitored for green fluorescence. To minimize leaf-to-leaf variation, each leaf was infiltrated with a vector containing the TMV 5′ UTR as an internal control alongside vectors containing the 5′ UTRs to be tested.

First, we compared a set of 11 human-derived sequences found to provide cap-independent translational enhancement (Wellensiek et al., 2013). There is evidence that some 5′ UTR elements function cross-kingdom, especially A-rich polypurine sequences (Dorokhov et al., 2002; Terenin et al., 2005). We found that many of the human 5′ UTRs, as well as a polypurine sequence, produced bright green fluorescence under UV illumination (**Figure 3A**). To further analyze GFP production, protein extracts from agroinfiltrated leaves were separated by SDS-PAGE followed by Coomassie staining or visualization under UV light, and the GFP band intensity was quantified by densitometry. Gel quantification showed many of the human-derived 5′ UTRs, as well as the polypurine 5′ UTR, produced GFP at a level comparable to the commonly used plant viral 5′ UTRs from TEV and TMV (**Figure 3B**). These data indicate that 5′ UTRs from sources outside of the plant kingdom can support high levels of translation in *N. benthamiana*.

Next, we tested the 5′ UTRs from RNA plant viruses. Among the 5′ UTRs tested, the TMV 5′ UTR appeared to provide the brightest fluorescence, followed by TEV (**Figure 3A**). Using gel quantification, the TMV 5′ UTR provided a >40% increase in GFP yield over the TEV 5′ UTR (**Figure 3B**). A truncation containing only nucleotides 1–22 of the TMV 5′ UTR performed as well as the full-length sequence, indicating that the poly(CAA) region of the TMV 5′ UTR is not necessary for high levels of translation in *N. benthamiana*, at least in a replicating system (**Figure 3B**). Constructs containing the 5′ UTR from AMV, as well as constructs containing the 5′ and 3′ UTRs from BYDV or pea enation mosaic virus (PEMV; Fan et al., 2012), showed poor GFP production and were not studied further (**Figure 3A**).

We also wished to test the activity of plant-derived 5′ UTRs in the BeYDV system. It has been reported that a 5′ UTR derived from 63 nucleotides upstream from the start codon of the *A. thaliana* photosystem K subunit (AtPsaK) enhanced transgene expression in transgenic tobacco leaves (Agarwal et al., 2014). Using our system, we found that the 63 nt AtPsaK 5′ UTR produced intense green fluorescence, and gel quantification data indicate that GFP production was increased by >20% compared to the TMV 5′ UTR (**Figures 3A,B**). The AtPsaK 5′ UTR (accession NM\_102775) appears to be a truncation of the full-length 129 nt 5′ UTR. To further delineate the active region of the AtPsaK 5′ UTR, we created deletions at its 5′ and 3′ ends and tested their potential to enhance GFP production in non-replicating transient expression vectors. Using gel quantification, we found that a truncation removing nucleotides −1 to −23 upstream from the start codon resulted in a ∼14% decrease in GFP production, while a truncation removing nucleotides −42 to −63 resulted in a ∼13% increase in GFP production (**Figure 4**). A similar ∼12% enhancement was observed in replicating GFP vectors (**Figure 3B**).

FIGURE 3 | Evaluation of diverse 5′ UTRs upstream from the GFP gene (vector pBYR2eXX-GFP, where XX denotes the individual 5′ UTRs). (A) Leaves were photographed at 4 DPI under UV illumination (365 nm). Images are representative of 3–4 independently infiltrated leaves. (B) Agroinfiltrated leaves were harvested between 4 and 5 DPI and extracts were analyzed by SDS-PAGE followed by observation under UV illumination (365 nm) and Coomassie staining. GFP band intensity was quantified using ImageJ software, using native plant proteins as a loading control. Columns represent means ± standard error of three or more independently infiltrated leaves. All leaves were infiltrated with the TMV 5′ UTR vector in addition to the other vectors as an internal control for leaf and plant variability. Two stars (∗∗) indicate *p <* 0.05 and three stars (∗∗∗) indicate *p <* 0.01 as compared to TMV by Student's *t*-test. 5′ UTR key (position −1 taken as first nucleotide upstream from ATG): TMV, tobacco mosaic virus full length 5′ UTR; AtPsaK 3′, nucleotides −1 to −41 of AtPsaK gene; AtPsaK, nucleotides −1 to −63 of AtPsaK gene; TMV 3′, nucleotides −1 to −21 of TMV; TEV, tobacco etch virus full length 5′ UTR; PP, synthetic polypurine sequence; AMV, full length alfalfa mosaic virus 5′ UTR; BYDV, full length barley yellow dwarf virus 5′ and 3′ UTRs; PEMV, full length pea enation mosaic virus RNA 2 5′ and 3′ UTRs; 10; 20; 5D5; 43; 19; 12; 13; 23; 54; 48; 26, human-derived 5′ UTR sequences.

To determine whether the 5′ UTRs from related psaK genes have potential for high levels of transgene production, two psaK homologs were identified from *N. benthamiana* (referred to as NbPsaK1 and NbPsaK2) and their 5′ upstream regions were cloned into non-replicating GFP expression vectors. These vectors were agroinfiltrated alongside the TMV and AtPsaK 5′ UTRs for comparison. By gel quantification, the 163 nt upstream region from NbPsaK1 was found to have very minimal activity, whereas the 170 nt upstream region from NbPsaK2 produced GFP at ∼50% of the level of the TMV 5′ UTR (**Figure 4**). Inspection of the nucleotide sequence revealed the presence of upstream ATGs in both NbPsaK1 and NbPsaK2. As the 3′ end was the most active region of the AtPsaK 5′ UTR, similar truncations were made for the NbPsaK upstream regions (referred to as NbPsaK1 3′ and NbPsaK2 3′). These new constructs were agroinfiltrated alongside the full-length version and tested by gel quantification. The NbPsaK1 truncation enhanced GFP production by >20-fold compared to the original vector (**Figure 4**). The NbPsaK2 truncation enhanced GFP production by 2.4-fold compared to the original vector, corresponding to a >40% improvement compared to the TMV 5′ UTR (**Figure 4**). These results indicate that the regions 40–60 nt upstream from the *A. thaliana* and *N. benthamiana* psaK genes are highly active in *N. benthamiana* leaves, and are capable of enhancing protein production at a level greater than the widely used TMV 5′ UTR.

To further assess the potential of the 5′ UTR to improve transgene production, several promising 5′ UTRs were tested in BeYDV vectors producing NVCP. Protein extracts from agroinfiltrated leaf samples were normalized for TSP and analyzed by NVCP ELISA. In general agreement with the results found for GFP, several of the human 5′ UTRs performed as well as the TMV 5′ UTR, and the TMV 5′ UTR resulted in a ∼30% increase in NVCP production compared to the TEV 5′ UTR (**Figure 5**). Additionally, vectors containing the AtPsaK 5′ UTR produced NVCP at 15.9 ± 1.5% TSP compared to 11.3 ± 1.0% TSP for the TMV 5′ UTR (**Figure 5**). Further, the truncated form of the AtPsaK 5′ UTR (AtPsaK 3′) produced as much or more NVCP as the unmodified 5′ UTR. These data further demonstrate the capacity of the 5′ UTR to enhance recombinant protein production, and show that the enhancing activity of the unmodified or truncated AtPsaK 5′ UTR is not gene-specific.

#### Matrix Attachment Regions Enhance Transgene Production and Reduce Plant Cell Death

The presence of MARs has been reported to enhance transgene production using transgenic systems (Halweg et al., 2005; Xue et al., 2005; Ji et al., 2013). Many of the postulated mechanisms by which MARs enhance transgene production, such as by preventing repressive chromatin modifications or by the interaction of chromatin with the nuclear matrix, require the gene of interest to be organized into chromatin. Thus it is unclear whether MARs would function in transient expression systems that do not involve stable chromosomal integration. However, replicated geminivirus DNA has been shown to associate with cellular histones, forming viral minichromosomes (Pilartz and Jeske, 1992, 2003). Therefore, we investigated the potential for MARs to improve BeYDV vectors.

The tobacco Rb7 MAR was inserted into BeYDV vectors (**Figure 1**) either with two copies flanking the expression cassette (5′ + 3′ MAR), or one copy in the 3′ position (3′ MAR). Placing the MAR only in the 5′ position was not found to be as effective as the other two configurations in preliminary studies and was not pursued further (data not shown). Leaves of *N. benthamiana* were co-infiltrated with BeYDV vectors containing the rituximab heavy and light chains both either with or without the Rb7 MAR at either the 3′ or 5′ + 3′ positions. Protein extracts from infiltrated leaf spots were normalized for TSP and assayed for rituximab production by IgG ELISA. Remarkably, it was found that while both MAR-containing vectors enhanced IgG production, the vector containing only the 3′ MAR resulted in a 3.4-fold increase in IgG production, representing 14.3 ± 1.6% TSP for the 3′ MAR vector compared to 4.2 ± 1% TSP for the control with no MAR elements (**Figure 6A**). Inspection of the infiltrated leaves revealed a substantial reduction in leaf tissue necrosis with the MAR-containing vectors (**Figure 6B**). These data indicate that MARs have potential to enhance protein production using geminiviral transient expression vectors, and this enhancement is correlated with a reduction in plant cell death.
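The 3.4-fold figure follows directly from the two reported means (14.3 / 4.2). The sketch below reproduces that ratio and, as an addition not made in the text, attaches a first-order propagated uncertainty under the assumption that the two standard errors are independent. The function name is hypothetical.

```python
from math import sqrt


def fold_change(treated, treated_se, control, control_se):
    """Ratio of two means with a first-order propagated standard error.

    Assumes independent errors: relative errors add in quadrature.
    """
    ratio = treated / control
    relative_se = sqrt((treated_se / treated) ** 2 + (control_se / control) ** 2)
    return ratio, ratio * relative_se


# Values reported in the text: 3' MAR vector versus the no-MAR control (% TSP).
ratio, ratio_se = fold_change(14.3, 1.6, 4.2, 1.0)
# ratio is about 3.40; the propagated uncertainty is about 0.90
```

The comparatively wide uncertainty is driven mostly by the relative error on the low-yielding control.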

#### Effects of *Agrobacterium* Strain on Transgene Production and Cell Death

The choice of *Agrobacterium* strain has been shown to play an important role in many aspects of transient protein production, including T-DNA transfer efficiency, plant health, and overall yield (Gleba et al., 2014; Shamloul et al., 2014; Sheikh et al., 2014). To investigate the effects of *Agrobacterium* strain on recombinant protein production, we introduced BeYDV GFP vectors to strains LBA4301, GV3101, and EHA105. Leaves of *N. benthamiana* were agroinfiltrated with each strain at OD600 of 0.2 and monitored for plant health and GFP production. At 4 DPI, spots infiltrated with GV3101 developed faint leaf browning, whereas spots infiltrated with the other two strains showed no detectable changes from uninfiltrated leaf tissue (data not shown). By 7 DPI, leaf spots infiltrated with GV3101 had become severely necrotic, while those infiltrated with EHA105 or LBA4301 had only just begun to develop necrotic tissue (**Figure 7B**). Inspection of leaves under UV light revealed that fluorescing leaf regions infiltrated with EHA105 were substantially brighter than areas infiltrated with either of the other strains (**Figure 7A**).

To compare the effects of EHA105 and GV3101 more quantitatively, BeYDV rituximab vectors were introduced to each strain and agroinfiltrated into *N. benthamiana* leaves. Leaf extracts were normalized for TSP and analyzed by IgG ELISA. In agreement with the data obtained using GFP vectors, *Agrobacterium* strain EHA105 substantially improved rituximab production: 10.9 ± 1.6% TSP for EHA105 compared to 5.3 ± 1.1% TSP for GV3101 (**Figure 7C**). Additionally, the increase in rituximab production was correlated with a reduction in plant cell death. These results demonstrate the importance of *Agrobacterium* strain on improving recombinant protein production.

#### Optimized Genetic Elements Function Synergistically to Further Enhance Transgene Production

containing 5′ + 3′ MAR (left half of leaf) or no MAR (right half of leaf). A representative leaf was photographed under visible light at 4 DPI.

FIGURE 7 | *Agrobacterium* strain EHA105 increases transgene production and reduces cell death. (A,B) Leaves of *N. benthamiana* were infiltrated with *Agrobacterium* strains EHA105, GV3101, or LBA4301, each harboring a BeYDV GFP vector (pBYR2eFa-GFP). Representative images of four independently infiltrated leaves were photographed at 4 DPI under UV illumination (A) or 7 DPI under visible light (B). (C) Leaves of *N. benthamiana* were infiltrated with *Agrobacterium* strains EHA105 or GV3101 harboring BeYDV rituximab vectors (pBYR2e-MRtxG and pBYR2e-MRtxK). Leaf extracts were analyzed for rituximab production by sandwich ELISA and data were normalized by total soluble protein. Columns represent data from four independently infiltrated samples ± standard error. Two stars (∗∗) indicate *p <* 0.05 by Student's *t*-test.

We determined the potential for the enhancing effects of the genetic elements identified in the present study to function synergistically with one another. *Agrobacterium* strain EHA105 was observed to perform better with all tested constructs, and was used for the remainder of studies (data not shown). BeYDV NVCP vectors were created which contained the AtPsaK 5′ UTR and extensin terminator with or without the 3′ Rb7 MAR, and were agroinfiltrated using strain EHA105 into *N. benthamiana* leaves. NVCP ELISA showed that insertion of the 3′ Rb7 MAR paired with the AtPsaK 5′ UTR and extensin terminator significantly enhanced NVCP production, yielding 20.3 ± 1.5% TSP compared to 15.7 ± 1.3% TSP for the construct lacking the MAR (**Figure 8A**, last two columns). This yield corresponds to 1.8 mg NVCP per gram leaf fresh weight. These results, compared with previous data, indicate that the optimizations identified in this study provide synergistic enhancement of transgene production, enabling very high levels of recombinant protein production (**Figures 8A,B**).
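The two yield figures quoted for NVCP (20.3% TSP and 1.8 mg per gram leaf fresh weight) together imply a total soluble protein content per gram of leaf. The short sketch below makes that back-calculation explicit; the TSP value it produces is derived here from the reported numbers and is not stated in the text, and the function names are hypothetical.

```python
def implied_tsp_mg_per_g(yield_mg_per_g, percent_tsp):
    """Total soluble protein per gram fresh weight implied by a yield and its % TSP."""
    return yield_mg_per_g / (percent_tsp / 100.0)


def yield_mg_per_g(percent_tsp, tsp_mg_per_g):
    """Recombinant protein per gram fresh weight for a given % TSP."""
    return percent_tsp / 100.0 * tsp_mg_per_g


tsp = implied_tsp_mg_per_g(1.8, 20.3)            # about 8.9 mg TSP per g leaf
no_mar_yield = yield_mg_per_g(15.7, tsp)         # about 1.4 mg per g for the no-MAR construct
```

The second line applies the same implied TSP content to the construct lacking the MAR, which is only a rough estimate because TSP per gram can differ between samples.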

# DISCUSSION

In recent years, transient expression systems have become the method of choice for plant-based recombinant protein production. In addition to their high yields, the rapid speed of these systems (4–5 days for BeYDV vectors) offers many unique advantages over stable transgenic systems, such as the ability to produce personalized therapeutics as reported for non-Hodgkin's lymphoma (Bendandi et al., 2010; Tuse et al., 2015), and to rapidly respond to virus outbreaks or bioterrorism events (D'Aoust et al., 2010). Additionally, transient expression systems circumvent the regulatory issues associated with the creation of genetically modified organisms.

The most widely used transient expression system, magnICON, uses viral vectors derived from TMV and PVX (Giritch et al., 2006). Due to the competing nature of many RNA viruses, this system cannot produce recombinant proteins with more than two heterosubunits, excluding the efficient production of secretory IgAs, IgMs, and heteromultimeric virus-like particles, among other desirable biopharmaceuticals (Chen and Lai, 2013). A non-replicating system based on cowpea mosaic virus has been used to produce bluetongue virus-like particles, allowing proper assembly of four heterosubunits (Thuenemann et al., 2013b). However, this system lacks the high yields associated with the other replicating systems (Gleba et al., 2014).

To circumvent these issues, we developed a transient expression system based on BeYDV which generates noncompeting DNA replicons to drive high-level production of heteromultimeric proteins (Huang et al., 2010). While this system generates massive amounts of DNA copies of the target gene(s) which are thought to result in a saturation of the plant transcription machinery, the disparity between gene copy number, transcript accumulation, and protein production suggested that each transcript was not being efficiently utilized by the plant cell (Huang et al., 2009, 2010; Regnard et al., 2010). Therefore, we hypothesized that optimizing the genetic elements involved in efficient transcript processing, stability, and utilization could further improve the BeYDV system.

In the current study, we present a comprehensive comparison of diverse genetic elements and assess their potential to enhance plant-based recombinant protein production. We compared a large set of 5′ UTRs derived from human, plant, and viral sequences. In agreement with previous studies demonstrating cross-kingdom translational enhancement of certain 5′ UTRs (Dorokhov et al., 2002; Terenin et al., 2005), we found that many of the human sequences, as well as the polypurine 5′ UTR described by Dorokhov et al. (2002), provided high levels of GFP production in leaves of *N. benthamiana*, in some cases out-performing the routinely used viral 5′ UTRs from tobacco etch virus or AMV (**Figures 3A,B**). Among the virus-derived 5′ UTRs tested, we found the TMV 5′ UTR provided the highest level of transgene expression. Some of the viral elements tested, especially those containing long 3′ UTRs, performed very poorly. As RNA viruses are not typically adapted for the plant nucleus, many of these sequences may contain cryptic splice sites or other detrimental elements. We suspect that rigorous optimization of these sequences, such as through the insertion of introns and removal of sequences known to destabilize mRNA, could significantly improve the performance of genetic elements derived from these viruses.

Interestingly, despite the historic success of viral elements in driving high levels of protein production, we also found that the plant-derived 5′ UTRs from the psaK homologs of both *A. thaliana* and *N. benthamiana* were capable of enhancing recombinant protein production by as much as 40% more than the widely used TMV 5′ UTR (**Figure 4**). In particular, the first 40–60 nt directly upstream from the ATG seemed the most potent, possibly due to the removal of inhibitory regulatory sequences further upstream (**Figure 4**). Furthermore, we investigated the potential of the tobacco extensin terminator to enhance transgene production. It was found to prevent read-through transcription, enhance mRNA accumulation, and enhance protein production at a level greater than the 35S or nopaline synthase terminators, among other commonly used gene terminators (data to be presented elsewhere, **Figure 2**). We anticipate that further investigation of other native genetic elements from highly expressed plant genes has great potential to improve recombinant protein production systems.

Matrix attachment regions have a well-supported history of enhancing transgene production in transgenic plants (Halweg et al., 2005; Xue et al., 2005; Wang et al., 2007; Zhang et al., 2009; Ji et al., 2013), though their features seem variable or, in many cases, poorly understood. In the present study, we show that a MAR increases transgene production and reduces cell death in a plant transient expression system. Replicated geminivirus DNA has been shown to be organized into chromatin (Pilartz and Jeske, 1992, 2003), and subject to repressive DNA methylation (Raja et al., 2008), indicating that MARs could be functionally active in BeYDV replicons. We found that insertion of the Rb7 MAR had a substantial enhancing effect on rituximab production, improving yield by 3.4-fold (**Figure 6A**). The MAR was most active when placed 3′ of the expression cassette, in contrast to other studies which found optimal placement upstream from the promoter, or in both positions (Zhang et al., 2009). Inspection of the tobacco Rb7 sequence reveals the presence of many polyadenylation and transcription termination signals, suggesting the alternative hypothesis that the 3′ MAR is acting as a second gene terminator or otherwise stabilizing the mRNA. Double terminators have been found to have a dramatic enhancing effect on transgene production (Beyene et al., 2011). Unexpectedly, we also found a dramatic decrease in cell death associated with the insertion of the tobacco Rb7 MAR (**Figure 6B**). Further studies are underway to characterize the function of the Rb7 MAR and other MARs in enhancing transgene production and reducing cell death in the BeYDV system.

One of the drawbacks of transient expression systems compared to stable transgenics is the requirement for *Agrobacterium* to deliver the gene of interest to the plants. An ideal *Agrobacterium* strain should minimize deleterious plant cell interactions while providing efficient T-DNA transfer to reduce the concentration of *Agrobacterium* required for complete gene delivery to all plant cells. We wished to evaluate different strains of *Agrobacterium* using the BeYDV system. EHA105 has been reported to overexpress *virG*, a transcriptional activator that regulates T-DNA transfer through induction of vir gene expression (Jin et al., 1987). Additionally, GV3101 and LBA4404 have been reported to have differing effects on the activation of plant immune response genes through the production of cytokinins (Sheikh et al., 2014). Previously, we found that strain GV3101 enhanced transgene production compared to strain LBA4404 (data not shown). In this study, we compared strains GV3101, LBA4301, and EHA105, and found that EHA105 both enhanced transgene production, and reduced plant cell death. Our results demonstrate that *Agrobacterium* strain can have a dramatic effect on recombinant protein production systems (**Figure 7**). A strain CryX is reported to provide 100–1000 times the gene delivery efficiency compared to commonly used *Agrobacterium* strains (Gleba et al., 2014). These studies indicate there is great potential to reduce plant toxicity and improve T-DNA transfer efficiency by optimizing the *Agrobacterium* strain.

# CONCLUSION

By optimizing the gene terminator, 5′ UTR, and *Agrobacterium* strain, and by targeted insertion of MAR elements, we have dramatically improved the BeYDV transient expression system. We have used this system to produce NVCP at up to 20% TSP, corresponding to 1.8 mg per gram leaf fresh weight, a >4-fold improvement over the original vector (Huang et al., 2009) and more than twice the highest level ever reported in a plant-based system (Santi et al., 2008). Furthermore, we have also produced the monoclonal antibody rituximab at up to 1 mg per gram leaf fresh weight, which is twice the highest level previously reported for a monoclonal antibody using BeYDV vectors (Huang et al., 2010). We expect these improvements to be broadly applicable to other DNA expression systems. Additionally, these modifications could be used to fine-tune expression in cases where multiple proteins need to be produced at different levels.

#### AUTHOR CONTRIBUTIONS

HM planned experiments, constructed expression vectors, and wrote and edited the ms. AD planned experiments, constructed expression vectors, performed experiments, and wrote the ms. SR planned experiments, constructed expression vectors, performed experiments, and wrote the ms.

# FUNDING

This project was supported in part by a grant from the US National Institutes of Health award # U19 AI066332-01, and by funds provided by the Center for Infectious Diseases and Vaccinology, Biodesign Institute at ASU.

#### ACKNOWLEDGMENTS

We wish to thank John Chaput (Biodesign Institute at ASU) for the sequences of the human 5′ UTRs and W. A. Miller (Iowa State University) for PEMV and BYDV 5′ UTRs; and we thank Reed Bjorklund and Sean Winkle for excellent technical assistance. We would also like to thank our team of undergraduate and graduate students for their excellent work: Trent Anderson, Abigail Beebe, Amy Capone, John Crawford, David M. Escobedo, and Lindsey Hardison.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer GR and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2016 Diamos, Rosenthal and Mason. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# New Challenges for the Design of High Value Plant Products: Stabilization of Anthocyanins in Plant Vacuoles

#### Valentina Passeri, Ronald Koes and Francesca M. Quattrocchio\*

Plant Development and (Epi)Genetics, Swammerdam Institute of Life Sciences, University of Amsterdam, Amsterdam, Netherlands

In the last decade plant biotechnologists and breeders have made several attempts to improve the antioxidant content of plant-derived food. Most efforts concentrated on increasing the synthesis of antioxidants, in particular anthocyanins, by inducing the transcription of genes encoding the synthesizing enzymes. We present here an overview of economically interesting plant species, both food crops and ornamentals, in which anthocyanin content was improved by traditional breeding or transgenesis. Old genetic studies in petunia and more recent biochemical work in brunfelsia have shown that after synthesis and compartmentalization in the vacuole, anthocyanins need to be stabilized to preserve the color of the plant tissue over time. The final yield of antioxidant molecules is the result of the balance between synthesis and degradation. Therefore, understanding the mechanisms that determine molecule stabilization in the vacuolar lumen is the next step that needs to be taken to further improve the anthocyanin content in food. In several species a phenomenon known as fading is responsible for the disappearance of pigmentation, which in some cases can be nearly complete. We discuss the present knowledge about the genetic and biochemical factors involved in pigment preservation/destabilization in plant cells. The improvement of our understanding of the fading process will supply new tools for both biotechnological approaches and marker-assisted breeding.

Keywords: Anthocyanin, fading, product stabilization, health-promoting products, vacuole

#### INTRODUCTION

Anthocyanins are flavonoid pigments conferring red, blue and purple colors to plant tissues. Because they are visible to the naked eye, these pigments are a model for genetics, molecular biology and cell biology. Consequently, both structural and regulatory genes of the biosynthetic pathway have been identified in a plethora of species (**Figure 1A**). A complex of highly conserved WD40, bHLH and MYB proteins (MBW complex) activates the transcription of structural genes encoding enzymes of the anthocyanin pathway (Koes et al., 2005; Jaakola, 2013). In all species analyzed, the WD40 is expressed ubiquitously, whereas expression of bHLH and MYB factors is confined to pigmented tissues. The bHLH regulators associate with the WD40 partner to activate downstream genes involved in multiple pathways like anthocyanin and tannin production, vacuolar acidification and cell shape, through interactions with different MYB proteins, which are the main determinants of the specificity of the complex (Koes et al., 2005; Ramsay and Glover, 2005).

#### Edited by:

Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### Reviewed by:

Laura Jaakola, UiT The Arctic University of Norway, Norway Massimiliano Tattini, The National Research Council of Italy, Italy

> \*Correspondence: Francesca M. Quattrocchio

f.quattrocchio@uva.nl

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 09 December 2015 Accepted: 29 January 2016 Published: 16 February 2016

#### Citation:

Passeri V, Koes R and Quattrocchio FM (2016) New Challenges for the Design of High Value Plant Products: Stabilization of Anthocyanins in Plant Vacuoles. Front. Plant Sci. 7:153. doi: 10.3389/fpls.2016.00153

The MYB component of the MBW complex that activates the (pro)anthocyanin pathway is able to activate transcription of its bHLH partner and is therefore considered a "master regulator" as it can, alone, induce activation of the pathway (Spelt et al., 2000; Nesi et al., 2001; Kiferle et al., 2015). After synthesis, anthocyanins are transported to the vacuolar lumen where they are stored. This process has been studied by several groups (Francisco et al., 2013; Chanoca et al., 2015; Hu et al., 2016) but it is still not fully understood in spite of the substantial role it might play in the final anthocyanin content in plant tissues.

Plant products rich in anthocyanin, like berries, eggplant, grape, and red cabbage, are part of the human diet. Several studies reported that anthocyanin intake prevents the onset and development of degenerative diseases. Some examples of the health-promoting effects of anthocyanins are stimulation of visual acuity and reduction of retinal damage (Kalt et al., 2014; Giampieri et al., 2015; Wang et al., 2015), decreased expression of inflammatory biomarkers (Samadi et al., 2015), diminished risk of type-2 diabetes mellitus (Guo and Ling, 2015), reduced weight gain (Titta et al., 2010) and anti-carcinogenic activity (Butelli et al., 2008; Forbes-Hernandez et al., 2015; Vlachojannis et al., 2015). By in vitro simulation of the gastrointestinal system and animal and human tests, anthocyanins were shown to remain bio-accessible during digestion (Kalt et al., 2014; Oliveira and Pintado, 2015; Olejnik et al., 2016).

In addition, the presence of anthocyanin in plant tissues positively affects their market value by increasing their aesthetic appeal and by reducing softening, shriveling, rotting and fungal infection (Zhang et al., 2015c). Furthermore, color novelty is a major driving force in the ornamentals and cut flower industry.

Increased anthocyanin content is, for all the reasons mentioned, an obvious goal for crop breeding and biotechnology. Therefore, combinations of classical and molecular methods have been used to generate new varieties with enhanced anthocyanin content as well as different colors and pigmentation patterns.

Until now, research in ornamental and food crops has aimed to alter genes controlling anthocyanin synthesis, since it was taken for granted that the end products are stable once they are deposited in the vacuole. However, for fruits, flowers and leaves of several species it is known that anthocyanin may disappear again during development in a regulated manner that depends, for example, on environmental conditions (Oren-Shamir, 2009).

Here we review the state of the art in improving anthocyanin production in plant tissues and report recent insights into the (in)stability of anthocyanins in vacuoles, suggesting that an understanding of the mechanism behind anthocyanin stabilization in planta is required for breeding and biotechnology to take the next step toward plant varieties with increased economic and nutraceutical value.

#### STUDYING FLOWER PIGMENTATION TAUGHT US HOW TO COLOR OUR FOOD

Much of the current knowledge on anthocyanin chemistry and genetics originates from studies on flower pigmentation in model species. Some of the results have been applied to generate new varieties of cut flowers and ornamental flowering plants with novel colors and pigmentation patterns.

The substrate specificity of the enzymes of the anthocyanin pathway determines the final pattern of chemical decorations and thereby the pigment color (Provenzano et al., 2014; Rinaldo et al., 2015). Together with the understanding of the biosynthetic pathway regulation (Koes et al., 2005; Jaakola, 2013), this knowledge was applied to enhance the nutraceutical value and the appeal of several economically relevant plant products.

Traditional breeding has produced an array of colors in different species but the top-selling cut flowers rose, chrysanthemum, carnation and lily do not have blue in their palette, while petunia lacks red/orange (Holton and Tanaka, 1994; Forkmann and Heller, 1999). New colors were obtained by changing the decoration pattern on the basic skeleton of anthocyanins (**Figure 1A**) in roses, chrysanthemum and carnations. The expression of an exogenous flavonol 3′,5′ hydroxylase (F3′5′H) combined with a heterologous dihydroflavonol 4-reductase (DFR) accepting a three-hydroxylated substrate leads to accumulation of delphinidin (Katsumoto et al., 2007; Tanaka et al., 2008) and to lilac and purple flowers in rose and carnation (**Figure 1B**). Orange and red colors from pelargonidin-based anthocyanins were obtained in petunia by suppressing the flavonoid hydroxylases F3′H and F3′5′H, and expressing a DFR with specificity for mono-hydroxylated substrates (Meyer et al., 1987). New colors are also obtained by changing the anthocyanin pattern of methylation, glycosylation, and acylation (Provenzano et al., 2014; Du et al., 2015; Morita et al., 2015).

The dynamics of metabolic flows affect channeling of precursors toward anthocyanin production (Zvi et al., 2012; Sheehan et al., 2015; Zhang et al., 2015b) and this should be considered when designing strategies to generate genotypes with new colors or enhanced anthocyanin content.

Flower pigmentation patterns originate from differential expression of the structural genes in different cells. While irregular patterns are mostly due to transposon insertions in structural and/or regulatory genes (**Figure 1C**; Lister et al., 1993; Spelt et al., 2000; Itoh et al., 2002), flecks, sector veins and coloration of different flower parts are due to differential expression of genes encoding MYB proteins of the MBW transcription complex regulating the anthocyanin pathway.

In the genus Antirrhinum, variation in activity of the MYB genes Rosea and Venosa regulates pigmentation in different flower parts (Stracke et al., 2001) and in petunia, different members of the same clade of MYB regulators independently pigment petals, anthers and tube (Tornielli et al., 2009). Similarly, in Phalaenopsis orchids three MYBs control spotting and venation patterns by activation of structural gene expression in the sepals/petals (Hsu et al., 2015). Ectopic expression of the Arabidopsis anthocyanin MYB regulator PAP1 in roses results in enhanced pigmentation in leaves and flowers (Zvi et al., 2012).

From the observation of how pigmentation patterns diverged during evolution we learned that MYB regulators of anthocyanin biosynthesis are the best tool to alter anthocyanin production without affecting other processes. This is because their bHLH and WDR partners are involved in several other processes and changes in their activity would either be insufficient or have pleiotropic effects. Factors affecting pigment production more indirectly, like hormones, sugar concentration (Loreti et al., 2008; Zhou et al., 2009) or high light and cold (Lotkowska et al., 2015; Zhang et al., 2015a), usually have dramatic side effects on the plant physiology.

FIGURE 1 | Anthocyanin accumulation in different plant products. (A) Scheme of the biosynthetic pathway for different flavonoid pigments, including anthocyanins. The main enzymes catalyzing the reactions in the pathway are reported in blue. PAL, phenylalanine ammonia-lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate-CoA ligase; CHS, chalcone synthase; CHI, chalcone isomerase; FLS, flavonol synthase; F3H, flavonoid 3 hydroxylase; F3′H, flavonol 3′ hydroxylase; F3′5′H, flavonol 3′,5′ hydroxylase; DFR, dihydroflavonol 4-reductase; LAR, leucoanthocyanidin reductase; ANR, anthocyanidin reductase; ANS, anthocyanidin synthase; 3UFGT, UDP glucose:flavonoid 3-O-glucosyltransferase; RT, rhamnosylation at three; 5UFGT, glucose:flavonoid 5-O-glucosyltransferase; AAT, anthocyanin acyltransferase; MT, methyltransferase; GST, glutathione S-transferase. PAs, proanthocyanidins. In (B) Moondust (up) and Moonshadow (down) transgenic carnations produced by Florigene/Suntory; (C) Petunia ph6 unstable mutant (transposon insertion) in the hybrid W138xR153 background. The red spots and sectors are due to PH6 reversion. In (D) transgenic tomato fruits from plants expressing the 35S:SlANT1 construct. Immature green and red ripe fruits with anthocyanin-rich sectors in the peel; (E) green tomatoes from the same plants as in (D) showing purple flesh, locular cavities and seeds. (F) Orange and purple carrots. In (G), (H), and (I) ancient varieties of Rosaceae species with anthocyanin-rich flesh. These fruits are locally known as: (G) "mela rossa dentro incarnato" (apple variety), (H) "pera cocomerina" (pear variety) and (I) "pesca sanguinella" (peach variety).

The picture of anthocyanin synthesis and regulation gained from studies in flowers was confirmed in several crops where homologous MBW complexes regulate pigment accumulation in different plant parts.

Modern crops are the result of a domestication process that, for most species, went on for the last 10,000 years. Selection sometimes resulted in the loss of pigmentation in some plant parts. Pigmentation in tomato fruits, for example, was probably a trait indirectly counter-selected by breeding, as the fruits of several closely related wild Solanum species are colored. The introgression in domesticated tomato of two loci, Aft (Anthocyanin fruit) and atv (atroviolacea) from wild Solanum, results in the accumulation of anthocyanins in the epidermis and the pericarp of the fruit (Povero et al., 2011), indicating that it is possible to restore fruit pigmentation by adding a few genes. In fact, ectopic expression of either of the R2R3-MYB genes SlAN2 and SlANT1 (Kiferle et al., 2015) is sufficient to get purple tomatoes (**Figures 1D,E**). As SlAN2 and SlANT1 proteins activate the whole biosynthetic pathway and stress can activate SlAN2 transcription, lack of pigmentation in cultivated tomato fruits is not due to mutations in enzyme-encoding genes or to loss of function of one of the two MYBs. Rather, changes in the regulation of the MYBs resulted in their inactivity in fruits. Expression of DELILA and ROSEA1 (respectively a bHLH and a MYB) from snapdragon results in intensely purple tomato fruits which have health-promoting effects in a mouse model (Butelli et al., 2008). High expression of a different type of MYB (MYB12) in tomato stimulates the production of complex mixtures of flavonoids, by reprogramming primary metabolism toward the production of substrates for the phenylpropanoid pathway. The combination of MYB12 and transcription factors specific for the anthocyanin pathway further boosts anthocyanin production (Zhang et al., 2015c).

MYB genes are also responsible for pigmentation in grape berries (Kobayashi et al., 2002; Walker et al., 2007), blood oranges (Butelli et al., 2012), apples and pears (Takos et al., 2006; Ban et al., 2007; Yuan et al., 2014). Some apple genotypes show red flesh and share a single ancestor, the Malus sieversii f. niedzwetzkyana wild apple native to Central Asia (Harris et al., 2002). The expression pattern of MdMYB10 in red flesh apples correlates with anthocyanin gene expression (Espley et al., 2007), and a minisatellite-like structure in its promoter increases MdMYB10 transcription and the accumulation of anthocyanin in leaves, flowers, and fruit cortex (Espley et al., 2009). Max Red Bartlett, a red-skinned European pear variety, occasionally gives green-skinned fruits in which PcMYB10 expression is silenced due to the methylation of two regions in its promoter (Wang et al., 2013).

The purple cauliflower (Brassica oleracea var. botrytis) originates from a spontaneous mutant found in a cauliflower field over 20 years ago (Chiu et al., 2010). This mutation results in upregulation of transcription of the Pr gene encoding a MYB. Purple varieties are also known for carrots (**Figure 1F**), onions and potatoes (De Jong et al., 2004). Several more examples could be added to this list, showing that MYBs are indeed "master regulators" of anthocyanin biosynthesis and their expression pattern determines pigmentation patterns in plants.

Market demand for food with high anthocyanin content led to the rediscovery of pigment-rich varieties, which were nearly forgotten. These ancient varieties of apples, pears and peaches (**Figures 1G–I**) are still poorly studied, but are a priceless source of interesting alleles to be introduced into market varieties.

Selection in agriculture probably favors mutations in MYB genes over mutations in their bHLH and WD40 partners or in structural genes, because they are the least pleiotropic and because gain-of-function mutations are more likely to activate anthocyanin synthesis in new tissues. Strategies for improving anthocyanin production in crops by both breeding and genetic engineering mimic natural selection, acting on MYBs to tune the expression of anthocyanin structural genes.

#### HIGHER PRODUCTION DOES NOT ALWAYS MEAN HIGHER YIELD, AT LEAST FOR ANTHOCYANINS

There are now sufficient tools to improve pigment production and color displayed by fruits and flowers. However, we have little understanding of the role played by degradation of anthocyanins on the total yield in fruits and on color in flowers.

It is often taken for granted that anthocyanins, once accumulated in the vacuole, are stable. However, few studies have described anthocyanin turnover or addressed whether it is due to enzymatic activity, spontaneous reactions or a combination of both (Oren-Shamir, 2009).

Color fading is reported for several species and here we briefly summarize illustrative examples reported in the literature and/or known from everyday life.

In some plants, anthocyanins protect the photosynthetic apparatus from light damage in young leaves, and are lost later in development, enabling more light to enter the tissues (Steyn et al., 2002, 2004; Nissim-Levi et al., 2003). In contrast, apple and pear peels show changes in pigmentation in response to temperature and/or light (**Figures 2D–F**; Steyn et al., 2004, 2009). In blood oranges, on the other hand, anthocyanin content reaches a maximum in the fully ripe fruit and decreases at later stages, when β-D-glucosidase activity increases, leading to the formation of aglycones which are possible substrates for degradation by polyphenol oxidase, abundant in these fruits (Barbagallo et al., 2007). Polyphenol oxidases are also suspected to induce fading together with peroxidases in litchi fruits (Reichel et al., 2011), where an anthocyanin degradation enzyme (ADE) was identified as a vacuolar laccase secreted to the extracellular space at pericarp browning (Fang et al., 2015).

Flowers turned out to be an excellent model to study color fading, which is observed, for instance, in peony (**Figure 2A**), Hibiscus, orchids (Burg and Dijkman, 1967; Zhao et al., 2012; Shimokawa et al., 2015), dahlias (**Figure 2B**) and several Solanum species (**Figure 2C**). In commercial varieties of flowers, fading strongly affects the market value. One of these is aster, where the inhibition of color fading by magnesium is suggested to come from the formation of pigment-metal complexes (Shaked-Sachray et al., 2002). Although similar results were reported for grape cell suspensions (Sinilal et al., 2011), there is no direct evidence for the presence of metalloanthocyanins in these species.

The petals of Brunfelsia calycina, a Solanaceae shrub, also fade from blue to completely white within a few days after flower opening (Vaknin et al., 2005). Protein and mRNA synthesis inhibitors prevent anthocyanin degradation in these petals, suggesting that fading is an active process. Interestingly, cytokinin treatment delays petal senescence but not anthocyanin degradation, suggesting that fading is independent of petal senescence and the accompanying increase in pH. Peroxidase activity correlates in time with anthocyanin degradation and recently, Zipor et al. (2015) characterized a candidate vacuolar peroxidase, BcPrx01, whose transcript and protein levels increase during fading. Furthermore, total protein extracts from brunfelsia petals induce in vitro fading of anthocyanins with different decorations extracted from petunia petals after addition of H2O2, suggesting a mechanism that is not substrate-specific. However, direct evidence that this in vitro reaction mimics the degradation seen in vivo is currently lacking.

The color of anthocyanins is affected by the pH of the vacuolar lumen where they accumulate. A strongly acidic lumen results in red, and a less acidic one in blue. In Petunia, blue flowering mutants define the loci PH1 to PH7 (Koes et al., 2005) which control vacuolar acidification in petals. PH1 and PH5 encode a heteromeric proton pump, transcriptionally controlled by the AN1-PH4-AN11-PH3 complex (a bHLH, a MYB, a WDR and a WRKY transcription factor) sharing components with the MBW complex regulating anthocyanin biosynthesis. Thus, pigment synthesis and vacuolar acidification are controlled by the same regulatory network (Verweij et al., 2008; Faraco et al., 2014). In petunia ph3, ph4 and ph6 mutants that contain the dominant allele of the FADING (FA) locus (de Vlaming et al., 1982, 1983), nearly complete degradation of anthocyanin occurs after flower opening (**Figure 2G**). This process is restricted to the flower limb, while flower tube and pollen maintain their full color. In petunia, color fading is much stronger for highly substituted anthocyanins, such as 3-rutinosido(p-coumaroyl)-5-glucoside anthocyanins, whereas 3-glucosides and 3-rutinosides only weakly fade and anthocyanin methylation has no effect (de Vlaming et al., 1982). As reported above, in vitro, brunfelsia protein extracts equally destabilize differently substituted anthocyanins from petunia (Zipor et al., 2015). This discrepancy might have different explanations: (i) the FA gene product has specificity for highly substituted anthocyanin molecules and the in vitro reaction does not reflect the one in vivo, (ii) the specificity of the fading mechanism in brunfelsia is different from the one in petunia, or (iii) FA activity is dependent on genes that are genetically linked with genes determining the anthocyanin substitution pattern, such as Rhamnosyl Transferase (RT) and Glucosylation at Five (GF; Quattrocchio et al., 2006).

opening. In (B) the same dahlia flower photographed on different days (about 2 weeks elapsed between the first and the last picture). (C) Solanum wrightii flowers photographed on the plant. The young buds and just open flowers are intensely pigmented, while older flowers are totally white, indicating strong fading of the anthocyanin pigments. (D) Color change in red (upper row) and green (bottom row) 'Cripps Pink' apples exposed to moderate light at 10, 20, and 30°C for 6 days. The green apples accumulated anthocyanin at 20°C while the red apples lose anthocyanin at 30°C (Steyn et al., 2004). (E) Bleaching of red color (upper row) at the sun-exposed side of Rosemarie fruits compared to fruits receiving less intense light (bottom row) which maintain more intense pigmentation. (F) Rosemarie pears: one fruit has lost its red color and turned yellow on the tree. This phenomenon is reported for Rosemarie pear fruits that bent over during development, resulting in pinching of the peduncle. (G) Petunia flower photographed at different moments after opening and showing strong color fading. This is a ph4 mutant line in a FADING background accumulating malvidin. (H) Scheme summarizing our present understanding of color fading in plant cells. Similar transcription factor complexes consisting of MYB, bHLH, WD40 and WRKY factors control anthocyanin biosynthesis (through the transcription of the structural genes encoding the enzymes of the pathway) and vacuolar acidification (through the transcription of the two pumps PH1 and PH5). Anthocyanins are sequestered to the vacuolar lumen. When the anthocyanin molecules are highly decorated and a dominant allele of the FADING (FA) gene is present, color fading takes place as a consequence of anthocyanin degradation, probably in the vacuolar lumen. This mechanism is blocked by the activity of the MBW complex, indicating that target genes of these transcription factors might protect anthocyanins from the effect of FA.

Our limited understanding of the mechanism of anthocyanin fading, coming from experiments in brunfelsia and genetic analysis in petunia, is summarized in **Figure 2H**. Decorated anthocyanin molecules are synthesized under the control of the MBW transcription complex and transported to the vacuolar lumen where their color is affected by the pH of the environment. This is determined by the PH1/PH5 pump, whose expression is also regulated by the MBW complex. In the vacuole, peroxidases modulate the concentration of free radicals and hydrogen peroxide, which can affect anthocyanin stability. Under these conditions anthocyanins are relatively stable (also in the presence of the FA allele), as compared to mutants for the MBW complex, which are depleted in expression of all its target genes.

Contrary to what has been suggested elsewhere (Oren-Shamir, 2009), fading in petunia is not merely a change in color due to high vacuolar pH in mutants. Mutations in PH1 and PH5 increase vacuolar pH to the same extent as mutations in the MBW complex, but are not accompanied by color loss (Verweij et al., 2008; Faraco et al., 2014). Fading in ph4, ph3, and ph6 must therefore be due to down-regulation of other target genes of the MBW complex (Quattrocchio et al., 2006) in combination with the presence of a dominant FADING allele (de Vlaming et al., 1982).

The MBW complex controls several genes encoding enzymes of the anthocyanin pathway (Quattrocchio et al., 1998; Spelt et al., 2000), PH1 and PH5 (Verweij et al., 2008; Faraco et al., 2014) and at least 10 others of unknown function (Verweij et al., 2008). Which of these genes protect anthocyanins from the action of FADING can only be speculated. Their characterization via loss- and gain-of-function studies will shed light on this point, and will unravel which cellular mechanism protects anthocyanins from massive degradation.

The occurrence of fading obviously affects the final yield of anthocyanins, diminishing the effect of synthesis improvement achieved by breeding or transgenesis (e.g., by modulation of the expression of MYB regulators). For this reason, the identification of the factors controlling fading of pigments, as well as its inhibition, opens possibilities for further improvement of the content of these compounds in the final plant products.
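The point that yield reflects a balance between synthesis and degradation can be made concrete with a toy calculation. The following minimal Python sketch is an illustration only and is not taken from the studies reviewed here: it assumes a constant synthesis rate and first-order (concentration-proportional) degradation, and the function names, rate values and units are hypothetical.

```python
import math


def steady_state_level(synthesis_rate: float, decay_constant: float) -> float:
    """Pigment level at which synthesis and first-order degradation balance.

    Assumed model: dA/dt = synthesis_rate - decay_constant * A.
    """
    if decay_constant <= 0.0:
        raise ValueError("decay_constant must be positive")
    return synthesis_rate / decay_constant


def level_at(t: float, synthesis_rate: float, decay_constant: float,
             initial: float = 0.0) -> float:
    """Closed-form solution of the assumed model at time t."""
    steady = steady_state_level(synthesis_rate, decay_constant)
    return steady + (initial - steady) * math.exp(-decay_constant * t)


if __name__ == "__main__":
    # Hypothetical rates in arbitrary units.
    print("baseline:", steady_state_level(2.0, 0.5))
    # Doubling synthesis is cancelled out if degradation doubles as well.
    print("2x synthesis, 2x degradation:", steady_state_level(4.0, 1.0))
    # Halving degradation gives the same gain as doubling synthesis.
    print("1x synthesis, 0.5x degradation:", steady_state_level(2.0, 0.25))
```

In this simplified picture, lowering the degradation constant raises the final pigment level as effectively as raising synthesis, which is the rationale for studying fading alongside biosynthesis.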

#### CONCLUSION

Anthocyanin-rich plants produced by traditional breeding or biotechnology could contribute to human health by reducing the incidence of major diseases (Martin et al., 2011), while new flower colors and patterns (Yoshida et al., 2009; Tanaka and Brugliera, 2013; Zhao and Tao, 2015) are interesting for the ornamental market. Success has been achieved in producing plants with enhanced anthocyanin synthesis by increasing the expression of MYB factors that activate transcription of structural anthocyanin genes. However, degradation also contributes to the final anthocyanin yield in plant products, making the understanding of this phenomenon important for future strategies of crop improvement.

Studies in brunfelsia provide insight into the biochemistry of anthocyanin degradation (Zipor et al., 2015).

It is unclear whether a certain degree of anthocyanin degradation is functional to the plant. So far only speculations are possible. Anthocyanins protect tissues from free radicals and in some species accumulate in seedlings where they shield the photosynthetic machinery from light. Their degradation later in development probably improves photosynthesis (Gould et al., 2002b). In brunfelsia, anthocyanin degradation in flowers is accompanied by release of fragrant volatiles and both processes could be signals for pollinators (Zipor et al., 2015). However, no evidence is available for correlations between the two phenomena. Reactive oxygen species (ROS) formed in aging flowers or maturing fruits from photooxidation, photorespiration, and the Mehler reaction could induce anthocyanin degradation and this might protect other cellular components from damage (Hernández et al., 2009). Moreover, anthocyanins inhibit Fenton hydroxyl radical generation by scavenging superoxide and hydrogen peroxide (Gould et al., 2002a, 2010). A better characterization of the genes/factors involved in color fading will answer the many questions we presented here and open the possibility to 'design' plant cells with stable vacuolar content. Mutants make it possible to approach the characterization of the FADING locus and of the MBW target genes involved in anthocyanin stabilization. Considering that anthocyanins are not stable outside the vacuole (Mueller et al., 2000), the MBW complex could control vacuolar physiology and mutants might have vacuolar defects resulting in anthocyanin leakage. Factors involved in both fading and its prevention could function in totally unrelated pathways. Their participation in massive anthocyanin degradation might be a peculiarity of rare genotypes that amplify a moderate pigment loss normally occurring after vacuolar accumulation.

Genetic analyses in species like petunia, where well-defined mutants affecting this phenomenon are available (de Vlaming et al., 1982, 1983; Quattrocchio et al., 2006), open the way to identify the genes that determine anthocyanin turnover in vivo, to assess whether complete disappearance of color is an "accident" originating from human selection during crop domestication, and to gain tools to improve stabilization of anthocyanin (and possibly also other products) in the vacuolar lumen.

#### AUTHOR CONTRIBUTIONS

VP has searched the literature, collected the photographic information and written the manuscript. FQ and RK have conceived the idea and helped with writing the manuscript.

#### FUNDING

VP is supported by a grant of the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Project Number: 824.14.024).

#### ACKNOWLEDGMENTS

The authors are grateful to Dr. Isabella Dalla Ragione, president of Archeologia Arborea Foundation (non-profit organization) for the safeguard and conservation of the vegetal genetic inheritance (Perugia, Italy) and for providing the pictures in **Figures 1G–I**. Pictures in **Figures 2D–F** were kindly provided by Dr. Wiehann Steyn, Department of Horticultural Science, University of Stellenbosch, South Africa. The picture in **Figure 2B** was kindly provided by Vivina Morea.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Passeri, Koes and Quattrocchio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Encapsulation of Hemagglutinin in Protein Bodies Achieves a Stronger Immune Response in Mice than the Soluble Antigen

Anna Hofbauer<sup>1</sup>, Stanislav Melnik<sup>1</sup>, Marc Tschofen<sup>1</sup>, Elsa Arcalis<sup>1</sup>, Hoang T. Phan<sup>2</sup>, Ulrike Gresch<sup>2</sup>, Johannes Lampel<sup>1</sup>, Udo Conrad<sup>2</sup> and Eva Stoger<sup>1</sup>\*

<sup>1</sup> Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences, Vienna, Austria, <sup>2</sup> Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany

Zein is a water-insoluble polymer from maize seeds that has been widely used to produce carrier particles for the delivery of therapeutic molecules. We encapsulated a recombinant model vaccine antigen in newly formed zein bodies in planta by generating a fusion construct comprising the ectodomain of hemagglutinin subtype 5 and the N-terminal part of γ-zein. The chimeric protein was transiently produced in tobacco leaves, and H5-containing protein bodies (PBs) were used to immunize mice. An immune response was achieved in all mice treated with H5-zein, even at low doses. The fusion to zein markedly enhanced the IgG response compared to the soluble H5 control, and the effect was similar to that of a commercial adjuvant. The co-administration of adjuvants with the H5-zein bodies did not enhance the immune response any further, suggesting that the zein portion itself mediates an adjuvant effect. While the zein portion used to induce protein body formation was only weakly immunogenic, our results indicate that zein-induced PBs are promising production and delivery vehicles for subunit vaccines.

Keywords: protein bodies, molecular farming, subcellular targeting, recombinant protein, recombinant vaccine

#### Edited by:

Domenico De Martinis, ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### Reviewed by:

Arun Kumar, GlaxoSmithKline Vaccines, Italy; Florian Krammer, Mount Sinai Hospital, USA

#### \*Correspondence:

Eva Stoger eva.stoger@boku.ac.at

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 15 November 2015 Accepted: 27 January 2016 Published: 16 February 2016

#### Citation:

Hofbauer A, Melnik S, Tschofen M, Arcalis E, Phan HT, Gresch U, Lampel J, Conrad U and Stoger E (2016) The Encapsulation of Hemagglutinin in Protein Bodies Achieves a Stronger Immune Response in Mice than the Soluble Antigen. Front. Plant Sci. 7:142. doi: 10.3389/fpls.2016.00142

#### INTRODUCTION

Polymers are widely used as carrier biomaterials for the delivery of therapeutic molecules (Petros and DeSimone, 2010). In particular, biopolymer-based nanoparticles have proven suitable for clinical applications due to their biocompatibility and biodegradability (Panyam and Labhasetwar, 2003; Nitta and Numata, 2013). A variety of materials and preparation methods have been developed for application-specific properties in terms of particle shape, surface charge, and surface features (Petros and DeSimone, 2010). Among the protein-based biopolymers, those derived from natural proteins such as silk, collagen, elastin, and fibronectin have been studied in detail (Ruszczak and Friess, 2003; Daamen et al., 2007; Lammel et al., 2010; Nitta and Numata, 2013).

Zein, a protein-based polymer found in maize seeds, has been widely used as a carrier because of favorable properties such as biocompatibility, insolubility and low water uptake, mechanical and chemical stability, and its propensity to form coatings and microparticles (Liu et al., 2005; Lai and Guo, 2011; Wang et al., 2011; Lau et al., 2013). Zein is also generally regarded as safe (GRAS) for food use and resists digestion, making it particularly suitable as an encapsulation polymer for oral drugs (Hurtado-Lopez and Murdan, 2006b; Gong et al., 2011; Lau et al., 2013; Zou and Gu, 2013; Ahmed et al., 2015). The intravenous delivery of drug-loaded zein-based microparticles has also been investigated as a means to achieve long-acting effects such as the slow and sustained release of pharmaceutical compounds (Lai and Guo, 2011) and more efficient drug delivery to cancer cells (Lin et al., 2011; Podaralla et al., 2012; Lohcharoenkal et al., 2014). Zein-based microspheres may also provide adjuvant effects when used as vaccine carriers (Hurtado-Lopez and Murdan, 2006a).

The in vitro loading of zein-based microparticles with drugs usually involves spray or freeze drying or liquid–liquid dispersion methods (Zhong and Jin, 2009; Podaralla and Perumal, 2010; Podaralla et al., 2012; Zou and Gu, 2013). These technical processes are expensive and can affect the activity of the encapsulated agent, e.g., the high temperatures required for spray drying are incompatible with many pharmaceutical proteins.

It is therefore appealing to use plants to achieve microencapsulation in vivo by directly incorporating recombinant proteins into naturally occurring protein storage organelles such as zein bodies (Hofbauer and Stoger, 2013). Endogenous protein storage organelles are usually found in plant seeds, and zein-like prolamins are characteristic features of cereal endosperm cells. In production systems based on cereal seeds, the recombinant protein is often targeted to accumulate in prolamin-containing storage organelles that provide a protective environment, even offering some resistance against proteolytic digestion in simulated gastric fluids (Takaiwa et al., 2015).

Instead of using the natural prolamin bodies that are formed in rice, wheat, maize or barley endosperm, it is also possible to fuse the recombinant protein to assembly sequences that induce analogous structures in tissues such as leaves, which usually lack protein storage organelles. This ectopic protein body technology bypasses the longer generation time required to produce cereal seeds while still offering the advantages of natural bioencapsulation. Sequences that share the ability to trigger the formation of ectopic protein bodies (PBs) include those derived from cereal prolamins, synthetic elastin-like peptides (ELPs) and fungal hydrophobins (Floss et al., 2008; Conley et al., 2009; Torrent et al., 2009b; Gutierrez et al., 2013; Shigemitsu et al., 2013).

One of the most widely used assembly sequences comprises the N-terminal part of the mature 27 kD γ-zein protein, a member of the major prolamin-type storage protein family in maize (Shewry and Halford, 2002). Unlike other assembly sequences, it not only induces the formation of PBs but also acts as a retention sequence that stops fusion proteins from leaving the endoplasmic reticulum (ER). Consequently, the induced PBs bud from the ER as distinct round structures, underscoring the intrinsic compartment-forming properties of the zein sequence in the absence of tissue-specific factors (Mainieri et al., 2004; Llop-Tous et al., 2010). The N-terminal sequence of the 27 kD γ-zein protein comprises two cysteine residues downstream of the signal peptide, a repeated proline-rich domain forming an amphipathic helix, and a third section that includes four additional cysteine residues (Geli et al., 1994). Several reports have confirmed that zein-derived sequences induce ectopic PBs when appended to either the N-terminus or the C-terminus of diverse recombinant proteins, including phaseolin (Mainieri et al., 2004), enhanced cyan fluorescent protein (Llop-Tous et al., 2010), xylanase (Llop-Tous et al., 2011), DsRed (Joseph et al., 2012), and the Human papillomavirus E7 protein (Whitehead et al., 2014). Moreover, the ability to induce PBs appears to be almost entirely intrinsic and independent of other host-specific factors, thus allowing the formation of ectopic PBs in fungal, insect and mammalian cells (Torrent et al., 2009a), and the budding of PBs from ectopic membranes such as the plastid envelope, when combined with alternative subcellular targeting strategies (Hofbauer et al., 2014).

Hemagglutinin is an abundant type I integral membrane glycoprotein found on the envelope of influenza viruses and it has been widely used in influenza vaccine development and as a model antigen. The precursor protein HA0 yields two chains, i.e., HA1 (∼36 kDa) and HA2 (∼28 kDa), following cleavage at the motif Q/E-X-R. Infectivity requires both chains to be glycosylated, and also relies on the cleavage of hemagglutinin by a protease at multiple arginine residues (Klenk and Garten, 1994; Hulse et al., 2004). The cleavage products are then covalently linked by a disulfide bond and these HA1/HA2 units form noncovalent homotrimers (Wiley et al., 1977).

A transmembrane domain is found near the C-terminus of HA2. The three-dimensional structure of hemagglutinin reveals two domains: a stem, responsible for membrane anchoring (part of HA1 and all of HA2), and a globular head (only HA1), bearing the sialic acid receptor binding domains (RBDs) (Wilson et al., 1981). In this study we generated a fusion construct comprising the ectodomain of hemagglutinin subtype 5 and the N-terminal part of γ-zein (amino acids 4–93) in order to induce the storage of the recombinant fusion protein inside newly formed PBs. The chimeric protein was transiently produced in tobacco leaves and H5-containing PBs were used to immunize mice. The resulting immune response was compared to that of control groups administered with the soluble H5 antigen, with or without adjuvant.

# MATERIALS AND METHODS

#### Vector Constructs

All cloning steps were carried out using the binary vector pTRA, a derivative of pPAM (GenBank AY027531). The sequence corresponding to the H5 ectodomain (amino acids 17–520) of hemagglutinin from the A/Hatay/2004/(H5N1) influenza strain (GenBank Q5QQ29) was amplified as described (Phan et al., 2014) and a plant codon-optimized signal peptide sequence derived from a murine antibody was added to the N-terminus to direct the protein into the secretory pathway (Vaquero et al., 1999). Amino acids 4–93 of the mature 27 kD γ-zein protein (lacking the signal peptide) were joined to the C-terminus via a (GGGS)<sub>2</sub> linker as previously described for the phaseolin fusion construct zeolin (Mainieri et al., 2004). A His6-tag was added to the C-terminus for detection. The final expression vector "H5-Zein" was produced by transferring this coding sequence to the pTRA vector between the Tobacco etch virus (TEV) 5′-untranslated region and the Cauliflower mosaic virus (CaMV) 35S terminator. The expression construct was thus placed under the control of the CaMV 35S promoter with a duplicated transcriptional enhancer. An analogous construct comprising only the H5 ectodomain, a His6-tag and a C-terminal KDEL sequence was used to produce the soluble H5 antigen.

#### Plant Material

Tobacco (Nicotiana benthamiana) plants were cultivated in soil in a growth chamber with a 16-h photoperiod, 26/16°C day/night temperatures and 70% relative humidity for 2 months.

#### Agroinfiltration of Tobacco Leaves

The expression constructs were transferred by electroporation into competent Agrobacterium tumefaciens (GV3101) cells. The bacteria were kept as a glycerol stock and used to inoculate 5-ml aliquots of YEB medium containing 25 mg/l kanamycin, 25 mg/l rifampicin, and 50 mg/l carbenicillin. The cultures were incubated for 2 days at 28°C, shaking at 180 rpm. Each culture was mixed 1:1 with a culture containing a silencing inhibitor (HcPro) before adjusting with 2x infiltration medium (100 g/l sucrose, 3.6 g/l glucose, 8.6 g/l MS salts, pH 5.6) to an OD<sub>600</sub> of ∼1.0. After adding 200 µM acetosyringone, N. benthamiana leaves were infiltrated using a syringe (for small-scale expression) or vacuum (for large-scale expression). Young plants were completely submerged in the suspension and vacuum was applied for 2 min. The infiltrated leaves were harvested 7 days post-infiltration (DPI).
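The adjustment of the bacterial suspension to a target OD<sub>600</sub> can be illustrated with a simple dilution calculation. The following is only a minimal sketch, assuming that OD<sub>600</sub> scales linearly with cell density (C1·V1 = C2·V2); the function name, the measured OD of 2.5 and the 100 ml final volume are hypothetical values chosen for illustration and are not taken from the protocol. The sketch also ignores the final strength of the infiltration medium.

```python
from typing import Tuple


def dilution_volumes(od_measured: float, od_target: float,
                     final_volume_ml: float) -> Tuple[float, float]:
    """Return (culture_ml, diluent_ml) needed to reach od_target.

    Assumes OD600 is proportional to cell density (C1*V1 = C2*V2),
    which holds only approximately and at moderate densities.
    """
    if od_measured <= 0 or od_target <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if od_target > od_measured:
        raise ValueError("a culture cannot be concentrated by dilution")
    culture_ml = final_volume_ml * od_target / od_measured
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical reading: mixed culture at OD600 = 2.5, 100 ml suspension needed.
culture_ml, diluent_ml = dilution_volumes(2.5, 1.0, 100.0)
print(f"{culture_ml:.1f} ml culture + {diluent_ml:.1f} ml diluent")
```

In practice the OD of the adjusted suspension would be re-measured, since the linear assumption degrades for dense cultures.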

#### Protein Purification

#### Soluble H5: Immobilized Metal Affinity Chromatography (IMAC)

Frozen leaf powder was mixed at a ratio of 1:2 (w/w) with cold lysis buffer (50 mM sodium phosphate buffer, pH 8.0, 300 mM NaCl, 5 mM imidazole, 0.5 mM PMSF) and sonicated briefly to induce further cell lysis. After 2 h, the suspension was centrifuged at 9000 rpm for 20 min and the supernatant was passed through a 1-µm filter. The pH was re-adjusted to 8.0 and the suspension was centrifuged as above. The supernatant was passed through a 0.45-µm filter before mixing with Ni-IDA IMAC resin (BioRad, Munich, Germany). Approximately 2 ml of 50% resin suspension was added per 50 ml supernatant. After incubation for 1 h, the resin was loaded onto a column and washed with eight volumes of wash buffer (50 mM sodium phosphate buffer, pH 8.0, 300 mM NaCl, 10 mM imidazole). The protein was eluted with elution buffer (50 mM sodium phosphate buffer, pH 8.0, 300 mM NaCl, 250 mM imidazole). The amount of protein in each fraction was determined using the Bradford assay before immunoblot analysis.

#### Protein Bodies (Density Gradient Centrifugation)

The frozen leaf powder was mixed 1:1 (w/v) with extraction buffer (10 mM Tris-HCl, 0.4 M sucrose, pH 7.5) and incubated overnight at 4°C with constant shaking. The homogenate was passed through two layers of miracloth to remove solid debris and then loaded on a discontinuous sucrose gradient (3, 2.5, 2, 1.5, and 1 M in 10 mM Tris-HCl, pH 7.5). This preparation was centrifuged at 30,000 rpm and 4°C for 3 h in a Beckman ultracentrifuge (SW 41 Ti or SW 32 Ti rotor). After separation, 500-µl fractions were collected for analysis by SDS-PAGE and immunoblotting.
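Because centrifugation speeds are given here in rpm, reproducing the runs on a different rotor requires converting to relative centrifugal force (RCF). A minimal sketch of the standard conversion, RCF = 1.118 × 10<sup>−5</sup> × r (cm) × rpm<sup>2</sup>, is shown below. The 15 cm radius is a placeholder assumption, not a value from this study; the actual radius should be taken from the rotor manual.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (in multiples of g).

    Standard conversion: RCF = 1.118e-5 * radius[cm] * rpm**2.
    """
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius positive")
    return 1.118e-5 * radius_cm * rpm ** 2


# Placeholder radius of 15 cm (assumption); 30,000 rpm as used for the gradient.
print(f"{rcf_from_rpm(30_000, 15.0):,.0f} x g")
```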

#### Protein PAGE and Immunoblot Analysis

Infiltrated leaves (7 DPI) were harvested and ground in liquid nitrogen to a fine powder. We then extracted 60 mg of leaf powder in 200 µl buffer K (62.5 mM Tris, pH 7.4, 10% glycine, 5% 2-mercaptoethanol, 2% SDS, 8 M urea). Ten microliters of the extract were mixed with loading buffer and boiled for 10 min before loading. Samples collected from the density gradient and IMAC procedures were mixed with 5x loading buffer, boiled at 100°C for 10 min and separated by reducing SDS-PAGE (12% polyacrylamide gel, 200 V for 90 min). Gels were stained with Coomassie Blue or transferred to a nitrocellulose membrane. The membrane was blocked with 5% (w/v) skimmed milk in phosphate buffered saline (PBS) for 1 h and then incubated with a mouse anti-poly-histidine antibody (Sigma-Aldrich Chemie GmbH, Germany), diluted 1:10000, at room temperature for 2 h. The blot was washed three times in PBS plus 0.05% Tween-20 (PBST) and then incubated for 1 h with the secondary anti-mouse alkaline phosphatase-conjugated antibody (diluted 1:5000). The membrane was washed another three times with PBST and the signal was detected using the NBT/BCIP system. For quantitation, the samples were compared to serial dilutions of a His6-tagged standard protein, and the images were analyzed using BioRad Image Lab v5.1.
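Quantitation against a dilution series of a standard protein amounts to fitting a standard curve and reading the sample off it. The sketch below illustrates the idea with an ordinary least-squares line; it is not the algorithm used by Image Lab, and the function names and all intensity values are hypothetical. It assumes that band intensity is linear in the amount loaded over the range of the standards.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard amounts must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_amount(intensity: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the amount in a sample band."""
    return (intensity - intercept) / slope


# Hypothetical standard series: ng of His6-tagged standard vs. band intensity.
standard_ng = [12.5, 25.0, 50.0, 100.0]
standard_intensity = [1000.0, 2000.0, 4000.0, 8000.0]
slope, intercept = fit_line(standard_ng, standard_intensity)
print(f"{estimate_amount(3000.0, slope, intercept):.1f} ng in the sample band")
```

Estimates are only meaningful for sample bands whose intensity falls within the range spanned by the standards.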

#### Fluorescence Microscopy

Infiltrated leaves were cut into small pieces with a razor blade and fixed in 4% (w/v) paraformaldehyde plus 0.5% (v/v) glutaraldehyde in 0.1 M phosphate buffer (pH 7.4) at 4°C overnight. For immunolocalization by confocal microscopy, vibratome sections were mounted on a glass slide, blocked with 5% (w/v) bovine serum albumin (BSA) in 0.1 M phosphate buffer (pH 7.4) and incubated with a polyclonal antibody against 27 kD γ-zein. The samples were then incubated with an Alexa Fluor® 488-conjugated secondary antibody and observed under a Leica SP5 confocal laser scanning microscope (CLSM).

#### Immunization of Mice

Male BL6 C57/Bacl6J mice (6–8 weeks old) obtained from Charles River Laboratories, Research Models and Services were assigned to seven groups (n = 10). Immunization was carried out by the subcutaneous injection of 150 or 300 ng of H5-zein either with or without Freund's adjuvant (1:1) (Difco Laboratories, Detroit, Michigan). For the primary immunization, complete Freund's adjuvant was used where indicated. Booster immunizations consisted of two additional injections of the same antigen (with or without incomplete Freund's adjuvant). As controls, one group received PBS with Freund's adjuvant only, and two groups were injected three times with soluble H5 (15 µg), with or without adjuvant. After the third immunization, the mice were retro-orbitally bled and serum samples were collected for individual testing. A second set of blood samples was taken 8 weeks after the primary immunization and the mice were sacrificed immediately afterward.

The animal experiments were approved by the Landesverwaltungsamt Sachsen-Anhalt, Halle/Saale, Referat Verbraucherschutz, Veterinärangelegenheiten and by the Landkreis Harz, Amt für Veterinärwesen und Lebensmittelüberwachung, Halberstadt. All animals received humane care according to the requirements of the German Animal Welfare Act, §8 Abs. 1.

#### IgG Quantitation by ELISA

The wells of a flat-bottom microtiter plate were coated with 0.2 µg per well of the antigen [purified recombinant H5 or zein (Sigma–Aldrich Chemie GmbH, Germany)]. Then 100 µl of diluted serum (1:250 in PBS with 3% BSA) was added to each well and incubated at room temperature for 1.5 h. Rabbit anti-mouse IgG alkaline phosphatase-conjugated antibody (diluted 1:2000 in PBST) was used for detection. The IgG titer was determined by adding immune serum samples as serial dilutions starting at 1:1000. Curve fitting by five-parameter logistic regression was used to calculate the endpoint titer for each mouse. End-point titers were determined as the reciprocal highest serum dilutions that produced mean optical density values two-fold greater than the geometric mean of those from the negative control (injected with PBS) sera. The statistical significance was determined using Student's t-test (\*\*p < 0.01; \*p < 0.1).

Selected ELISA experiments were carried out using recombinant H5 purified via an additional size exclusion chromatography step. The same results were obtained, indicating that the H5 preparation purified via IMAC was sufficiently pure for coating, and did not contain significant amounts of immunoreactive impurities.
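The end-point titer rule described in this section can be illustrated as follows. This is a simplified sketch, not the analysis used in the study: it skips the five-parameter logistic fit and applies the cutoff directly to the measured dilution steps, and it interprets "two-fold greater" as an OD of at least twice the geometric mean of the negative-control readings. The function name and all OD values are hypothetical.

```python
from statistics import geometric_mean
from typing import Optional, Sequence


def endpoint_titer(reciprocal_dilutions: Sequence[float],
                   mean_ods: Sequence[float],
                   negative_ods: Sequence[float],
                   fold: float = 2.0) -> Optional[float]:
    """Reciprocal of the highest dilution whose mean OD reaches the cutoff.

    The cutoff is `fold` times the geometric mean of the negative-control
    ODs. Returns None when no dilution reaches the cutoff.
    """
    if len(reciprocal_dilutions) != len(mean_ods):
        raise ValueError("dilutions and OD values must be paired")
    cutoff = fold * geometric_mean(negative_ods)
    positive = [d for d, od in zip(reciprocal_dilutions, mean_ods)
                if od >= cutoff]
    return max(positive) if positive else None


# Hypothetical two-fold series starting at 1:1000, as in the protocol.
dilutions = [1000, 2000, 4000, 8000, 16000, 32000]
ods = [1.20, 0.80, 0.40, 0.15, 0.08, 0.05]
negative_controls = [0.04, 0.05, 0.0625]  # geometric mean is 0.05
print(endpoint_titer(dilutions, ods, negative_controls))
```

With a fitted curve, the titer would instead be interpolated at the dilution where the curve crosses the cutoff, which can fall between two measured steps.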

#### Hemagglutination Inhibition (HI) Test

HI tests were carried out as described by Phan et al. (2013). Briefly, a 25-µl aliquot of murine serum was mixed with 25 µl PBS and added to the first well of a V-bottom microtiter plate. Two-fold serial dilutions were prepared across the row of 12 wells. Aliquots (25 µl) containing 4 HAU of inactivated virus [A/swan/Germany/R65/2006(H5N1)] were added to each well and incubated for 30 min at room temperature. We then pipetted 25 µl of a 1% red blood cell (RBC) suspension into each well and the plate was again incubated for 30 min at room temperature. The HI titer was defined as the reciprocal of the highest serum dilution that achieved the complete inhibition of hemagglutination.
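The titer definition above can be expressed as a short routine. The sketch below assumes that the first well corresponds to a 1:2 serum dilution (25 µl serum plus 25 µl PBS) and that the volumes of virus and RBCs added later are not counted towards the dilution; both are assumptions about the reading convention, and the function name is hypothetical. Only the unbroken run of inhibited wells from the start of the row is counted.

```python
from typing import Optional, Sequence


def hi_titer(inhibited: Sequence[bool], first_dilution: int = 2) -> Optional[int]:
    """Reciprocal of the highest dilution with complete inhibition.

    inhibited[i] is True when well i of a two-fold series shows complete
    inhibition of hemagglutination. Returns None when the first well is
    not inhibited (titer below the detection limit).
    """
    titer = None
    dilution = first_dilution
    for well_inhibited in inhibited:
        if not well_inhibited:
            break
        titer = dilution
        dilution *= 2
    return titer


# Hypothetical row of 12 wells: inhibition in the first four wells only.
row = [True] * 4 + [False] * 8
print(hi_titer(row))  # 16
```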

#### RESULTS

#### Hemagglutinin-Zein Fusions form PBs in N. benthamiana Leaves

Expression vector "H5-Zein" containing the sequence corresponding to the H5 ectodomain of hemagglutinin, fused to amino acids 4–93 of the mature 27 kD γ-zein protein, was introduced into N. benthamiana leaves by agroinfiltration. Immunoblot analysis of extracts from the infiltrated leaves 7 DPI revealed the presence of a band corresponding to the fusion protein (**Figure 1A**). The higher than predicted molecular mass probably reflected the glycosylation of H5, as previously reported (Phan et al., 2014). The fusion of H5 to zein resulted in the formation of PBs, whose presence was confirmed by immunofluorescence microscopy. All labeling was concentrated in the PBs, whereas no signal was detected in the ER lumen or in the apoplast (**Figure 2**). This result was confirmed by the density step gradient centrifugation of leaf homogenates (**Figure 1B**). The gradient fractions were collected and tested by immunoblot analysis. No recombinant fusion protein was found in fractions from the top of the gradient, a small amount was present in the highest density fractions, and the majority of the fusion protein was found in the pellet (**Figure 1B**), similar to results reported with another zein fusion protein that forms PBs (Whitehead et al., 2014). Sucrose was removed by pooling the selected fractions and resuspending them in 10 mM Tris (pH 7.5) before centrifuging them under the same conditions as above. The supernatant was removed and the pellet was re-suspended in sterile PBS. The protein suspension was stored at –20°C prior to the immunization experiments. A total of ∼120 µg H5-zein was recovered from 300 g of fresh infiltrated leaves. Soluble H5 lacking the zein fusion was expressed as a control and purified by IMAC as previously described (Phan et al., 2013). We recovered 1 mg of H5 from 500 g of fresh infiltrated leaves, and this was used as a positive control for immunization (**Figure 1C**).
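To put the two recoveries on a common scale, they can be expressed per gram of fresh leaf weight. The short calculation below is a sketch using only the figures reported above together with the 300 ng dose from the Materials and Methods; the function name is hypothetical, and the dose count ignores any losses during handling.

```python
def yield_ug_per_g(protein_ug: float, fresh_weight_g: float) -> float:
    """Recovered protein per gram of fresh leaf weight (µg/g)."""
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    return protein_ug / fresh_weight_g


h5_zein_yield = yield_ug_per_g(120.0, 300.0)      # ~120 µg from 300 g leaves
soluble_h5_yield = yield_ug_per_g(1000.0, 500.0)  # 1 mg from 500 g leaves

# Number of 300 ng doses covered by 120 µg (= 120,000 ng) of H5-zein.
doses_at_300_ng = 120_000 // 300

print(f"H5-zein:    {h5_zein_yield:.1f} µg/g fresh weight")
print(f"Soluble H5: {soluble_h5_yield:.1f} µg/g fresh weight")
print(f"300 ng doses from the H5-zein batch: {doses_at_300_ng}")
```

On these figures the recovered yield of soluble H5 is about five times that of H5-zein per gram of leaf material, while the low dose required means that the H5-zein batch still covers several hundred injections.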

#### H5-Zein PBs Elicit an Immune Response in Mice

The H5-zein protein body suspension and the soluble H5 antigen were each used to immunize mice (**Figure 3**). The mice were allocated to seven groups (n = 10 per group) and immunization was carried out by the subcutaneous injection of 150 or 300 ng of H5-zein either with or without Freund's adjuvant. This low dosage of H5-zein bodies was chosen to confirm the hypothesis that particulate antigens are effective in small amounts. As controls, two groups were injected three times with soluble H5 (15 µg, a dose previously confirmed to provoke a strong humoral immune response), one with and one without adjuvant.

[Figure caption fragment] … autofluorescence of the chloroplasts. The right panel shows the overlay pictures. Abbreviations: v, vacuole. Bars = 10 µm.

The plant-derived H5-zein protein body suspension was shown to elicit an IgG response in 100% of the animals, even at low doses. Interestingly, the addition of an adjuvant to the H5-zein bodies did not cause a significantly stronger immune response (**Figure 3**, groups 1 vs. 3 and 2 vs. 4), whereas the adjuvant had a significant impact in the control groups receiving soluble H5 (**Figure 3**, groups 5 vs. 6).

#### The Zein Fusion Component is Only Weakly Immunogenic but has a Significant Adjuvant Activity

To confirm the observations summarized above, we carried out an in-depth comparison of IgG titers of groups 3, 5, and 6. The IgG response elicited by H5-zein without adjuvant (group 3) was comparable to that achieved by injecting soluble H5 combined with an adjuvant (group 5). In contrast, the administration of soluble H5 without adjuvant elicited a minimal IgG response (**Figure 4A**). This suggested that the zein component and/or the particulate nature of the protein body act as an adjuvant.

To determine whether the zein portion fused to H5 has intrinsic immunogenic properties, we investigated the IgG response directed against zein by comparing the IgG response in groups 1 (H5-zein with adjuvant), 3 (H5-zein without adjuvant), and 5 (soluble H5 with adjuvant). Although we generally detected low IgG titers, 30% of the animals in groups 1 and 3 showed a clearly detectable IgG response against zein (**Figure 4B**). Overall the immune response in the treatment groups was not significantly different to that of the control group (p > 0.1).

FIGURE 3 | Immunization timeline and IgG responses in the seven treatment groups. (A) All mice were injected with a primary dose of H5-zein (6 µg/ml) or H5 (0.6 mg/ml), both in sterile PBS, with or without Freund's complete adjuvant. For the second and third injections, the adjuvant was switched to Freund's incomplete adjuvant. (B) ELISA analysis of the anti-H5 IgG response following the third immunization. A single dot represents the ELISA result from a single serum sample. Each treatment group comprised 10 mice.

#### HI Antibody Titers are Insignificant

All mice vaccinated with H5-zein PBs showed an immunological response, so we carried out HI tests on the serum from each mouse to determine whether the induced antibodies were potentially capable of neutralizing the virus. Because the A/Hatay/2004(H5N1) virus was not available in an inactivated form, the heterologous inactivated virus strain A/swan/Germany/R65/2006(H5N1) was used for the HI assay. The deduced hemagglutinin amino acid sequences of the two strains are 96% similar, and it was previously shown that HI titers against inactivated virus A/swan/Germany/R65/2006(H5N1) could be measured in sera from mice vaccinated with trimeric HA derived from the HA sequence corresponding to the A/Hatay/2004(H5N1) virus (Phan et al., 2013). The HI assay results indicated that the HI antibody titers were either below or marginally above the detection limit in all treatment groups (**Table 1**).



#### DISCUSSION

The expression of recombinant proteins in plants is an attractive strategy reflecting the versatility, safety, scalability, and economy of plant-based production platforms (Rybicki et al., 2013; Stoger et al., 2014). Plants also offer the possibility to accumulate recombinant pharmaceutical proteins within endogenous or ectopic protein storage organelles, which can either be derived directly from the ER or represent protein storage vacuoles (Khan et al., 2012). Here, we successfully induced the formation of ectopic PBs by fusing the H5 ectodomain of hemagglutinin to the N-terminal sequence of γ-zein. Previous studies have shown that the biogenesis of PBs by zein is influenced by the fusion partner, and that not all fusion proteins support the efficient formation of PBs. For example, phaseolin induces the efficient formation of zeolin PBs when fused to the N-terminal sequence of γ-zein (Mainieri et al., 2004). However, PBs were not formed when the Human immunodeficiency virus Nef antigen was fused to the same γ-zein sequence, but protein body formation was possible again when Nef was fused to the entire chimeric protein zeolin (de Virgilio et al., 2008).

The induction of H5-containing protein aggregates in plants has also been achieved by fusing the antigen to hydrophobin and ELPs (Phan et al., 2014). The H5-ELP PBs were approximately 800 nm in diameter whereas the H5-hydrophobin PBs were substantially smaller, with an average diameter of 250 nm. The H5-zein PBs reported herein were larger, with a diameter of 1–2 µm, which is similar to the average size of endogenous zein bodies found in maize endosperm (Lending and Larkins, 1989). In contrast to the H5-ELP and H5-hydrophobin PBs, H5-zein formed high-density structures that were insoluble in non-reducing buffers, whereas H5-ELP and H5-hydrophobin fusion proteins could be extracted in 50 mM Tris, pH 8.0 (Phan et al., 2014).

The H5-zein bodies described herein were used as a delivery vehicle for a model vaccine antigen. IgG responses were elicited in all mice immunized with H5-zein but HI assays indicated the absence of neutralizing antibodies. Our results agree with previous parenteral immunization studies using monomeric hemagglutinin fused to ELP, which, in contrast to trimeric hemagglutinin, also did not induce neutralizing antibodies (Phan et al., 2013). Although the formation of PBs involves multiple cross-linking via intermolecular disulfide bonds, this type of multimerization may not be sufficient to support the specific oligomerization state that appears to be required for the formation of specific native epitopes that may confer a seroprotective immune response. Proper trimerization may be required to complete the folding of hemagglutinin monomers and to induce conformational effects necessary for full antigenicity and the induction of neutralizing antibodies (Magadan et al., 2013). The introduction of a trimerization signal in addition to the assembly sequence may therefore be beneficial, as reported for H5-ELP fusions (Phan et al., 2013).

One remarkable outcome of our experiments was that a comparable immune response was elicited in all mice despite the H5-zein concentration being 100 times lower than the concentration of soluble H5 in the control group, which was administered with a strong adjuvant. Interestingly, the administration of an adjuvant together with the H5-zein bodies did not promote a stronger immune response, suggesting that the addition of the zein portion itself mediates an adjuvant effect. This agrees with Whitehead et al. (2014), who recently reported that the immunogenicity of a recombinant antigen was increased in the presence of Zera®, an assembly sequence that is very similar to the N-terminal part of γ-zein (Torrent et al., 2009b), and that the immunogenicity of the fusion protein could not be enhanced further by the inclusion of Freund's adjuvant. Similarly, the injection of synthetic zein microspheres that were loaded with ovalbumin resulted in higher IgG responses than the 'free' soluble protein (Hurtado-Lopez and Murdan, 2006a). This strongly supports the hypothesis that the zein N-terminal portion possesses intrinsic adjuvant activity, although we cannot exclude the possibility that the observed adjuvant effect was mediated by another component of the PBs. Joseph et al. (2012) reported that zein-induced PBs isolated from leaves contain additional proteins that are trapped during biogenesis.
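As a quick check of the ratio quoted in this paragraph, the stock concentrations given in the Figure 3 caption and the injected amounts given in the Materials and Methods can be compared directly. This is only a worked calculation from those reported values; it suggests that the 100-fold figure corresponds to the stock concentrations and to the 150 ng dose, whereas the 300 ng dose is 50-fold lower than the soluble H5 dose.

$$
\frac{0.6\ \text{mg/ml}}{6\ \mu\text{g/ml}} = \frac{600\ \mu\text{g/ml}}{6\ \mu\text{g/ml}} = 100,
\qquad
\frac{15\ \mu\text{g}}{150\ \text{ng}} = \frac{15000\ \text{ng}}{150\ \text{ng}} = 100,
\qquad
\frac{15\ \mu\text{g}}{300\ \text{ng}} = \frac{15000\ \text{ng}}{300\ \text{ng}} = 50
$$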

An adjuvant effect conferred by a polymer-forming protein domain is not unexpected, given its similarity to the strategy of attaching a carrier protein such as albumin or keyhole limpet hemocyanin to antigens with poor immunogenicity (Harris and Markl, 1999). By definition, an adjuvant is characterized by its ability to enhance the immunogenic efficacy of antigens in the same formulation. This can be achieved by increasing the halflife of an antigen, improving antigen delivery to its effector sites, or providing immunostimulatory signals to enhance the immune response. The observed adjuvant effect of zein particles may reflect one or more of several relevant properties. First, hydrophobic synthetic block copolymers have been shown to confer stronger adjuvant properties than hydrophilic polymers (Newman et al., 1998; Hunter, 2002). Accordingly, the N-terminal part of γ-zein is partially hydrophobic, favoring intermolecular and membrane interactions (Kogan et al., 2002). It has also been reported that particulate antigens are transported more efficiently to murine splenic follicular dendritic cells in vivo in the absence of prior immunity, making them more immunogenic than soluble antigens (Link et al., 2012). This may also be reflected by the superior immunogenicity of hemagglutinin-containing virus-like particles compared to soluble hemagglutinin, even in the absence of an adjuvant (Shoji et al., 2015). Also, repetitive antigen display, structural, or molecular mimicry of the virus, particle-size dependent tissue penetration and trafficking to lymphatics and Toll-like receptor activation are possible mechanisms. In repetitive antigen display the spatial organization of the antigens on the particle surface facilitates B-cell receptor (antibody) co-aggregation, triggering and activation. This can support the production of longlived high-affinity neutralizing antibodies (Smith et al., 2013). 
Plant-derived PBs might also provide a specific spatial antigen organization favoring successful repetitive antigen display. Alternatively, increased half-life and stability of the zein fusions in the serum in vivo might be responsible for the enhanced immune response. The half-life of the antigen is likely to be extended due to encapsulation in the protein body. Indeed, pharmaceutical preparations encapsulated in zein particles in vitro remained in the blood for at least 24 h following intravenous delivery (Lai and Guo, 2011). Interestingly, the fusion of hemagglutinin to ELP repeats did not seem to increase immunogenicity, although the propensity to form protein aggregates was confirmed in planta (Phan et al., 2013, 2014).

Zein has several favorable general characteristics as an adjuvant: it is stable at ambient temperatures and yet biodegradable, encouraging its use as a biopolymer for the coating and encapsulation of recombinant proteins such as erythropoietin (Bernstein et al., 1993). However, the potential immunogenic properties of zein must be taken into account (Hurtado-Lopez and Murdan, 2006a; Whitehead et al., 2014). We detected an immune response directed against γ-zein, although the response was much weaker than that directed against H5, and when compared to the control group administered with H5 alone, the difference between the groups was not statistically significant. However, 30% of the mice injected with H5-zein showed an immune response above background levels. A significant immune response against the zein-like sequence Zera has been reported (Whitehead et al., 2014), warranting further studies to assess the suitability of zein bodies as drug delivery vehicles for parenteral administration. Animal studies involving the injection of zein-coated erythropoietin (Bernstein et al., 1993) and ivermectin (Gong et al., 2011) did not indicate any adverse effects. Other storage proteins, including the wheat storage protein gliadin, have also been used as coatings to prepare various proteins and pharmaceuticals in vitro. GliSODin<sup>®</sup>, for example, is an oral treatment for oxidative stress in which superoxide dismutase (SOD) is coated with gliadin (Cloarec et al., 2007). Although gliadin protects SOD from digestion, this storage protein is also linked to the autoimmune disorder celiac disease (Chaptal, 1957). Even so, this product has received market approval for human use.

#### CONCLUSION


Zein and similar plant storage proteins have long been investigated as carriers for pharmaceuticals including recombinant proteins. The direct encapsulation of pharmaceutical proteins in the production host is a simple approach that is less expensive than the production of synthetic microparticles. Our case study using a model vaccine antigen indicates that zein-induced PBs can be used as vaccine delivery vehicles that benefit from a value-added adjuvant effect, whereas the intrinsic immunogenicity of the zein component is low. The insertion of a trimerization signal fused to H5 will be tested to determine whether this leads to the assembly of structures that can elicit neutralizing antibodies against H5. It will also be of value to develop the in planta protein body encapsulation strategy for the production and delivery of further antigens, including candidates intended for oral application.

#### AUTHOR CONTRIBUTIONS

AH designed and carried out experiments, analyzed data, and wrote the manuscript. SM designed and carried out experiments and analyzed data. MT and EA carried out experiments, analyzed data, and contributed to the manuscript. HP, UG, and JL carried out experiments and analyzed data. UC and ES designed the study, analyzed data, and wrote the manuscript.

#### ACKNOWLEDGMENTS

The authors would like to acknowledge financial support by the Austrian Science Fund FWF (W1224 and I1461-B16). We are grateful to Dr. Jutta Veits, Friedrich-Loeffler-Institut, Greifswald-Insel Riems, for providing the inactivated virus.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Hofbauer, Melnik, Tschofen, Arcalis, Phan, Gresch, Lampel, Conrad and Stoger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Functional Characterization of 4′OMT and 7OMT Genes in BIA Biosynthesis

Tugba Gurkok<sup>1</sup>, Esma Ozhuner<sup>2</sup>, Iskender Parmaksiz<sup>3</sup>, Sebahattin Özcan<sup>4</sup>, Mine Turktas<sup>2</sup>, Arif İpek<sup>2</sup>, Ibrahim Demirtas<sup>5</sup>, Sezer Okay<sup>2</sup> and Turgay Unver<sup>2</sup>\*

<sup>1</sup> Eldivan SHMYO, Department of Anesthesia, Cankiri Karatekin University, Cankiri, Turkey, <sup>2</sup> Department of Biology, Faculty of Science, Cankiri Karatekin University, Cankiri, Turkey, <sup>3</sup> Department of Molecular Biology and Genetics, Faculty of Science, Gaziosmanpasa University, Tokat, Turkey, <sup>4</sup> Department of Field Crops, Faculty of Agriculture, Ankara University, Ankara, Turkey, <sup>5</sup> Department of Chemistry, Faculty of Science, Cankiri Karatekin University, Cankiri, Turkey

Alkaloids are a diverse group of secondary metabolites generally found in plants. Opium poppy (Papaver somniferum L.), the only commercial source of morphinan alkaloids, has been used as a medicinal plant since ancient times. It produces benzylisoquinoline alkaloids (BIA) including the narcotic analgesic morphine, the muscle relaxant papaverine, and the anti-cancer agent noscapine. Though BIAs play crucial roles in many biological mechanisms, the steps of their biosynthesis and the responsible genes remain to be fully revealed. In this study, expression of the 3-hydroxy-N-methylcoclaurine 4′-O-methyltransferase (4′OMT) and reticuline 7-O-methyltransferase (7OMT) genes was manipulated to functionally characterize their roles in BIA biosynthesis. Alkaloid accumulation was measured in leaf, stem, and capsule tissues. Suppression of 4′OMT expression reduced the total alkaloid content in stem tissue, whereas total alkaloid content was significantly induced in the capsule. Silencing of the 7OMT gene also reduced total alkaloid content in the stem. On the other hand, over-expression of 4′OMT and 7OMT resulted in higher morphine accumulation in the stem but a lower amount in the capsule. Moreover, differential expression of several BIA synthesis genes (CNMT, TYDC, 6OMT, SAT, COR, 4′OMT, and 7OMT) was observed upon manipulation of 4′OMT and 7OMT expression. Upon silencing and overexpression, tissue-specific effects of these genes were identified. Manipulation of the 4′OMT and 7OMT genes caused differential accumulation of BIAs, including morphine and noscapine, in capsule and stem tissues.

Keywords: metabolic engineering, morphine, noscapine, overexpression, stem tissue, VIGS

#### Edited by:

Domenico De Martinis, ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### Reviewed by:

Taras P. Pasternak, University of Freiburg, Germany; Gabriel Dorado, University of Córdoba, Spain

> \*Correspondence: Turgay Unver turgayunver@gmail.com

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 02 November 2015 Accepted: 18 January 2016 Published: 16 February 2016

#### Citation:

Gurkok T, Ozhuner E, Parmaksiz I, Özcan S, Turktas M, İpek A, Demirtas I, Okay S and Unver T (2016) Functional Characterization of 4′OMT and 7OMT Genes in BIA Biosynthesis. Front. Plant Sci. 7:98. doi: 10.3389/fpls.2016.00098

# INTRODUCTION

Most plants synthesize different kinds of natural products with commercial value, such as secondary metabolites, in response to various environmental or developmental factors. Alkaloids, one class of secondary metabolites, are classified into several groups including the benzylisoquinoline alkaloids (BIA), which are commonly used for pharmaceutical purposes (Winzer et al., 2012). The opium poppy (Papaver somniferum L.) belongs to the Papaveraceae family and has long been used as a medicine or drug (Schiff, 2002). It produces a number of BIAs including the narcotic analgesic morphine, the cough suppressant codeine, the muscle relaxant papaverine, and the anti-microbial sanguinarine (Allen et al., 2004; Desgagné-Penix et al., 2010; Gurkok et al., 2015). Because BIAs play roles in many biological functions in both plants and animals, their biosynthesis and regulation have attracted the interest of researchers.

To date, several genes involved in BIA biosynthesis in opium poppy have been cloned and characterized. BIA biosynthesis begins with the condensation of two L-tyrosine derivatives, 4′-hydroxyphenylacetaldehyde (4′HPAA) and dopamine, whose formation involves tyrosine/DOPA decarboxylase (TYDC), to generate (S)-norcoclaurine (Facchini and De Luca, 1994; Lee and Facchini, 2011). Different types of BIA, such as protoberberine and morphinan alkaloids, share the early common steps of the biosynthetic pathway. (S)-Reticuline is the central intermediate at which the opium poppy BIA pathway branches, and its formation requires a series of enzymes including norcoclaurine synthase (NCS; Lee and Facchini, 2010), norcoclaurine 6-O-methyltransferase (6OMT; Morishige et al., 2000), coclaurine N-methyltransferase (CNMT; Choi et al., 2002), and 3-hydroxy-N-methylcoclaurine 4′-O-methyltransferase (4′OMT; Morishige et al., 2000). Although (R,S)-reticuline 7-O-methyltransferase (7OMT) converts (S)-reticuline to (S)-laudanine (Ounaroon et al., 2003), the morphinan branch requires the epimerization of (S)-reticuline to (R)-reticuline (De-Eknamkul and Zenk, 1992). (R)-Reticuline is converted by salutaridine synthase (SalSyn) into salutaridine, which is then reduced by salutaridine reductase (SalR) to yield salutaridinol (Ziegler et al., 2006). The following step is the conversion of salutaridinol to salutaridinol-7-O-acetate by the enzyme salutaridinol-7-O-acetyltransferase (SAT; Grothe et al., 2001). In the last steps of morphine biosynthesis, thebaine is converted to morphine via codeine or oripavine by the enzymes thebaine 6-O-demethylase (T6ODM), codeinone reductase (COR), and codeine O-demethylase (CODM; Unterlinner et al., 1999; Hagel and Facchini, 2010) (**Figure 1**). Although many enzymes have been cloned, BIA biosynthesis and its regulation are not yet fully understood.

Metabolic engineering is an approach that alters metabolic pathways and/or metabolite production via gene transfer technology. Biotechnological manipulation of a plant's secondary metabolism pathways, such as gene silencing or overexpression, aids in the discovery of gene functions as well as in increasing the accumulation of high-value plant natural products (Nagamatsu et al., 2007; Purkayastha and Dasgupta, 2009; Bedon et al., 2010). Virus-induced gene silencing (VIGS) is an emerging technique for studying gene function. Because plants induce homology-dependent defense mechanisms in response to viral attack, VIGS is advantageous over other silencing approaches (Hileman et al., 2005). VIGS, which leads to fast and transient suppression of gene expression, involves cloning a short sequence fragment of the target gene to be silenced (Unver and Budak, 2009). Tobacco rattle virus (TRV) has been engineered as a silencing vector to silence target genes in intended host plants, including opium poppy (Liu et al., 2002). TRV-mediated VIGS can be applied in different plant tissues to silence metabolic pathway genes. For example, to investigate the regulation of morphine biosynthesis, the genes responsible for the last six steps of morphine synthesis were systematically silenced via VIGS. Silencing of the SalSyn, SalR, T6ODM, and CODM genes reduced morphine content, whereas salutaridine, thebaine, and codeine accumulation was markedly induced (Wijekoon and Facchini, 2012). Furthermore, several genes including 4′OMT and 7OMT were silenced to elucidate papaverine biosynthesis in papaverine-rich plants (Desgagné-Penix and Facchini, 2012). Winzer et al. (2012) carried out VIGS assays to reveal the genes taking part in noscapine biosynthesis. In that study, the accumulation of pathway intermediates led to the identification of a new gene functioning in noscapine biosynthesis as well as a novel pathway.
Additionally, Allen et al. (2004) used RNA interference (RNAi) to block codeinone reductase (COR) activity, which was the first example of metabolic engineering in opium poppy. After suppression of COR, the common intermediate alkaloid (S)-reticuline, a precursor of diverse BIAs, accumulated in transgenic plants, and the authors suggested that a feedback mechanism operates in BIA biosynthesis.

Besides gene suppression studies, overexpression assays have also been conducted for various genes in BIA biosynthesis. For example, overexpression of COR1 increased morphine and codeine accumulation by ∼22 and 58%, respectively, in opium poppy (Larkin et al., 2007). SAT gene overexpression increased the total alkaloid content by about 40% (Allen et al., 2008). Notably, overexpression of NMCH raised the total alkaloid level to 450% in opium poppy (Frick et al., 2007). The overexpression of the O-methyltransferases 6OMT and 4′OMT, upstream enzymes of the BIA pathway, in California poppy (Eschscholzia californica) cells suggested a rate-limiting role for 6OMT, since it caused an increase in total alkaloid content, whereas 4′OMT had little effect (Inui et al., 2007). These results indicate that unexpected outcomes can arise in the BIA pathway because of unidentified branches.

In the current study, to address the tissue-specific functional roles of the 4′OMT and 7OMT genes in BIA biosynthesis, TRV-based VIGS, a useful method for silencing gene expression in plants, and transient overexpression approaches were utilized. To observe changes in alkaloid accumulation upon genetic manipulation in the stem, capsule, and leaves of opium poppy, alkaloid levels were measured by HPLC-TOF/MS. Expression levels of genes involved in the BIA biosynthetic pathway (CNMT, TYDC, 6OMT, SAT, COR, 4′OMT, and 7OMT) were also quantified via qRT-PCR.

#### MATERIALS AND METHODS

#### Plant Material and Growth Conditions

Seeds of Papaver somniferum cv. Ofis 95, a morphine-rich variety, were provided by Toprak Mahsulleri Ofisi (TMO; Ankara, Turkey). They were potted in a mixture of 25% peat, 25% perlite, and 50% soil, and then transferred to a growth chamber. Plants were kept under a 16/8 h day/night photoperiod at 20/24°C, and 3-month-old leaves were selected for gene cloning experiments.

#### RNA Isolation and cDNA Synthesis

Collected tissues were ground to a fine powder in liquid nitrogen. RNA was isolated using TRIzol<sup>®</sup> Reagent (Invitrogen, Carlsbad, CA) following the manufacturer's instructions. The quantity of the isolated RNA was measured using a NanoDrop 2000c spectrophotometer (ThermoFisher Scientific, Lenexa, KS), and RNA integrity was checked on 2% agarose gels. RNA was reverse transcribed using the Superscript III first strand cDNA synthesis kit (Invitrogen, Life Technologies) to specifically amplify the 4′OMT2 (Isoform 2; GenBank: AY217334.1) and 7OMT (GenBank: FJ156103.1) complementary DNA sequences.

#### Construct Preparation and Transformation

Virus-induced gene silencing (VIGS) was carried out using a TRV-based vector system. Fragments of 4′OMT (GenBank: AY217334.1; 293 bp) and 7OMT (GenBank: FJ156103.1; 451 bp) were amplified with the primers listed in Supplementary Table 1. These primers include BamHI and SmaI restriction sites for use in the next step. The amplified 4′OMT and 7OMT fragments were then inserted into the pGEM-T vector (Promega, Madison, WI, USA). Both TRV and plasmid DNA were digested with BamHI and SmaI restriction endonucleases. The cleaved products were ligated into the TRV2 vector to give pTRV2-4′OMT and pTRV2-7OMT. After verification of cloning by sequencing, the constructs were separately transformed into Agrobacterium tumefaciens strain LBA4404 via electroporation. A. tumefaciens colonies separately transformed with TRV1 and with pTRV2 carrying the gene of interest were selected and grown in 10 mL of Luria-Bertani (LB) medium containing 50 mg/L kanamycin with overnight shaking at 250 rpm and 28°C until the OD<sub>600</sub> reached 2. The cell cultures were centrifuged at 2000 g for 15 min, and the pellets were resuspended in 5 mL of induction buffer (1 mM MES, 150 mM acetosyringone, and 10 mM MgCl<sub>2</sub>; pH 5.8) and adjusted to an OD<sub>600</sub> of 0.8 for TRV2 and 0.2 for TRV1. Agrobacterium LBA4404 strains carrying TRV1 and TRV2 were then mixed in a 1:1 ratio. The mixture was used for agro-infiltration of young opium poppy leaves with a needleless syringe. The tip of the syringe was pressed against the underside of a leaf while gentle counterpressure was simultaneously applied to the other side of the leaf, so that the Agrobacterium suspension was injected through the stomata into the air spaces inside the leaf. Control plants were agro-infiltrated with the empty TRV2 vector. The inoculation was performed twice with a 7-day interval. One week after the second inoculation, the samples were harvested and stored at −80°C until use.
Fifteen plants were infiltrated for each of the pTRV2-4′OMT and pTRV2-7OMT experiments; three of them that were almost equally silenced, together with the empty-vector control, were selected for further analyses.

For the over-expression assays, sequences containing the full-length 4′OMT and 7OMT genes were cloned into the pGEM-T vector before being cloned into the viral-based vector pGR106, which contains the 35S promoter. The 4′OMT (1077 bp) and 7OMT (1170 bp) genes were amplified by PCR and ligated into the pGEM-T vector. For this, forward primers containing a NotI restriction site and an added FLAG peptide sequence (4′OMT-NotI FLAG and 7OMT-NotI FLAG) and reverse primers (4-OMT-SalI R and 7-OMT-SalI R) were used for cloning (Supplementary Table 2). Following confirmation of cloning by sequencing, the inserts were transferred into the pGR106 viral-based vector. Subsequently, the constructs were transformed into A. tumefaciens strain LBA4404 by electroporation. A. tumefaciens colonies carrying the pGR106-4′OMT and pGR106-7OMT constructs were then grown in 10 mL of LB medium containing 50 mg/L kanamycin with overnight shaking at 250 rpm and 28°C until the OD<sub>600</sub> reached 2.

The pelleted cultures were resuspended in an induction buffer (pH 5.6) containing 1 mM MES, 150 mM acetosyringone, and 10 mM MgCl<sub>2</sub> to an OD<sub>600</sub> of 0.4 and were further incubated overnight at room temperature. Agro-infiltration was applied to capsule, leaf, and stem tissues. Mock control plants were agro-infiltrated with the empty pGR106 vector. Ten days after inoculation, samples were harvested and stored at −80°C until use.

#### Quantitative RT-PCR Analysis

To determine the expression levels of 4′OMT and 7OMT in overexpressed and silenced tissues, qRT-PCR assays were performed using a LightCycler 480 Real-Time PCR System (Roche, Mannheim, Germany). Additionally, the transcript levels of other BIA biosynthesis genes (COR, SAT, TYDC, CNMT, and 6OMT) were measured. The primers used are given in Supplementary Table 3. The qRT-PCR experiments were performed as follows: cDNA (2 µl) was amplified with 10 mM of each specific forward and reverse primer and 10 µl of SYBR Green I Master mix (Roche Applied Science, Penzberg, Germany) in a total volume of 20 µl. The PCR program was as follows: preheating at 95°C for 5 min, followed by 41 cycles of 95°C for 10 s, 55–60°C for 20 s, and extension at 72°C for 10 s. Three biological replicates were performed for each sample, and qRT-PCR experiments were performed in triplicate for each RNA sample/primer combination. Gene expression levels were calculated according to the 2<sup>−ΔΔCt</sup> method (Livak and Schmittgen, 2001). To normalize the results, the housekeeping 18S rRNA gene was used as the reference gene (Supplementary Table 3). Melting curves were recorded from 57 to 95°C, with the temperature increasing at 0.5°C per second. The melting curves and the fluorescence signal data were analyzed for each run to filter out false-positive peaks.
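The relative-expression calculation of Livak and Schmittgen (2001) cited above can be sketched as follows. This is a minimal illustration rather than the analysis script used in this study: the function names and all Ct values are hypothetical, and the sketch assumes amplification efficiencies close to 100% for both the target gene and the 18S rRNA reference.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Return the 2^-ddCt fold change of a target gene.

    Each argument is a list of technical-replicate Ct values.
    Assumes ~100% PCR efficiency for target and reference genes.
    """
    # dCt: target Ct normalized to the reference gene (e.g. 18S rRNA)
    d_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: manipulated sample relative to the empty-vector control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


def percent_silencing(fold_change):
    """Convert a fold change below 1 into a percent reduction."""
    return (1.0 - fold_change) * 100.0


# Hypothetical Ct values for illustration only (not data from this study)
fold = relative_expression(
    ct_target_treated=[27.0, 27.0, 27.0],
    ct_ref_treated=[12.0, 12.0, 12.0],
    ct_target_control=[25.0, 25.0, 25.0],
    ct_ref_control=[12.0, 12.0, 12.0],
)
print(fold, percent_silencing(fold))
```

With these made-up inputs the target Ct rises by two cycles relative to the control while the reference is unchanged, so the sketch reports a fold change of 0.25, which corresponds to 75% silencing.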

#### Metabolite Measurement

Plant materials were dried at 28°C for 2 days. Dried samples (0.1 g) were soaked in methanol (HPLC grade; Merck, Germany) at room temperature for 1 day with shaking, followed by filtration to separate the marc and evaporation of the solvent. The extracts were dissolved at 2000 ppm and diluted to 1/200 ppm from the stock. For HPLC-TOF/MS analysis, the samples were filtered through 0.45 µm membranes. Morphine, codeine, thebaine, papaverine, noscapine, and laudanosine were analyzed using specific standards. Alkaloids were quantified on an Agilent 1260 Infinity HPLC system (Agilent, Palo Alto, CA) coupled with an Agilent 6210 TOF-MS detector and an Agilent EC 250/4 Nucleosil 100–5 column (Nucleosil C18, 100 Å, 5 µm, 4 × 250 mm). The column temperature was set to 35°C and the injection volume to 10 µL. Mobile phases A and B were water/1 mL L<sup>−1</sup> acetonitrile (0.1% formic acid), respectively. The elution program was: 0–6 min, 40% B; 6–10 min, 50% B; 10–15 min, 90% B; 15–16 min, 90% B; 16–25 min, 40% B. All measurements were performed in triplicate for each sample.
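For readers who want to reuse the elution program, it can be written down as a small lookup table. The sketch below is only an illustration: the names `ELUTION_PROGRAM`, `percent_b`, and `check_contiguous` are hypothetical, and each segment is treated as a constant step at the listed %B, which is an assumption because the text does not say whether the instrument ramps linearly between the listed values.

```python
# (start_min, end_min, percent_B) segments as listed in the elution program
ELUTION_PROGRAM = [
    (0, 6, 40),
    (6, 10, 50),
    (10, 15, 90),
    (15, 16, 90),
    (16, 25, 40),
]


def percent_b(time_min):
    """Return the listed %B for the segment containing time_min.

    Each segment is treated as a constant step (an assumption);
    the real method may ramp linearly between the listed values.
    """
    for start, end, pct in ELUTION_PROGRAM:
        if start <= time_min < end:
            return pct
    # Include the final time point of the run in the last segment
    if time_min == ELUTION_PROGRAM[-1][1]:
        return ELUTION_PROGRAM[-1][2]
    raise ValueError("time outside the 0-25 min run")


def check_contiguous(program):
    """Confirm that the segments cover the run without gaps or overlaps."""
    return all(a[1] == b[0] for a, b in zip(program, program[1:]))
```

Encoding the program as data makes it easy to verify that the segments are contiguous over the 25 min run before transferring them to instrument software.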

#### RESULTS

In this study, we successfully manipulated the 4′OMT and 7OMT genes in capsule, stem, and leaf tissues of opium poppy. Expression profiles of mRNAs and alkaloid levels were analyzed accordingly. We observed correlated results between silenced and overexpressed tissues.

#### Silencing of 4′OMT

Effective silencing of the 4′OMT gene was obtained via agro-infiltration-based VIGS in opium poppy, with rates of 92, 71, and 46% in stem, capsule, and leaf tissue, respectively (**Figure 2A**). To measure the effect of 4′OMT suppression on the other selected transcripts involved in BIA biosynthesis, a qRT-PCR assay was carried out, and the results showed that these genes were differentially expressed in different tissues. With the exception of the induction of the 7OMT and 6OMT genes, all measured genes were down-regulated in the stem (**Figure 2B**). Upon 4′OMT silencing in the stem, the lowest expression was observed for COR, which acts downstream in the morphinan branch. On the other hand, SAT expression was strongly reduced in the capsule compared with the other transcripts upon 4′OMT silencing. Although BIA biosynthesis genes were generally suppressed in the capsule, 6OMT expression was up-regulated. In leaves, the silencing of 4′OMT suppressed the expression of BIA genes. Among them, CNMT and 6OMT were strongly down-regulated, by 72 and 70%, respectively (**Figure 2B**). Silencing of 4′OMT resulted in differential accumulation of BIAs in different tissues. Morphine, codeine, (S)-reticuline, papaverine, noscapine, thebaine, and laudanosine levels were measured via HPLC-TOF/MS upon 4′OMT VIGS. The relative alkaloid abundance in silenced stem tissue was 41% lower than in the control plants (**Figures 2C**, **6A**; Supplementary Figure 1).

A substantial reduction was detected in the accumulation of most of the measured alkaloids upon 4′OMT silencing in stem tissue. On the other hand, a considerable induction of total alkaloid content was detected in the capsule. Higher accumulation of morphine, codeine, thebaine, and noscapine, rather than papaverine and laudanosine, was observed in the silenced capsule. In the 4′OMT-silenced leaf, no significant change was observed in total alkaloid content. Although the total alkaloid amounts measured in silenced and control leaf samples were comparable, the levels of noscapine and thebaine changed in opposite directions (**Figures 2C**, **6A**).

#### 7OMT Gene Silencing

A VIGS assay was also conducted to investigate the functional role of the 7OMT gene in BIA biosynthesis. The 7OMT transcript was successfully suppressed in stem, capsule, and leaves, by 30, 28, and 32%, respectively (**Figure 3A**). Upon silencing, the expression levels of the selected BIA biosynthesis genes differed among opium poppy tissues (**Figure 3B**). Expression levels of the 6OMT, 4′OMT, SAT, and COR genes were strongly reduced upon 7OMT silencing in the stem, whereas expression of the TYDC gene was induced. In addition, transcript levels of the measured genes were reduced in capsule tissue; among them, COR, CNMT, and SAT showed strong reductions. Furthermore, 7OMT VIGS suppressed the expression of genes involved in BIA biosynthesis in leaf tissue. 7OMT gene silencing resulted in a 66% reduction of total alkaloid accumulation in the stem. The amount of each measured alkaloid was reduced upon 7OMT VIGS in the stem (**Figures 3C**, **6A**). Among them, morphine content was suppressed approximately six-fold, whereas the reduction of other alkaloids was about three-fold. In contrast, 7OMT VIGS in the capsule induced morphine levels, while no significant change was observed in total alkaloid accumulation. Compared with the morphine level, approximately 1.5-fold lower accumulation of thebaine and noscapine was measured in the capsule. Upon 7OMT VIGS, noscapine and codeine amounts were significantly reduced in leaf tissue (**Figures 3C**, **6A**).
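The results above report changes both as percentages and as fold values. A small conversion sketch, assuming that an "n-fold reduction" means the level falls to 1/n of the control (an interpretation, not something stated in the text), shows how the two scales relate. The function names are hypothetical.

```python
def fold_reduction_to_percent(fold):
    """Percent reduction for an n-fold decrease (level falls to 1/n)."""
    if fold < 1:
        raise ValueError("fold reduction must be >= 1")
    return (1.0 - 1.0 / fold) * 100.0


def percent_to_fold_reduction(percent):
    """Fold decrease corresponding to a percent reduction below 100%."""
    if not 0 <= percent < 100:
        raise ValueError("percent reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


# A six-fold and a three-fold reduction expressed as percentages
print(round(fold_reduction_to_percent(6), 1))
print(round(fold_reduction_to_percent(3), 1))
# A 66% reduction expressed as a fold decrease
print(round(percent_to_fold_reduction(66), 2))
```

Under this interpretation, a six-fold reduction corresponds to roughly 83%, a three-fold reduction to roughly 67%, and a 66% reduction to a little under three-fold, so the fold and percent figures for the stem can be compared directly on one scale.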

#### Transient Overexpression of 4′OMT Gene in Opium Poppy

The 4′OMT gene was successfully overexpressed under the constitutive viral CaMV 35S promoter. The highest overexpression level, 475%, was found in the capsule (**Figure 4A**). According to the expression measurements, the most affected transcript in the capsule was 6OMT. Moreover, the overexpression of 4′OMT caused a significant increase in CNMT, SAT, and 7OMT transcript levels in the stem, while the COR gene was negatively regulated. 4′OMT was strongly expressed in leaf tissue, at a rate of 210%, which caused down-regulation of CNMT, TYDC, and 6OMT gene expression (**Figure 4B**). In stem tissue, the induction rate was 109%, and this overexpression induced the SAT and CNMT transcripts. Alkaloid profiles were also affected by the overexpression of 4′OMT in opium poppy tissues (**Figures 4C**, **6A**). In the stem, an increased amount of (S)-reticuline was observed, which caused a 43% induction of total alkaloid content. Moreover, both morphine and noscapine levels were induced about two-fold. Total alkaloid measurement in the leaf showed no significant difference upon overexpression. There was a minor alteration in noscapine content in overexpressed opium poppy. On the other hand, a considerable suppression of total alkaloid content (75%) was measured in the capsule in response to 4′OMT overexpression. The greatest reductions were detected for morphine and noscapine content (**Figures 4C**, **6A**).

#### 7OMT Overexpression in Opium Poppy

Here, we overexpressed the 7OMT gene in all target tissues of opium poppy. The overexpression assay revealed induction of 7OMT in leaf, stem, and capsule by 497, 7614, and 471%, respectively (**Figure 5A**). The levels of COR, SAT, 6OMT, and 4′OMT transcripts increased considerably in the stem, whereas CNMT, 6OMT, and COR levels were significantly decreased in the capsule (**Figure 5B**). The results obtained from leaf tissue demonstrated an increase in 6OMT and a decrease in CNMT and TYDC expression. Overexpression of the 7OMT gene led to down-regulation of CNMT in all tissues (**Figure 5B**). An opposite expression pattern of 6OMT was observed between stem and capsule tissues: its expression was induced in the stem but down-regulated in the capsule. The overexpression of 7OMT resulted in a considerable increase in alkaloid accumulation in stem and leaf tissues. Furthermore, a higher morphine concentration was measured in the stem compared with the control plant. Additionally, overexpression of 7OMT markedly reduced the noscapine level in the capsule (**Figures 5C**, **6A**).

In addition to the major BIAs, we also measured other alkaloids such as oripavine, N,N-dimethylnarcotine, and dimethylpapaverine, as well as unidentified molecules, in capsule and stem tissues (**Figure 6B**). The N,N-dimethylnarcotine level was reduced while the dimethylpapaverine level was induced in the stem. However, no significant change was detected in the capsule upon 4′OMT silencing (**Figure 6B**). On the other hand, 7OMT overexpression increased N,N-dimethylnarcotine in the capsule.

#### DISCUSSION

Plants are used as a source of secondary metabolites for medicinal purposes. Opium poppy, an agronomically and medicinally important crop, produces opiates such as morphine and noscapine (Unver et al., 2010). To understand the biosynthesis mechanism of alkaloids, new approaches such as examination of EST databases, microarray screening, metabolic engineering, and proteomic tools have been applied (Takemura et al., 2013; Gurkok et al., 2015; Ramegowda et al., 2014; Schütz et al., 2014). Furthermore, viral-based gene silencing and over-expression studies have been used for functional analysis of genes involved in BIA biosynthesis in opium poppy (Allen et al., 2008; Wijekoon and Facchini, 2012; Dang and Facchini, 2014).

Here, to functionally analyze their regulatory roles in BIA biosynthesis in a tissue-specific manner, the expression levels of 4′OMT and 7OMT were successfully manipulated in opium poppy. Although VIGS of 4′OMT and 7OMT has been reported in opium poppy (Desgagné-Penix and Facchini, 2012), overexpression results for these genes have not yet been presented.

Silencing and overexpression of genes involved in BIA biosynthesis led to altered accumulation of alkaloids in opium poppy tissues (**Figures 2C**, **3C**, **4C**, **5C**). Moreover, BIA biosynthetic genes were differentially expressed within the targeted tissues (**Figures 2B**, **3B**, **4B**, **5B**). Similarly, Dang and Facchini (2014) showed that VIGS-mediated suppression of the CYP82Y1 gene in opium poppy resulted in diverse transcript levels in different opium poppy tissues. In another study, differential expression of BIA biosynthetic genes was also reported upon systematic silencing of morphinan pathway genes (Wijekoon and Facchini, 2012).

Silencing of the 4′OMT2 gene reduced all measured alkaloids in the stem, and total alkaloid content decreased by 41% (**Figure 2C**). Similarly, Desgagné-Penix and Facchini (2012) reported that silencing of 4′OMT2 in the stem resulted in a 43% reduction in total alkaloids. On the other hand, total alkaloid accumulation was induced by 43% via overexpression of 4′OMT in the stem. Inui et al. (2007) measured the change in total alkaloid level upon 4′OMT overexpression in Coptis japonica, where overexpression of this gene had a smaller effect on total alkaloid accumulation. The inconsistency between our results and the C. japonica overexpression outcomes might result from the different metabolic branches of the BIA pathway in the host plants.

We detected corresponding changes in morphine content upon gene silencing and overexpression. Silencing of 4′OMT caused an approximately two-fold reduction in morphine content in the stem, whereas overexpression resulted in a two-fold increase (**Figures 2C**, **4C**). Manipulation of 4′OMT therefore regulates morphine content consistently. In addition, the changes in expression level of CNMT and SAT were consistent with the manipulation of 4′OMT. 4′OMT might therefore positively regulate the expression of these genes by adjusting morphine accumulation (**Figures 2**, **4**). Likewise, it was recently reported that 4′OMT expression might be rate-limiting for BIA biosynthesis (Desgagné-Penix and Facchini, 2012). Frick et al. (2007) discussed the rate-limiting activity of the NMCH gene for BIA accumulation in opium poppy. It can be concluded that rate-limiting regulation of BIA biosynthesis does not depend on a single gene.

Manipulation of the 4′OMT gene produced alkaloid profiles in capsule tissue that differed from those of leaf and stem. The accumulation of morphine, thebaine, and noscapine increased in the capsule upon 4′OMT VIGS, whereas their production was suppressed by 4′OMT overexpression. The expression levels of the COR and SAT genes, however, were found to be down-regulated in capsules of 4′OMT-silenced plants. It was previously stated that overexpression of COR and SAT genes led to total alkaloid accumulation in opium poppy (Allen et al., 2004, 2008; Larkin et al., 2007). Therefore, morphine accumulation in the capsule might be explained by the induction of COR and SAT genes in the stem, followed by the transport of intermediate products from stem to capsule tissue. Furthermore, it can be speculated that an unidentified pathway contributes to the accumulation of more alkaloids in the capsule.

To date, silencing of 7OMT has been reported to regulate laudanosine biosynthesis (Desgagné-Penix and Facchini, 2012). Here, we show for the first time that 7OMT also has a considerable effect on the biosynthesis of morphine, thebaine, and noscapine. Given the different roles of stem and capsule tissues in BIA biosynthesis and accumulation, the 7OMT gene might be involved in morphine biosynthesis in a tissue-specific manner. Manipulation of 7OMT altered the amount of morphine in the stem, and a similar pattern was observed for thebaine accumulation (**Figures 3C**, **5C**). These results are consistent with the fact that morphine and thebaine lie on the same branch of the BIA pathway, so 7OMT has a similar impact on the biosynthesis of both alkaloids. Surprisingly, 7OMT also showed regulatory action on the biosynthesis of noscapine, which lies on a different branch than the morphinan alkaloids. Moreover, opposite changes in noscapine level were measured in stem and capsule upon manipulation of 7OMT, implying tissue-specific regulation by 7OMT. Manipulation of 7OMT affected the expression of 6OMT to different extents in different tissues (**Figures 3B**, **5B**). In the stem, 7OMT VIGS suppressed 6OMT expression and reduced the total alkaloid amount (**Figures 3B,C**). Similarly, Desgagné-Penix and Facchini (2012) found a decrease in total alkaloid content upon 6OMT VIGS. However, a different pattern was observed in 7OMT-silenced capsule tissue, supporting tissue-specific action of 7OMT. Since the whole BIA pathway is still unclear, these outcomes might arise from as yet unknown pathways. As discussed by Leonard et al. (2009), metabolic engineering approaches such as overexpression of a gene can lead to the accumulation of unexpected alkaloid products because of the complexity of the pathways.

An unexpected correlation between COR transcript level and thebaine accumulation in the stem in both the 4′OMT and 7OMT assays suggests that the COR-encoding gene may have an important effect on the biosynthesis of thebaine. The results showed that thebaine content decreased where COR was down-regulated and increased where COR was up-regulated, in both stem and capsule, with the exception of the 7OMT-induced capsule. Consistent with our results, the silencing of COR caused a decrease in salutaridine, thebaine, codeine, and morphine in latex (Wijekoon and Facchini, 2012). Additionally, RNAi of COR and SAT genes altered the levels of alkaloid intermediates (Allen et al., 2004; Kempe et al., 2009; Wijekoon and Facchini, 2012). Grothe et al. (2001) showed that the presence of SAT transcript leads to the accumulation of morphinan alkaloids such as morphine, thebaine, and codeine. SAT transcript levels in the stem correlated especially with morphine accumulation in both the suppression and overexpression assays, but no such correlation was found for the other alkaloids. Therefore, both 4′OMT and 7OMT might affect SAT transcript levels in the biosynthesis of morphine.

#### CONCLUSION

In this study, using metabolic engineering approaches, we present new results on the regulatory roles of 4′OMT and 7OMT in BIA biosynthesis. Silencing and overexpression revealed tissue-specific effects of these genes. Manipulation of the 4′OMT and 7OMT genes caused differential accumulation of BIAs, including morphine and noscapine, in capsule and stem tissues.

#### AUTHOR CONTRIBUTIONS

TG, MT, and TU drafted the paper; MT and TU analyzed the data; TU, IP, and SÖ organized the materials and planned the study; EO and TU performed the experiments; ID conducted the metabolite measurements; Aİ and SO helped with plant growth and inoculation assays.

#### ACKNOWLEDGMENTS

This work was kindly supported by TUBITAK with grant number 111O036. We thank Dr. Fatih Kocabas from the Department of Genetics and Bioengineering, Faculty of Engineering, Yeditepe University, Istanbul, Turkey for his critical reading of the manuscript. The manuscript was edited by Dr. Bianka Yvamarie Martinez, a language editor at The International Language Center of Cankiri Karatekin University.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00098

#### REFERENCES

Allen, R. S., Miller, J. A., Chitty, J. A., Fist, A. J., Gerlach, W. L., and Larkin, P. J. (2008). Metabolic engineering of morphinan alkaloids by over-expression and RNAi suppression of salutaridinol 7-O-acetyltransferase in opium poppy. Plant Biotechnol. J. 6, 22–30. doi: 10.1111/j.1467-7652.2007.00293.x

Allen, R. S., Millgate, A. G., Chitty, J. A., Thisleton, J., Miller, J. A., Fist, A. J., et al. (2004). RNAi-mediated replacement of morphine with the nonnarcotic alkaloid reticuline in opium poppy. Nat. Biotechnol. 22, 1559–1566. doi: 10.1038/nbt1033


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Gurkok, Ozhuner, Parmaksiz, Özcan, Turktas, İpek, Demirtas, Okay and Unver. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Chimeric Affinity Tag for Efficient Expression and Chromatographic Purification of Heterologous Proteins from Plants

*Frank Sainsbury<sup>1,2</sup>, Philippe V. Jutras<sup>1,2</sup>, Juan Vorster<sup>3</sup>, Marie-Claire Goulet<sup>1</sup> and Dominique Michaud<sup>1</sup>\**

*<sup>1</sup> Département de Phytologie–Centre de Recherche et d'Innovation sur les Végétaux, Université Laval, Québec, QC, Canada, <sup>2</sup> Centre for Biomolecular Engineering, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia, <sup>3</sup> Department of Plant Production and Soil Science, Forestry and Agricultural Biotechnology Institute, University of Pretoria, Pretoria, South Africa*

#### *Edited by:*

*Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Eva Stoger, University of Natural Resources and Life Sciences, Vienna, Austria; Audrey Yi-Hui Teh, St George's, University of London, UK*

*\*Correspondence: Dominique Michaud dominique.michaud@fsaa.ulaval.ca*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 23 November 2015 Accepted: 27 January 2016 Published: 15 February 2016*

#### *Citation:*

*Sainsbury F, Jutras PV, Vorster J, Goulet M-C and Michaud D (2016) A Chimeric Affinity Tag for Efficient Expression and Chromatographic Purification of Heterologous Proteins from Plants. Front. Plant Sci. 7:141. doi: 10.3389/fpls.2016.00141*

The use of plants as expression hosts for recombinant proteins is an increasingly attractive option for the production of complex and challenging biopharmaceuticals. Tools are needed at present to marry recent developments in high-yielding gene vectors for heterologous expression with routine protein purification techniques. In this study, we designed the Cysta-tag, a new purification tag for immobilized metal affinity chromatography (IMAC) of plant-made proteins based on the protein-stabilizing fusion partner SlCYS8. We show that the Cysta-tag may be used to readily purify proteins under native conditions, and then be removed enzymatically to isolate the protein of interest. We also show that commonly used protease recognition sites for linking purification tags are differentially stable in leaves of the commonly used expression host *Nicotiana benthamiana*, with linkers susceptible to cysteine proteases being less stable than serine protease-cleavable linkers. As an example, we describe a Cysta-tag experimental scheme for the one-step purification of a clinically useful protein, human α1-antitrypsin, transiently expressed in *N. benthamiana*. With potential applicability to the variety of chromatography formats commercially available for IMAC-based protein purification, the Cysta-tag provides a convenient means for the efficient and cost-effective purification of recombinant proteins from plant tissues.

Keywords: plant molecular farming, protein purification, immobilized metal affinity chromatography, tomato cystatin SlCYS8, human α1-antitrypsin

# INTRODUCTION

As the practice of using plants to produce recombinant proteins matures in both industrial and academic contexts (Davies, 2010; Paul and Ma, 2011; Xu et al., 2012), the development of bespoke tags for protein purification has now become of particular relevance (Buyel et al., 2015). Fusion protein tags can permit effective recovery of high purity recombinant proteins from both prokaryotic and eukaryotic expression hosts (Pina et al., 2014), and even improve the stability and solubility of some labile or 'difficult to express' proteins (Terpe, 2003). A current trend in plant-based protein expression is the use of polypeptides, such as hydrophobins (Joensuu et al., 2010), elastin-like polypeptides (Conley et al., 2009; Floss et al., 2010) and the γ-zein motif ZERA (Torrent et al., 2009), that allow for the stabilization and non-chromatographic purification of recombinant protein fusion partners by the induction of insoluble aggregates. Fusion tags for the stabilization and chromatographic purification of recombinant proteins have also been devised, generally involving biologically active antibody fragments as affinity ligands. An example was provided by Obregon et al. (2006), who showed the α2 and α3 constant regions of a human IgG α-chain fusion partner to increase the accumulation of human immunodeficiency virus-p24 antigen by 13-fold in transgenic tobacco leaves and to allow for its affinity purification with anti-human IgG antisera. More recently, human IgG Fc fragments were used to improve the yields of anthrax toxin receptor (Andrianov et al., 2010) and camelid Nanobodies® (De Buck et al., 2013) in transgenic plants, and to allow for the one-step purification of these proteins by protein A affinity chromatography.

Our general objective in this study was to engineer a widely applicable affinity chromatography tag for which non-biological, inexpensive affinity ligands are readily available in a variety of formats. Metal-chelating tags such as the popular poly-histidine (poly-His) tag have proved useful over the years to purify recombinant proteins from a variety of expression hosts by immobilized metal affinity chromatography (IMAC). Poly-His tags grafted at the C- or N-terminus of recombinant proteins have notably been used as purification ligands for protein recovery from different plant tissues (e.g., Leelavathi and Reddy, 2003; Valdez-Ortiz et al., 2005; Vardakou et al., 2012; de Souza et al., 2014). Protocols and reagents for IMAC are available from numerous commercial suppliers, and IMAC procedures represent a generally convenient and cost-effective approach for the affinity purification of recombinant proteins at lab scale (Lichty et al., 2005; Saraswat et al., 2013; Pina et al., 2014). Poly-His tags, however, may negatively affect the expression or activity of certain proteins (Woestenenk et al., 2004; Chant et al., 2005; Amor-Mahjoub et al., 2006; Renzi et al., 2006; Horchani et al., 2009; Sainsbury et al., 2009) and may sometimes be ineffective under native conditions due to their small size and variable accessibility at protein termini (Eschenfeldt et al., 2010). IMAC can be performed under denaturing conditions to make the poly-His motif more accessible (Terpe, 2003), but this is not applicable to the numerous proteins that cannot tolerate denaturation. Likewise, enzymatic procedures such as the TAGZyme™ system have been devised for the post-IMAC removal of His tag motifs (Pedersen et al., 1999; Schäfer et al., 2002), but such systems remain costly and hardly accessible to most laboratories (Arnau et al., 2006; Waugh, 2011).

With these limitations in mind, we developed a chimeric poly-His tag, the 'Cysta-tag,' based on our recent observation that translational fusion to tomato multicystatin domain SlCYS8 can sustain, and even enhance, recombinant protein accumulation in leaves of the widely used expression host *Nicotiana benthamiana* (Sainsbury et al., 2013; Robert et al., 2016). We show the insertion of a poly-His motif in a solvent-exposed loop of SlCYS8 to produce an effective tag for IMAC purification of human α1-antitrypsin (α1AT), an anti-inflammatory protein with potential for the augmentation therapy of emphysema and other chronic obstructive pulmonary diseases (Stockley, 2015). We also document the variable stability of common protease cleavage sites for His tag removal, in the specific context of fusion proteins transiently expressed in *N. benthamiana*.

# MATERIALS AND METHODS

# Structural Analyses

Structural models were generated *in silico* for SlCYS8 (GenBank accession no. AF198390) and tentative Cysta-tag hybrids to predict the impact of inserting a (His)6 (or 6×His) hexapeptide motif in the original cystatin structure. Twenty models were built for each variant using Modeller v. 9.7 (Eswar et al., 2006), with the NMR solution structure coordinates of oryzacystatin I (Nagata et al., 2000) as a template (Protein Data Bank accession no. 1EQK). The stereochemical quality of each model was assessed by comparison with the oryzacystatin structure using the PROCHECK program, v.3.5.4 (http://www.ebi.ac.uk/thornton-srv/software/PROCHECK/; Laskowski et al., 1993). The best model for SlCYS8 and the best model for a tentative Cysta-tag variant were selected for visualization purposes, Cysta-tag engineering and heterologous expression in *Escherichia coli* or *N. benthamiana*.

# Gene Constructs and Cloning

An α1AT (SERPINA1; Accession No. NM\_000295)-encoding DNA sequence was synthesized by GeneArt (Life Technologies) with an internal synonymous substitution in the original sequence to remove an undesired *Bsa*I restriction site. Sequences encoding SlCYS8, including a secreted version bearing the alfalfa protein disulphide isomerase (PDI) N-terminal signal peptide, were sourced from previously described constructs (Sainsbury et al., 2013). A (His)6 motif was introduced within SlCYS8, between residues alanine (Ala)-62 and glycine (Gly)-63, by overlap extension PCR. Constructs for protein expression were assembled using a modified version of GoldenGate cloning (Engler et al., 2008), where counter selection against the *ccd*B gene yields only recombined expression plasmids (Sainsbury et al., 2012, 2013). Protein-encoding PCR fragments with appropriate GoldenGate recombination sites were blunt-end ligated into *Sma*I-digested pUC18strep in the presence of *Sma*I to limit plasmid self-ligation. Complementary oligonucleotides encoding various linker sequences with terminal extensions for recombination were annealed and similarly ligated into pUC18strep. For assembly into expression vectors, recombination reactions between expression plasmids and donor clones were driven by the simultaneous use of *Bsa*I and T4 DNA ligase. Non-recombined pUC18strep and expression plasmid clones were eliminated by selection with the vector-specific antibiotic or by expression of the *ccd*B gene, respectively. For expression in *N. benthamiana*, a pEAQ plasmid (Sainsbury et al., 2009) modified to act as an acceptor plasmid for GoldenGate recombination was used (Sainsbury et al., 2013). For expression in *E. coli*, the expression vector pGEX-3X (GE Healthcare) was similarly modified to act as a GoldenGate acceptor (Sainsbury et al., 2012).

To construct the green fluorescent protein (GFP) fusion, PCR fragments encoding GFP and the Cysta-tag with complementary overlaps of 20 nucleotides were assembled into pEAQ-*HT* (Sainsbury et al., 2009) linearized with *Age*I and *Stu*I, using the Gibson Assembly Master Mix (New England Biolabs). All constructs were verified by Sanger sequencing before heterologous protein expression.
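Because GoldenGate assembly relies on *Bsa*I digestion, an insert must be free of internal *Bsa*I recognition sites, which is why one such site was removed from the α1AT sequence. The check can be sketched in a few lines of Python. This is only an illustration: the function name and the toy sequence are hypothetical and are not part of the original workflow.

```python
def find_bsai_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based position, strand) for each BsaI recognition site.

    BsaI recognizes GGTCTC. A site on the reverse strand appears as
    GAGACC when reading the forward strand.
    """
    seq = seq.upper()
    hits = []
    for motif, strand in (("GGTCTC", "+"), ("GAGACC", "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical toy sequence, NOT the real SERPINA1 coding sequence.
toy_sequence = "ATGGGTCTCAAAGAGACCTAA"
sites = find_bsai_sites(toy_sequence)
print(sites)
```

An empty list would indicate that the sequence can be used directly as a GoldenGate insert; any hit marks a position where a synonymous substitution would be needed.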

#### Cysteine Protease Inhibitory Activity

Bacterial expression and purification of recombinant cystatins were carried out as described previously (Sainsbury et al., 2013). Protein concentrations were determined by densitometry of Coomassie blue-stained gels after SDS-PAGE using the Phoretix 2-D Expression software, v. 2005 (Nonlinear Dynamics) and bovine serum albumin (Sigma–Aldrich) as a protein standard. Anti-papain activity of SlCYS8 and the Cysta-tag was determined by monitoring papain proteolysis progress curves with the fluorogenic synthetic peptide *Z*-Phe–Arg-methylcoumarin as a substrate, as described earlier (Goulet et al., 2008).

#### Plant-Based Expression

pEAQ vectors for *in planta* expression were maintained in *Agrobacterium tumefaciens* strain AGL1 following transformation by electroporation. Bacterial cultures were first grown in lysogeny broth (LB) medium supplemented with appropriate antibiotics, and the bacteria were then collected by gentle centrifugation. Bacterial pellets were resuspended in leaf infiltration medium (10 mM MES, pH 5.6, containing 10 mM MgCl2 and 100 μM acetosyringone) and incubated for 2–4 h at room temperature prior to transfection. Leaf infiltration was performed using a needle-less syringe as described earlier (D'Aoust et al., 2009), after mixing each protein-encoding (or empty vector) agrobacterial suspension with an equal volume of bacteria carrying the pEAQ express vector for transgene silencing suppression (Sainsbury et al., 2009). Infiltrated leaf tissue was collected 7 days post-infiltration for recombinant protein extraction and analysis, after incubating the plants at 23 °C under a 16 h:8 h day–night photoperiod.

## Protein Extraction and Gel Electrophoresis

Leaf disks representing 160 mg of control (empty vector) infiltrated tissue were harvested to determine protein expression rates following heterologous expression. The leaf disks were homogenized by disruption with a bead mill in three volumes of phosphate-buffered saline (PBS), pH 7.3, containing 5 mM EDTA, 0.05% (v/v) Triton X-100 (Sigma) and the cOMPLETE protease inhibitor cocktail for endogenous protease neutralization (Roche). Cell lysates were clarified by centrifugation at 20,000 × *g* for 5 min at 4 °C and protein concentrations determined using the Bradford assay reagent (Thermo Scientific) with bovine serum albumin as a protein standard. Protein extracts were resolved by SDS-PAGE prior to Coomassie blue staining or immunodetection.

#### Immunoblotting

Proteins for immunoblotting were resolved by 12% (w/v) SDS-PAGE and electrotransferred onto nitrocellulose sheets. Non-specific binding sites after electrotransfer were saturated with 5% (w/v) skim milk powder in PBS containing 0.025% (v/v) Tween-20, which also served as a dilution buffer for the antibodies. Human α1AT was detected with commercial polyclonal IgG raised in rabbit against this protein (US Biologicals) and alkaline phosphatase-conjugated goat anti-rabbit IgG secondary antibodies (Sigma–Aldrich). SlCYS8 and the Cysta-tag were detected with commissioned polyclonal IgG (Agrisera) raised in rabbit against a bacterially expressed SlCYS8 (Girard et al., 2007) and alkaline phosphatase-conjugated goat anti-rabbit IgG secondary antibodies (Sigma–Aldrich). The Cysta-tag was also detected with mouse anti-poly-His IgG (Cell Signaling Technologies) and horseradish peroxidase-conjugated IgG secondary antibodies (Sigma–Aldrich). GFP was detected with mouse anti-GFP antibodies (Cell Signaling Technologies) and horseradish peroxidase-conjugated secondary antibodies. Colorimetric signals for phosphatase-conjugated antibodies were developed with nitro blue tetrazolium chloride and 5-bromo-4-chloro-3-indolyl phosphate as a substrate (Sigma–Aldrich). Electrochemiluminescent signals for peroxidase-conjugated antibodies were generated with the Clarity Western ECL Substrate™ (Bio-Rad).

#### Quantitative ELISA

Enzyme-linked immunosorbent assays (ELISA) were performed to quantify α1AT based on a procedure described earlier for human α1-antichymotrypsin (α1ACT; Sainsbury et al., 2013). Immulon 2HB ELISA plates (Thermo Scientific) were coated with duplicate samples of soluble protein extract diluted 1:50 to 27–30 μg/mL in PBS, pH 7.3. Non-specific binding sites were blocked with 1% (w/v) casein in PBS before application of anti-human α1AT diluted in PBS with 0.25% (w/v) casein. Anti-rabbit IgG conjugated to horseradish peroxidase were used as secondary antibodies, followed by color signal development with the 3,3′,5,5′-tetramethylbenzidine SureBlue™ peroxidase substrate (KPL). The absorbance was read at 450 nm after adding 1 N HCl to stop color development. A standard curve was generated for each plate with human α1AT (EMD Chemicals) diluted in control extracts from tissue infiltrated with an empty vector, to account for possible matrix effects.
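The step from absorbance at 450 nm to an α1AT concentration goes through the per-plate standard curve. The Python sketch below illustrates that conversion under simplifying assumptions: it fits a straight line over an assumed linear range of the standards (ELISA curves are often fitted with a four-parameter logistic model instead), and all concentrations and absorbance values are invented for illustration. Only the 1:50 dilution factor comes from the procedure above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration(a450, slope, intercept, dilution=1.0):
    """Invert the standard curve and correct for the sample dilution."""
    return (a450 - intercept) / slope * dilution


# Hypothetical standards (ng/mL of α1AT) and their A450 readings.
standards_ng_per_ml = [0.0, 25.0, 50.0, 100.0]
readings_a450 = [0.05, 0.30, 0.55, 1.05]

slope, intercept = fit_line(standards_ng_per_ml, readings_a450)

# Hypothetical sample reading, measured on an extract diluted 1:50.
sample_ng_per_ml = concentration(0.80, slope, intercept, dilution=50)
print(round(sample_ng_per_ml, 1))
```

Preparing the standards in empty-vector control extract, as described above, means the fitted line already includes any matrix effect, so no separate correction term appears in the sketch.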

## Fusion Protein Purification

Cysta-tagged proteins were purified from crude protein extracts using the ÄKTA Prime Plus Liquid Chromatography System (GE Healthcare). Cysta-tagged α1AT was purified from 5 g of infiltrated tissue ground in liquid nitrogen and resuspended in three volumes of extraction buffer (20 mM sodium phosphate, pH 7.4, 0.5 M NaCl) containing EDTA-free cOMPLETE protease inhibitor cocktail (Roche). Leaf tissue expressing the Cysta-tag–GFP fusion was disrupted in three volumes of the same buffer using a PT1200 Polytron homogenizer (Kinematica). The leaf lysates were clarified by centrifugation at 20,000 × *g* for 10 min at 4 °C, the supernatants frozen overnight at –80 °C, and the mixtures centrifuged again to remove insoluble debris. Dithiothreitol (DTT; to 1 mM) and imidazole (to 20 mM) were added to the extracts, and the mixtures submitted to a final centrifugation round to remove precipitates. The resulting extracts were passed through a 0.45 μm syringe filter and approximately 12 ml was injected into a 5-ml sample loop in order to fill the loop completely. Samples were loaded onto 1-ml HisTrap columns for IMAC (GE Healthcare) and washed with extraction buffer containing 1 mM DTT and 20 mM imidazole. Immobilized proteins were eluted with 400 mM imidazole in extraction buffer containing 1 mM DTT, and the recovered fractions stored at –80 °C or immediately prepared for SDS-PAGE. Yield and purity of the α1AT eluates were calculated relative to starting amount of α1AT and total protein content, respectively.
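The yield and purity figures defined in the last sentence are simple ratios. A minimal Python sketch follows, assuming that purity is taken relative to the total protein of the same fraction. The function names and all masses are hypothetical and serve only to show the arithmetic.

```python
def yield_percent(target_in_fraction_mg, target_in_start_mg):
    """Share of the starting target protein recovered in a fraction."""
    return 100.0 * target_in_fraction_mg / target_in_start_mg


def purity_percent(target_in_fraction_mg, total_protein_in_fraction_mg):
    """Share of a fraction's total protein that is the target protein."""
    return 100.0 * target_in_fraction_mg / total_protein_in_fraction_mg


# Hypothetical masses (mg); in practice α1AT would come from ELISA and
# total protein from a Bradford assay.
a1at_in_crude_extract = 7.5
a1at_in_eluate = 3.0
total_protein_in_eluate = 3.75

eluate_yield = yield_percent(a1at_in_eluate, a1at_in_crude_extract)
eluate_purity = purity_percent(a1at_in_eluate, total_protein_in_eluate)
print(eluate_yield, eluate_purity)
```

Tracking both numbers at every step (clarified extract, freeze/thaw supernatant, flow-through, eluate) shows where the target is lost and where contaminants are removed.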

#### Cysta-Tag Proteolytic Removal

Cysta-tag removal was done by protease treatment with the common linker processing enzyme human factor Xa (New England Biolabs). Purified protein samples were dialyzed overnight in 20 mM Tris-HCl, pH 8.0, containing 100 mM NaCl and 2 mM CaCl2. The Cysta-tag–protein (GFP) fusion was adjusted to a working concentration of 500 ng/μl and digested with factor Xa at molar ratios of 1:20 or 1:50. Protease reactions were performed at 20 °C in total volumes of 100 μl. Samples were collected at different time points, and factor Xa activity stopped by the addition of SDS-PAGE sample loading buffer and heating for 5 min at 95 °C.
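Setting up a digestion at a given molar ratio requires converting between mass and moles for both proteins. The Python sketch below shows that conversion. Several inputs are assumptions rather than values from the protocol: the ratios are read as protease:substrate, the fusion is taken as roughly 38 kDa (an ~11 kDa tag plus ~27 kDa GFP), and factor Xa as roughly 43 kDa. It also treats the full 100 μl reaction as being at 500 ng/μl.

```python
def protease_mass_ug(substrate_ng_per_ul, volume_ul,
                     substrate_kda, protease_kda, substrate_per_protease):
    """Mass of protease (µg) needed for a 1:N protease:substrate molar ratio.

    Uses the identity 1 kDa = 1 ng/pmol, so ng / kDa gives pmol.
    """
    substrate_ng = substrate_ng_per_ul * volume_ul
    substrate_pmol = substrate_ng / substrate_kda
    protease_pmol = substrate_pmol / substrate_per_protease
    return protease_pmol * protease_kda / 1000.0  # ng -> µg


# Assumed molecular masses (kDa); not stated in the protocol above.
FUSION_KDA = 38.0
FACTOR_XA_KDA = 43.0

xa_at_1_to_20 = protease_mass_ug(500, 100, FUSION_KDA, FACTOR_XA_KDA, 20)
xa_at_1_to_50 = protease_mass_ug(500, 100, FUSION_KDA, FACTOR_XA_KDA, 50)
print(round(xa_at_1_to_20, 2), round(xa_at_1_to_50, 2))
```

Under these assumptions the 1:20 reaction uses 2.5 times more protease than the 1:50 reaction, which is the comparison the two ratios are meant to probe.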

#### RESULTS AND DISCUSSION

#### Cysta-Tag Design, Structure, and Expression

We took a rational approach to the design of the Cysta-tag, taking into account proximity of the poly-His tag insertion site to (1) the two inhibitory loops and N-terminal trunk of SlCYS8, which both contribute to the biological activity of the protein (Benchabane et al., 2010); and (2) the C-terminus of the cystatin, given the need to avoid steric interference from the fusion partner on IMAC substrate binding. Modeling attempts with these considerations in mind led us to select a structurally unconstrained site for the insertion of a (His)6 motif, between residues alanine (Ala)-62 and glycine (Gly)-63 (**Figure 1A**). *In silico* modeling of the chimeric protein resulted in a putative Cysta-tag variant with a predicted tertiary structure closely matching the tertiary structure of SlCYS8, aside from an extended surface loop with the poly-His motif away from the protease inhibitory loops (**Figure 1B**). Ramachandran plots were produced with the inferred structural coordinates of SlCYS8 (not shown) and the 'Ala-62–(His)6–Gly-63' Cysta-tag (**Figure 1C**) to confirm the stereochemical quality of our structural models (Laskowski et al., 1993). For both proteins, more than 90% of the amino acid residues (black dots on **Figure 1C**) fell within the 'most favored' (red) and 'additional allowed' (yellow) regions of the graph, indicating adequate stereochemical quality of the predicted structures (Morris et al., 1992) and eventual robustness of the chimeric protein tag.

We produced the 'Ala-62–(His)6–Gly-63' Cysta-tag variant in *E. coli* to assess its overall stability and protease inhibitory activity compared to SlCYS8, using papain as a model target protease. Confirming a negligible impact for the inserted (His)6 motif on the cystatin template, an apparent dissociation constant of 42 nM was calculated for the Cysta-tag toward papain, similar to a dissociation constant of 43 nM determined for the original cystatin. Gene constructs were assembled to express SlCYS8 and the Cysta-tag in *N. benthamiana* leaves, either retained in the cytosol or targeted to the cell secretory pathway, to assess the impact of (His)6 motif insertion on stability of the cystatin *in planta* (**Figure 2A**). Coomassie blue-stained polyacrylamide slab gels following SDS-PAGE (**Figure 2B**) and immunoblots to confirm their identity (not shown) showed SlCYS8 and the Cysta-tag to accumulate at similar levels in both the cytosol and the apoplast, again indicating little impact of the poly-His motif on SlCYS8 overall stability and suggesting eventual robustness of the Cysta-tag as a fusion partner *in planta*.

# Expression of Cysta-Tag Fusions in *N. benthamiana*

We recently reported a strong positive effect of SlCYS8 used as a fusion partner on stability of an α1AT-related protein, α1ACT (Baker et al., 2007), in *N. benthamiana* leaves (Sainsbury et al., 2013). To demonstrate effectiveness of the Cysta-tag protein as a fusion partner moiety in plants, we chose to tag the clinically relevant α1AT, which has been the target for recombinant protein expression in plants where it can be produced in an active form (Terashima et al., 1999; Nadai et al., 2009; Zhang et al., 2012; Castilho et al., 2014). Because glycosylation of α1AT imparts increased stability to the normally secreted protein (Kwon and Yu, 1997), fusions were generated with a secreted version of the Cysta-tag (**Figure 3A**). For the gene constructs we used a mature form of α1AT lacking 23 amino acids at the N-terminus, as no well-defined structure could be observed in this region by X-ray crystallography (Patschull et al., 2011) and because the corresponding N-terminal sequence in α1ACT, also presenting an undefined structure (Wei et al., 1994), was reported to undergo restricted proteolysis in a plant cell environment (Benchabane et al., 2009).

We fused α1AT to the Cysta-tag using the generic flexible peptide linker (Gly)4Ser, reported earlier to be proteolysis-resistant in the secretory pathway of *N. benthamiana* leaf cells (Sainsbury et al., 2013). Expressing untagged α1AT as well as fusions to either SlCYS8 or the Cysta-tag showed no positive effect of the plant cystatin on steady-state levels of secreted α1AT 7 days post-infiltration (**Figures 3B,C**), in sharp contrast with the positive effect reported earlier for SlCYS8 used as a fusion partner for α1ACT (Sainsbury et al., 2013). This, however, could be expected given the very high accumulation rate of more than 1.5 mg/g fresh weight tissue measured for α1AT (**Figure 3B**), much higher than the accumulation rate obtained for α1ACT expressed in the same expression system (Sainsbury et al., 2013). Most importantly, fusion to the Cysta-tag resulted in expression levels comparable to those observed with SlCYS8–α1AT (**Figures 3B,C**).

FIGURE 3 | … representation of the Cysta-tag (CT)–α1AT fusions, showing domain organization, location of the poly-His motif (black area) and identity of the inserted linker (L). F, flexible linker stable *in planta* (see Sainsbury et al., 2013); Xa, cleavage motif for human factor Xa; EK, cleavage motif for bovine enterokinase; TEV, cleavage motif for *Tobacco etch virus* protease; 3C, cleavage motif for *Rhinovirus* 3C protease. Arrowheads in upright position show protease cleavage sites. (B) ELISA for quantitation of α1AT in crude protein extracts of leaves expressing secreted α1AT or the different Cysta-tag–α1AT fusions. Each bar is the mean of three replicate values ±SD. (C) SDS-PAGE separation of α1AT and Cysta-tag hybrids and their immunodetection with anti-α1AT and anti-SlCYS8 polyclonal antibodies. Asterisk (∗) on the right highlights free α1AT in leaf extract, arrowhead Cysta-tag–α1AT fusion protein, and closed circle free Cysta-tag. Human α1AT on right lanes of the Coomassie blue-stained gel and anti-α1AT immunoblot corresponds to a commercially available, highly glycosylated form of the protein.

Since the Cysta-tag is biochemically active and because it could, with a molecular mass of ∼11 kDa, physically interfere with stability and activity of the fusion partner, we investigated the resistance of commonly used cleavable linkers to degradation by host plant endogenous proteases. To this end, we assembled Cysta-tag–α1AT fusions with cleavage motifs for two Ser proteases, bovine enterokinase and human factor Xa, and for two Cys proteases, *Tobacco etch virus* (TEV) protease and *Rhinovirus* 3C protease (3C; **Figure 3A**). The four linkers showed variable stability *in planta*, with those acted on by Ser proteases being substantially more stable than those recognized by Cys proteases, which both led to an accumulation of free Cysta-tag detectable on Coomassie blue-stained gels (**Figure 3C**). In addition to being stable in the plant cell secretory pathway, the enterokinase and factor Xa cleavage motifs do not leave residual amino acids at the N-terminus of downstream-located proteins upon cleavage, unlike cleavage motifs of the TEV and 3C proteases expected to leave two or three non-native residues (see **Figure 3A** for expected cleavage sites). Our observations suggest that Ser protease-cleavable motifs such as those of enterokinase and factor Xa may be most useful in *N. benthamiana* expression platforms, both as stable linkers *in planta* before leaf processing and as convenient cleavable linkers for tag removal following recombinant protein purification.

FIGURE 4 | IMAC purification of the Cysta-tag–α1AT fusion. Cysta-tag and α1AT separated by the factor Xa-cleavable sequence were transiently expressed in *N. benthamiana* leaves. (A) α1AT recovery (line) and purity (columns) rates during purification. Calculations were made based on ELISA assays for α1AT and leaf total soluble protein determinations. Data are mean values of three independent purification rounds ±SD. (B) Coomassie blue-stained polyacrylamide gels following SDS-PAGE showing key protein fractions during the purification process. (C) Identity of the Cysta-tag–α1AT fusion was confirmed by immunodetections with anti-α1AT and anti-SlCYS8 primary antibodies. Arrowheads point to the Cysta-tag–α1AT fusion, closed circles to free Cysta-tag and asterisk (∗) to free α1AT.

#### Purification of Plant-Made **α**1AT via the Cysta-Tag

We submitted leaf tissue expressing Cysta-tag–α1AT to a routine IMAC procedure to confirm the usefulness of the Cysta-tag as an affinity ligand for recovery of heterologous proteins in native conditions (**Figure 4**). A preliminary freeze/thaw treatment of clarified extracts resulted in the precipitation of a significant fraction of ribulose-1,5-*bis*phosphate carboxylase/oxygenase (Rubisco), a major protein contaminant in crude leaf extracts (Robert et al., 2015), with no notable loss of the Cysta-tagged fusion (**Figure 4A**). Approximately 22% of total soluble proteins were lost during the process, compared to only a 1% average decrease in α1AT as measured by quantitative ELISA. Addition of imidazole to the protein extracts after freezing resulted in substantial protein precipitation, leading to a further 18% loss of total protein and a concomitant 30% loss of α1AT relative to the initial level in untreated extracts. These numbers represent an average across three independent purifications of the factor Xa-linked fusion protein. Similar rates were obtained with the enterokinase-linked Cysta-tag–α1AT fusion (Supplementary Figure S1), which suggests consistency of protein precipitation rates across leaf pre-purification steps and no significant impact of the cleavage linker on protein loss during early downstream processing.

Coomassie blue-stained gels and immunoblots confirmed that most of the Cysta-tag–α1AT fusion in clarified extracts was retained in HisTrap columns, while un-tagged α1AT did not bind in the presence of 20 mM imidazole (**Figure 4B**). During optimization of the elution conditions, we found that lower concentrations of imidazole resulted in non-specific binding of Rubisco to the column, and higher concentrations in reduced retention of the Cysta-tag fusion. The 55-kDa protein fusion expected for Cysta-tagged α1AT was effectively eluted by the addition of 400 mM imidazole, giving a final recovery yield and purity of about 25 and 90%, respectively, for both factor Xa- and enterokinase-cleavable fusions (**Figure 4** and Supplementary Figure S1). Since starting amounts of α1AT in leaves were around 1.6 mg per g fresh weight (*see* **Figure 3B**), this represents a recovery rate for the expressed protein of ∼0.4 mg per g fresh weight, equivalent to about 5% of total extracted leaf protein. A protein contaminant likely corresponding to free Cysta-tag was sometimes visible in Coomassie blue-stained gels after purification, which could in theory be removed along with released Cysta-tag subsequent to proteolytic tag removal by further IMAC or other techniques such as ion exchange or size exclusion chromatography.
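The yield quoted above follows from simple arithmetic on the reported starting amount and final recovery. The short sketch below reproduces that calculation; the function name `recovered_yield` and the variable names are illustrative assumptions and are not part of the original analysis.

```python
def recovered_yield(start_mg_per_g: float, recovery_fraction: float) -> float:
    """Return the purified protein yield in mg per g fresh weight."""
    if not 0.0 <= recovery_fraction <= 1.0:
        raise ValueError("recovery_fraction must lie between 0 and 1")
    return start_mg_per_g * recovery_fraction


# Values taken from the text: about 1.6 mg alpha1AT per g fresh weight in leaves
# and a final IMAC recovery of about 25%.
START_MG_PER_G = 1.6
RECOVERY_FRACTION = 0.25

yield_mg_per_g = recovered_yield(START_MG_PER_G, RECOVERY_FRACTION)
print(f"Recovered alpha1AT: {yield_mg_per_g:.1f} mg per g fresh weight")
```

With the rounded inputs given in the text, the result agrees with the reported figure of about 0.4 mg per g fresh weight.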

#### Enzymatic Removal of the Cysta-Tag

Initial attempts to digest factor Xa- and enterokinase-cleavable Cysta-tag–α1AT fusions were not successful, likely due to either steric hindrance at the protease cleavage site or to the well-documented inhibitory effect of α1AT against several Ser proteases including factor Xa (Morrissey, 1998). To confirm utility of the Cysta-tag approach to produce proteins free of their affinity partner, we designed a factor Xa-cleavable Cysta-tag fusion with GFP, also taking this opportunity to use a non-secreted version of the Cysta-tag to direct fusion protein accumulation in the cytosol. Cysta-tag–GFP was expressed in *N. benthamiana* leaves and purified as described above for the Cysta-tag–α1AT fusions (**Figure 5**). As expected, Coomassie blue-stained gels following SDS-PAGE (**Figure 5A**) and immunoblotting of both GFP and the Cysta-tag poly-His motif (**Figure 5B**) confirmed purification to high purity of a 38-kDa Cysta-tag–GFP fusion product, along with a product of higher molecular weight likely corresponding to Cysta-tag–GFP::GFP–Cysta-tag dimers as a result of GFP dimerization (Tsien, 1998). These observations demonstrate the potential of Cysta-tag-based expression for the affinity purification of recombinant proteins from leaf tissues regardless of their subcellular localization in transfected cells. They also show that the Cysta-tag could be detected with anti-poly-His antibodies (**Figure 5B**) and thus be used to monitor recombinant protein expression and purification processes.

Enzymatic removal of the Cysta-tag was performed with molar ratios of factor Xa fixed at 1:20 (as suggested by the manufacturer) and 1:50 relative to the recombinant protein (**Figure 5**). Both protease concentrations allowed for an effective cleavage of the fusion, readily reducing the amount of intact 38-kDa fusion protein (and dismantling the GFP fusion dimers) to generate a 27-kDa protein corresponding to free GFP. Cleavage at the 1:20 ratio was nearly complete after 4 h, confirming efficient removal of the affinity tag under standard processing conditions. The 1:50 ratio required an overnight incubation, but cleavage was also complete, pointing to the possibility of minimizing protein sample contamination with residual factor Xa after enzymatic cleavage for those proteins that support long incubation periods.
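To make the two molar ratios concrete, the sketch below converts a protease:substrate molar ratio into a protease mass for a given mass of fusion protein. The 38-kDa size of the Cysta-tag–GFP fusion comes from the text; the factor Xa molecular mass of 43 kDa and the 100 µg batch size are assumptions used only for illustration and should be checked against the supplier's datasheet and the actual sample.

```python
def protease_mass_ug(substrate_ug: float, substrate_kda: float,
                     protease_kda: float, ratio: float) -> float:
    """Mass of protease (µg) for a protease:substrate molar ratio of 1:ratio."""
    if min(substrate_kda, protease_kda, ratio) <= 0:
        raise ValueError("molecular masses and ratio must be positive")
    # µg divided by kDa (kg/mol) gives nmol
    substrate_nmol = substrate_ug / substrate_kda
    protease_nmol = substrate_nmol / ratio
    return protease_nmol * protease_kda


FUSION_KDA = 38.0      # Cysta-tag-GFP fusion, as reported in the text
FACTOR_XA_KDA = 43.0   # assumption: approximate mass of commercial factor Xa
BATCH_UG = 100.0       # hypothetical amount of purified fusion protein

for ratio in (20, 50):
    mass = protease_mass_ug(BATCH_UG, FUSION_KDA, FACTOR_XA_KDA, ratio)
    print(f"1:{ratio} molar ratio -> {mass:.2f} µg factor Xa per {BATCH_UG:.0f} µg fusion")
```

Under these assumptions, the 1:50 condition uses 2.5 times less protease than the 1:20 condition, which is the basis for the reduced carry-over of factor Xa noted above.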

#### CONCLUSION

Our goal in this study was to devise a protease-removable fusion tag for the IMAC purification of plant-made proteins in native conditions. Building upon our finding that tomato cystatin SlCYS8 can act as a stabilizing fusion partner for secreted proteins in plant leaf biofactories (Sainsbury et al., 2013), we engineered a chimeric tag for IMAC that is also detectable using readily available anti-poly-His antibodies and thus useful to monitor the expression and purification of recombinant proteins. Through molecular modeling we identified a physically unconstrained site for poly-His motif insertion, located in an exposed surface loop of SlCYS8 distal to both the inhibitory loops and N-terminus involved in protease inhibitory activity. The expression rate, overall stability and anti-papain potency of the resulting chimeric protein were unaltered compared to the parent protein, SlCYS8. This novel tag can be linked to a protein of interest using peptide linkers encoding different endoprotease cleavage sites. Protein purification using the Cysta-tag results in efficient and reproducible recovery of high-quality protein products, regardless of their subcellular localization *in planta*. Our results demonstrate the general usefulness of Cysta-tag fusions for recombinant protein purification from plant sources under mild, non-denaturing conditions. Indeed, a recent study reported the use of the Cysta-tag for the heterologous expression and purification of a difficult-to-express plant β-glucosidase in native conditions, permitting a first functional characterization of this previously intractable enzyme (Mageroy et al., 2015). We here developed the Cysta-tag for the purification of plant-made proteins, but the diversity of tools and protocols already available for poly-His-based IMAC in various expression systems makes our new approach potentially applicable to any prokaryotic or eukaryotic host.

# AUTHOR CONTRIBUTIONS

FS conceived the study with DM, took charge of the experimental design, performed lab experiments with PJ, and wrote a first draft of the manuscript. PJ contributed to the experimental design, performed lab experiments with FS, and contributed to the first draft of the manuscript. JV conceived the Cysta-tag with FS, performed the *in silico* (modeling) analyses, and contributed to the writing of the manuscript. M-CG contributed to the experimental design, coordinated the lab experiments, and contributed to the writing of the manuscript. DM conceived the study, contributed to the experimental design, coordinated the study, and prepared the last version of the manuscript.

#### FUNDING

This work was supported by a Discovery grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada to DM, and by an Australian Research Council (ARC) Discovery Early Career Research Award to FS (DE140101553).

#### ACKNOWLEDGMENTS

PJ was the recipient of an AgroPhytoSciences NSERC–FONCER scholarship and of a BMP graduate scholarship funded by NSERC, the Fonds de Recherche Québec Nature et Technologies and our private research partner Medicago Inc.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00141


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Sainsbury, Jutras, Vorster, Goulet and Michaud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Antigen Production in Plant to Tackle Infectious Diseases Flare Up: The Case of SARS

*Olivia C. Demurtas<sup>1</sup>, Silvia Massa<sup>1</sup>, Elena Illiano<sup>2,3</sup>, Domenico De Martinis<sup>4</sup>, Paul K. S. Chan<sup>5,6</sup>, Paola Di Bonito<sup>7</sup>\* and Rosella Franconi<sup>2</sup>\**

*<sup>1</sup> Department of Sustainability, Biotechnology Laboratory, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Rome, Italy, <sup>2</sup> Department of Sustainability, Biomedical Technology Laboratory, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Rome, Italy, <sup>3</sup> Department of Pharmacological and Biomolecular Sciences, Università degli Studi di Milano, Milan, Italy, <sup>4</sup> International Relations Office, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Rome, Italy, <sup>5</sup> Department of Microbiology, Faculty of Medicine, The Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong, China, <sup>6</sup> Centre for Emerging Infectious Diseases, Faculty of Medicine, The Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong, China, <sup>7</sup> Istituto Superiore di Sanità, Department of Infectious, Parasitic and Immune-Mediated Diseases, Rome, Italy*

#### *Edited by:*

*Agnieszka Ludwików, Adam Mickiewicz University in Poznań, Poland*

#### *Reviewed by:*

*Manoj K. Sharma, Jawaharlal Nehru University, India; Taras P. Pasternak, Albert-Ludwigs-Universität Freiburg, Germany*

#### *\*Correspondence:*

*Rosella Franconi rosella.franconi@enea.it; Paola Di Bonito paola.dibonito@iss.it*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 04 August 2015 Accepted: 13 January 2016 Published: 05 February 2016*

#### *Citation:*

*Demurtas OC, Massa S, Illiano E, De Martinis D, Chan PKS, Di Bonito P and Franconi R (2016) Antigen Production in Plant to Tackle Infectious Diseases Flare Up: The Case of SARS. Front. Plant Sci. 7:54. doi: 10.3389/fpls.2016.00054*

Severe acute respiratory syndrome (SARS) is a dangerous infection with pandemic potential. It emerged in 2002 and its aetiological agent, the SARS *Coronavirus* (SARS-CoV), crossed the species barrier to infect humans, showing high morbidity and mortality rates. No vaccines are currently licensed for SARS-CoV, and considerable efforts were made during the first outbreak to develop diagnostic tools. Here we demonstrate the transient expression in *Nicotiana benthamiana* of two important antigenic determinants of SARS-CoV, the nucleocapsid protein (N) and the membrane protein (M), using a virus-derived vector or agro-infiltration, respectively. For the M protein, this is the first description of production in plants, while for the plant-derived N protein we demonstrate that it is recognized by sera of patients from the SARS outbreak in Hong Kong in 2003. The availability of recombinant N and M proteins from plants opens the way to further evaluation of their potential utility for the development of diagnostic and protection/therapy tools to be quickly manufactured, at low cost and with minimal risk, to face potential new highly infectious SARS-CoV outbreaks.

Keywords: SARS-CoV, severe acute respiratory syndrome (SARS), N protein, M protein, plant expression, disease outbreaks, emerging infectious disease

# INTRODUCTION

Recombinant proteins expressed in plants have emerged as a novel branch of the biopharmaceutical industry and hold great potential to produce different types of therapeutic proteins at low cost and with reduced risks of contamination with human and animal pathogens (Moustafa et al., 2015; Paul et al., 2015; Sack et al., 2015). Transient expression of target proteins can be easily achieved by plant viruses or by agroinfiltration (Gleba et al., 2007), saving the time spent in the generation of transgenic plants, often allowing higher protein yield due to the absence of chromosomal integration and consequently of position effects (Komarova et al., 2010). Transient expression can also be used as a means for preliminary evaluation of correct expression before starting the generation of transgenic plants, or related platforms, such as plant cell cultures or microalgae (Franconi et al., 2010).

Antigen preparation plays a crucial role in the development of a diagnostic test, and plants represent an ideal biofactory system. The approach could be extended to other cases in which a pathogen cannot be grown in the lab or is highly virulent and requires a methodology for fast and affordable production. Indeed, virus-specific and 'orphan' vaccine candidates and therapeutics represent one of the most interesting applications of plant-based technology, especially when it is necessary to produce 'rapid response' vaccines such as those directed against bioterrorism agents and diseases with pandemic potential, like influenza. The capacity for such a response has already been demonstrated by the (four) companies involved in the US in the production of 100 million doses of influenza vaccine a month (Rybicki, 2014). This opens the way for the use of this technology for other diseases such as, to name a few, Ebola (Henao-Restrepo et al., 2015; Heymann et al., 2015), avian flu (Su et al., 2015), MERS (Al-Tawfiq and Memish, 2015; Keener, 2015) and other viral diseases for which the principles of the so-called "One Health Initiative" (strategies to control diseases across species) are important (http://www.onehealthinitiative.com/). Because the epidemiology of these diseases is characterized by sudden and sometimes unforeseen outbreaks, the attention they receive in terms of research, prevention and pharmaceutical industry efforts tends to switch on and off (Barber, 2014; LaMattina, 2014). For this reason, tools for a quick and relatively easy scale-up may provide a solution to such emergencies (Streatfield et al., 2015).

Among emerging and re-emerging diseases, severe acute respiratory syndrome (SARS) appeared in China in November 2002. The epidemic spread to 29 countries over 5 continents, leading to more than 8000 infected patients globally (World Health Organization [WHO], 2006) with a fatality rate of 9.6% (Suresh et al., 2008). The aetiological agent of the syndrome, the coronavirus SARS-CoV, was rapidly identified (Drosten et al., 2003; Marra et al., 2003; Rota et al., 2003). The end of the SARS outbreak was declared in July 2003 (World Health Organization [WHO], 2003), thanks to strong containment measures. However, several local outbreaks were subsequently reported in China, as a consequence of accidental laboratory contaminations or infections after contact with animals infected with SARS-CoV strains significantly different from those predominating in the 2002–2003 outbreak (Peiris et al., 2004). While no effective therapy is currently available, considerable efforts have been made to develop vaccines and drugs to prevent SARS-CoV infection, since a SARS epidemic may recur at any time in the future (Gimenez et al., 2009; Chan and Chan, 2013). Moreover, because of the highly contagious nature of the disease, and since SARS-CoV has been classified as a potential biological weapon (Centers for Disease Control and Prevention [CDC], 2012), it is still important to develop effective and sensitive SARS-CoV diagnostic tools (Suresh et al., 2008).

The large SARS-CoV genome, a polyadenylated RNA of 29,727 nucleotides, encodes four major viral structural components, the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins, and 16 non-structural proteins (Bartlam et al., 2007).

The structural N protein is the most abundant protein in the SARS-CoV virion. It is a highly basic protein of 422 amino acids (46 kDa) of the helical nucleocapsid, playing an important role in viral pathogenesis, replication, RNA binding, cell cytokinesis and proliferation (Surjit and Lal, 2008). N protein has been recognized as the preferred target for detection of SARS-CoV infection by reverse transcription-polymerase chain reaction (RT-PCR; Suresh et al., 2008). In addition, the WHO guidelines for SARS diagnosis, developed during the outbreak in 2003, suggested the use of N-based ELISA for specific IgG detection as confirmatory test of SARS-CoV infection (World Health Organization [WHO], 2003 SARS: Laboratory diagnostic tests) due to the ability of the host to mount an early antibody response against the N protein (Che et al., 2004).

Furthermore, since the N protein is able to induce a long-term cell-mediated immune response in animal models, it represents a potential vaccine candidate as well (Roper and Rehm, 2009). To date, the production of recombinant N protein has been achieved in a variety of heterologous expression systems, including plants (Zheng et al., 2009), providing proofs of concept for its use in vaccine formulations (Roper and Rehm, 2009). However, the immune response in animal models (both natural and non-natural SARS-CoV hosts) might not be useful to predict the human immune response.

The M protein is the most abundant protein in the SARS-CoV viral envelope. It is functionally involved in the assembly and budding of virions from the cell. M protein forms homo-oligomers and interacts with S, E, and N proteins (Hogue and Machamer, 2008). It consists of 221 amino acids (25 kDa), with a short glycosylated N-terminal domain, three membrane-spanning domains and a long immunogenic C-terminal cytoplasmic domain. It has been reported that rabbit antiserum raised against recombinant M protein produced in yeast has a potent neutralizing activity *in vitro* (Pang et al., 2004). Antibodies to the M protein were detectable in convalescent SARS patients and B-cell epitopes of the M protein have been identified (He et al., 2005). It has also been shown that M acts as a dominant immunogen for CTL response in humans (Liu et al., 2010). Moreover, it has been demonstrated that SARS-CoV M-specific memory CD4+ and CD8+ T cells persisted in the peripheral blood of recovered SARS patients more than 1 year after infection (Yang et al., 2007). In a study in which different DNA vaccines were used, M generated the strongest T-cell response in an animal model, and recovered SARS patients had a long-lasting CD4+ and CD8+ memory for the M antigen (Roper and Rehm, 2009). These data suggest that further research should be directed toward evaluating the potential efficacy of the M antigen for the development of vaccines and diagnostic tools.

In this study, we demonstrate the feasibility of using plant transient expression systems (Potato Virus X [PVX]-mediated infection and agroinfiltration) to produce two SARS-CoV antigens, the N and M proteins, as useful tools to face SARS-CoV infection.

In particular, we demonstrated that the SARS-CoV N protein produced in *Nicotiana benthamiana* is recognized by the specific antibodies of convalescent SARS patients. Moreover, the expression of the SARS-CoV M protein was achieved for the first time in plants.

The approach of rapidly obtaining crucial antigens by transient expression in plants is potentially applicable to other emerging or re-emerging infectious diseases (e.g., MERS, avian flu, Ebola) that share with SARS the features of sudden outbreaks and the need to rapidly produce diagnostic or therapeutic tools.

## MATERIALS AND METHODS

#### Cells

*Escherichia coli* XL1 blue strain was used as a host for cloning and protein expression. Cells were grown in Luria-Bertani medium at 37°C with shaking at 250 rpm. *Agrobacterium tumefaciens* GV3101 and C58C1 strains were used to transiently express the M protein in *N. benthamiana* and were grown in YEB medium (5 g/l beef extract, 1 g/l yeast extract, 5 g/l peptone, 5 g/l sucrose, 2 mM MgSO4) at 28°C with shaking at 250 rpm. When necessary, ampicillin (100 µg/ml) or kanamycin (25 µg/ml) was added to the culture medium. HEK-293 cells were cultivated as a monolayer in DMEM medium with 10% fetal bovine serum (FBS) and 50 µg/ml gentamicin at 37°C with 5% CO2 and a relative humidity of 94%.

#### DNA Manipulation for Bacterial and Plant Expression of the SARS-CoV N and M Proteins

The nucleocapsid (N, GenBank protein id. AAP33707.1) and membrane (M, GenBank protein id. AAP33701.1) full-length genes of the human SARS-CoV Frankfurt I isolate, Acc. No. AY291315 were cloned into pCR2.1-TOPO TA Cloning vector (Invitrogen; Carattoli et al., 2005). The inserted fragments were cut out by digestion with BamHI-NotI and sub-cloned into the pQE-30 (Qiagen) prokaryotic expression vector (pQE-30-N and pQE-30-M).

The full-length *N* gene (1269-bp long) was amplified by PCR on the template plasmid pQE30-N with *Pfu* polymerase with the forward primer 5′-GGCC<u>ATCGAT</u>*GAATTC*GGATCCATC**ATG**AGAGGATCGCATCACATCC-3′ (*ClaI* restriction site: underlined, *EcoRI* site: italic, initiation translation codon: bold) and the reverse primer 5′-GACTT<u>GTCGAC</u>*GCGGCCGC***TTA**TGCCTGAGTTGAATCAGCAG-3′ (*SalI* site: underlined, *NotI* site: italic, stop codon: bold). For mammalian cells expression, the PCR product was cut out by digestion with *EcoRI/NotI* and inserted into the pVAX1 vector (Invitrogen). For plant expression, the PCR product was cut out with *ClaI/SalI* and cloned into the pPVX201 plant vector (Baulcombe et al., 1995). In this vector, the full-length viral cDNA of the Potato virus X (PVX) is inserted between the constitutive 35S promoter of the cauliflower mosaic virus (CaMV 35S) and the transcription terminator (Nos-term) of the nopaline synthase gene of *A. tumefaciens*, necessary for the regulation of the viral genome expression upon infection with plasmid DNA. Characteristic components of the viral expression vector are the following: viral replicase gene (RdRp); triple gene block encoding protein for cell-to-cell movement (M1-3); viral coat protein gene necessary for encapsidation of viral RNA (CP); coat protein sub-genomic promoter (SgP; **Figure 1A**).
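The cloning sites carried by the *N* gene primers can be located programmatically. The sketch below is a minimal illustration that searches the two primer sequences given above (written without spaces) for the standard recognition sequences of the enzymes named in this section; the helper names `find_sites` and `reverse_complement` are assumptions introduced for this example, and no cloning software is implied.

```python
# Standard recognition sequences of the restriction enzymes named in the text.
SITES = {
    "ClaI": "ATCGAT",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "NotI": "GCGGCCGC",
}

# N gene primers as given in the text, 5' to 3'.
N_FORWARD = "GGCCATCGATGAATTCGGATCCATCATGAGAGGATCGCATCACATCC"
N_REVERSE = "GACTTGTCGACGCGGCCGCTTATGCCTGAGTTGAATCAGCAG"


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to its 0-based start index."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


print("forward:", find_sites(N_FORWARD))
print("reverse:", find_sites(N_REVERSE))

# The three bases following the NotI site in the reverse primer, read on the
# coding strand, give the stop codon of the construct.
stop_start = N_REVERSE.find(SITES["NotI"]) + len(SITES["NotI"])
print("stop codon:", reverse_complement(N_REVERSE[stop_start:stop_start + 3]))
```

The forward primer also carries a GGATCC (*BamHI*) sequence between the *EcoRI* site and the start codon, which is not annotated in the primer legend.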

The full-length *M* gene (666-bp long) from the pCR2.1-TOPO TA vector was amplified in two subsequent steps. First, the forward primer 5′-GGCC<u>ATCGAT</u>*GAATTC*GGATCCATC**ATG**GCAGACAACGGTACTATTAC-3′ (*ClaI* restriction site: underlined, *EcoRI* site: italic, initiation translation codon: bold) and the reverse primer 5′-*GTGATGGTGATGATG*CTCGAGTGCCTG**TACTAGCAAAGCAATATT**-3′ (end of the His6 tag: italic, end of the *M* gene: bold) were used to add the His6 tag at the C-terminus. Subsequently, the same forward primer and the reverse primer 5′-GACTT<u>GTCGAC</u>*GCGGCCGC***TCA**ATGGTGATGGTGATGATGCTCG-3′ (*SalI* site: underlined, *NotI* site: italic, stop translation codon: bold) were used in order to add restriction sites useful for cloning.
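The gene lengths quoted in this section can be cross-checked against the protein lengths given in the Introduction (422 amino acids for N, 221 for M), counting three nucleotides per codon plus one stop codon. The sketch below performs that check; the function name `orf_length_bp` is an illustrative assumption.

```python
def orf_length_bp(n_amino_acids: int, include_stop: bool = True) -> int:
    """Length in bp of an open reading frame encoding n_amino_acids residues."""
    if n_amino_acids < 1:
        raise ValueError("n_amino_acids must be at least 1")
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


# Protein lengths (amino acids) and gene lengths (bp) as reported in the text.
REPORTED = {"N": (422, 1269), "M": (221, 666)}

for name, (n_aa, reported_bp) in REPORTED.items():
    computed_bp = orf_length_bp(n_aa)
    status = "consistent" if computed_bp == reported_bp else "mismatch"
    print(f"{name}: {n_aa} aa -> {computed_bp} bp (reported {reported_bp} bp, {status})")
```

Both reported gene lengths match the computed values when the stop codon is included.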

For simplicity, we refer to the recombinant genes *N-His*<sub>6</sub> and *M-His*<sub>6</sub> as the *N* and *M* genes. The purified PCR products were cloned into the *EcoRV*-linearized pBlueScript SK(+) (pBS) cloning vector (Stratagene) to give pBS-N and pBS-M, sequenced for authenticity and sub-cloned by *ClaI/SalI* into the pPVX201 plant vector (pPVX-N and pPVX-M) or by *EcoRI/NotI* into the pVAX1 vector (pVAX-N and pVAX-M). After transformation of XL1 blue cells, pPVX-N and pPVX-M plasmid DNAs were purified (maxiprep kit, Qiagen) for plant infection, while pVAX-N and pVAX-M plasmids were purified with endotoxin-free purification kits (Plasmid Maxi kit LPS-free, Qiagen) for transfection of mammalian HEK-293 cells.

The *M* gene was also sub-cloned by *XbaI/SalI* from the pBS-M construct into the pBI-121 plant binary vector (Clontech Laboratories, Acc. No. AF485783), obtaining the pBI-M construct. In this construct, the *M* gene is inserted between the CaMV 35S promoter and the NOS terminator. Characteristic components of the vector are: right and left borders (RB, LB), gene for kanamycin resistance (NPT II) under the transcription promoter and terminator of the nopaline synthase gene of *A. tumefaciens* (**Figure 6A**).

The pBI-M construct, obtained by substitution of the *GUS* gene with the *M* gene, was used to transform the GV3101 and C58C1 strains of *A. tumefaciens.*

#### Expression and Purification of N and M Proteins in Bacteria

The expression and purification of the N and M proteins were done according to standard protocols (The QIAexpressionist, Qiagen). Since it was previously described that the M protein is toxic in bacteria (Carattoli et al., 2005), culture growth was performed at a suboptimal temperature of 30°C. In these conditions, we isolated a clone able to express the M protein in *E. coli*. Characterization revealed that it corresponded to a spontaneously mutated *M* gene, coding for a protein carrying the three substitutions K13 *>* R, F36 *>* L, and I160 *>* V (MRLV).

The MRLV and the N proteins were purified by Ni-NTA affinity chromatography in denaturing conditions. For the N protein, when purified in native conditions, imidazole concentration in the wash buffer was 30 mM.

Quantification of the N protein purified in denaturing and native conditions was performed by Coomassie-stained SDS-PAGE, comparing the intensity of the band at 50 kDa to known amounts of BSA using a Chemidoc ImageLab system with ImageLab 4.0 software (Bio-Rad).

#### Expression of N and M Proteins in Mammalian Cells

HEK-293 cells were transfected in six-well plates with the pVAX-N or pVAX-M constructs according to standard protocols (Effectene Transfection Reagent, Qiagen). At 24 h post-transfection (pt), the proteasome inhibitor MG-132 was added to some wells at 12.5 µM. At 48 h pt, both treated and untreated samples were harvested and prepared for protein expression analysis (immunoblotting and immunofluorescence). After three washes with cold phosphate-buffered saline (PBS: 21 mM Na2HPO4, 2.1 mM NaH2PO4, 150 mM NaCl, pH 7.2), cells were centrifuged at 800 rpm for 5 min, recovered, re-suspended in SDS-loading buffer (10% glycerol, 60 mM Tris-HCl pH 6.8, 0.025% bromophenol blue, 2% SDS, 3% 2-mercaptoethanol) and boiled for 10 min.

For immunofluorescence analysis, HEK-293 cells were grown on multi-chamber glass slides and transfected at 40% confluency. At 24 h post-transfection, cells were washed three times with PBS, fixed with 4% paraformaldehyde for 10 min and permeabilized with 0.1% Triton X-100 in PBS. Samples were blocked with 5% non-fat dry milk in PBS. For detection of the N and M proteins, cells were incubated with specific polyclonal antibodies (pAb) that were previously described and validated in immuno-cytochemical assays on SARS-CoV-infected cells (Carattoli et al., 2005). For the N protein we used a mouse anti-N pAb (obtained after immunization of animals with the purified His6-N protein produced in *E. coli*) at 1:800 dilution. For the M protein we used a mouse anti-M pAb (obtained by immunizing mice with the recombinant cytoplasmic domain, amino acids 138–222, of the M protein produced in *E. coli*) at 1:400 dilution. Cells were then incubated with a 1:300 dilution of an anti-mouse biotinylated secondary antibody (GE Healthcare) and a 1:50 dilution of FITC-conjugated streptavidin (Sigma–Aldrich, GmbH, Steinheim, Germany). Nuclei were counterstained with DAPI (2 µg/ml) in PBS. Slides were examined using a fluorescence "Axiolab Zeiss" microscope (Oberkochen, Germany) interfaced with a Coolsnap CCD camera (Roper Scient., Princeton, NJ, USA).

#### Expression of N and M Proteins in *Nicotiana benthamiana*

Two leaves of 4-week-old *N. benthamiana* plants were dusted with carborundum powder and inoculated with 10 µg of the plasmids pPVX-N, pPVX-M, or pPVX201 (empty vector, negative control) diluted in 100 µl bi-distilled water. Additional plants were treated with 100 µl bi-distilled water (mock-infected) as a reference for monitoring viral infection symptoms. Plants were grown under 16 h daylight at 22°C in a containment greenhouse (bio-safety level 2) and observed daily for PVX infection signs. Inoculated and symptomatic leaves were harvested and stored at −80°C until use.

Four-week-old *N. benthamiana* plants were infiltrated with *A. tumefaciens* C58C1 and GV3101 cultures harboring the pBI-M construct that had been grown and induced as described (Kapila et al., 1997). Plants were subsequently grown as described above, and leaf disks collected 3, 4, 5, 6, 7, and 10 days post-infiltration (dpi) were homogenized with an ultraturrax in three volumes (w/v) of SDS-loading buffer and boiled for 3 min.

To analyze N and M protein accumulation in plant tissues, soluble proteins were extracted from plant material using different buffers, depending on their hydrophobic properties. Crude plant extracts were prepared by grinding the tissue to a fine powder in liquid nitrogen. The powder was re-suspended and homogenized with an ultraturrax in three volumes (w/v) of PBS or, alternatively for the M protein, in one volume (w/v) of GB buffer (100 mM Tris-HCl pH 8.1, 10% glycerol, 400 mM sucrose, 5 mM MgCl2, 10 mM KCl, 10 mM 2-mercaptoethanol) containing protease inhibitors ("complete, EDTA-free," Roche Diagnostics, GmbH, Mannheim, Germany). Tissue homogenates were centrifuged at 4°C, 12,000 × *g*, for 10 min. The supernatant was transferred to a fresh tube and kept on ice (or at 4°C) until use, and total soluble protein (TSP) content was estimated by the Bradford assay (Bio-Rad Inc., Segrate, Italy). Pellets were resuspended in appropriate volumes of SDS-loading buffer and prepared as described in the paragraph below, constituting the insoluble fraction of leaf extracts.

Homogenized tissues of pPVX-N infected plants were also used to inoculate *N. benthamiana* plants to propagate the infectious recombinant PVX particles until the third round of infection.

A small-scale purification of plant-produced N protein was performed. In brief, lyophilized leaf material was ground in liquid nitrogen, re-suspended and homogenized with an ultraturrax in 8 M urea, 100 mM NaH2PO4, 1 mM Tris, pH 8.0, and incubated for 1 h at RT with gentle shaking. Leaf material was finally lysed on ice by sonication at 10 Hz output, three times for 10 s each. Cell debris was collected by centrifugation and discarded. The recovered supernatant was filtered (0.45 µm) and incubated for 2 h with the Ni-NTA resin. Plant N protein purification was performed by decreasing the pH (The QIAexpressionist). Protein elution was obtained at pH 4.5.

#### Immunoblotting

For immunoblotting analysis of the N protein, plant extracts containing 20 µg TSP and purified protein were boiled for 3 min in SDS-loading buffer. Immunoblotting detection of the M protein was performed on plant extracts incubated at 40 °C for 20 min in SDS-loading buffer. Samples were separated by 12% SDS-PAGE and transferred onto PVDF membranes (Immobilon-P, Millipore). After membrane blocking with 5% non-fat dry milk in PBS (MPBS) overnight (O/N) at 4 °C, membranes were incubated for 2 h at room temperature (RT) either with an anti-His6 monoclonal antibody (mAb; H1029-clone HIS-1, Sigma) or with specific polyclonal antibodies previously described (Carattoli et al., 2005). For N protein detection, we used a rabbit anti-N pAb (obtained after immunization of animals with the purified His6-N protein produced in *E. coli*) at 1:3000 dilution. Immune complexes were revealed by 1 h incubation at RT with an anti-rabbit biotinylated secondary antibody (B8895, Sigma) at 1:5000 dilution, followed by horseradish peroxidase (HRP)-conjugated streptavidin (RPN1231, GE Healthcare) at 1:2000 dilution.

For M protein detection, membranes were incubated with the mouse anti-M pAb at 1:5000 dilution, followed by incubation with an anti-mouse HRP-conjugated secondary antibody at 1:10000 dilution (NA931, GE Healthcare).

For both proteins, the bound antibody was detected using the ECL Plus system ("Enhanced Chemi-Luminescence", GE Healthcare).

## Quantitative Triple Antibody Sandwich (TAS) ELISA Assay of Leaf Extracts

The amount of N protein in the plant extracts was estimated by a quantitative triple antibody sandwich (TAS) ELISA. Symptomatic systemic leaves from 15 plants infected with pPVX-N were collected and pooled. Three independent extractions were performed and analyzed. Microtiter plates were coated with 100 µl/well of rabbit anti-N pAb (diluted 1:3000 in PBS) for 3 h at 37 °C, followed by incubation with 100 µl of plant extract with a normalized TSP content (50 µg) for 16 h at 4 °C. Wells were then blocked with 150 µl/well of 5% MPBS at 37 °C for 3 h. The captured N protein was detected by incubating at 37 °C for 2 h with 100 µl/well of a 1:1500 dilution in 2% MPBS of a mouse anti-N pAb (obtained after immunization of animals with the purified His6-N protein produced in *E. coli*, Carattoli et al., 2005), followed by incubation with a 1:10000 dilution of HRP-conjugated goat anti-mouse IgG antibody. After each step, wells were washed three times with PBS + 0.1% Tween 20 and once with PBS. Enzymatic activity was measured by adding 100 µl/well of v/v H2O2/ABTS [2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid)] (KPL Inc., Gaithersburg, MD, USA) at RT in the dark. The absorbance of the samples was read after 30 min at 405 nm on an ELISA microtiter plate reader. Known amounts of *E. coli*-purified N protein (0.5, 2, 5, 20, 50, 100, and 150 ng) were used as a standard. To give a better estimate of N protein yields in plants, the standard was diluted in *N. benthamiana* extract.
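The quantification step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis: the text does not state how the standard curve was fitted, so piecewise-linear interpolation between standards is an assumption, and the absorbance values in `HYPOTHETICAL_A405` are invented purely to make the example runnable. Only the standard amounts (0.5 to 150 ng) and the 50 µg TSP per well come from the protocol.

```python
from bisect import bisect_left

# Standard amounts from the protocol: ng of E. coli-purified N protein per well.
STANDARD_NG = [0.5, 2, 5, 20, 50, 100, 150]

# HYPOTHETICAL absorbance readings (A405) for those standards, for illustration
# only; real values must come from the plate reader.
HYPOTHETICAL_A405 = [0.05, 0.10, 0.20, 0.50, 0.90, 1.40, 1.80]


def interpolate_ng(a405, standard_a405, standard_ng=STANDARD_NG):
    """Estimate antigen amount (ng) from absorbance by piecewise-linear
    interpolation between neighbouring standards (an assumed fitting method).

    standard_a405 must be strictly increasing and aligned with standard_ng.
    Readings outside the standard range raise ValueError instead of being
    extrapolated.
    """
    if len(standard_a405) != len(standard_ng):
        raise ValueError("standards and absorbances must have equal length")
    if any(b <= a for a, b in zip(standard_a405, standard_a405[1:])):
        raise ValueError("standard absorbances must be strictly increasing")
    if not standard_a405[0] <= a405 <= standard_a405[-1]:
        raise ValueError("reading lies outside the standard curve")

    i = bisect_left(standard_a405, a405)
    if standard_a405[i] == a405:
        return float(standard_ng[i])
    x0, x1 = standard_a405[i - 1], standard_a405[i]
    y0, y1 = standard_ng[i - 1], standard_ng[i]
    return y0 + (y1 - y0) * (a405 - x0) / (x1 - x0)


def percent_of_tsp(antigen_ng, tsp_ug=50.0):
    """Express an antigen amount (ng) as a percentage of the TSP loaded (µg)."""
    return antigen_ng * 100.0 / (tsp_ug * 1000.0)
```

With the hypothetical readings above, a well reading exactly at the 100 ng standard would correspond to 0.2% of the 50 µg TSP loaded. Refusing to extrapolate is a deliberate choice in this sketch: samples that fall outside the curve would need to be diluted or re-assayed.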

## Antigenicity Assay of the N Protein Produced in Plant with Patient Sera

A 'multi-strip' western blot assay was chosen to evaluate the antigenicity of the plant-produced N protein. It is an immunoblotting procedure modified from Kiyatkin and Aksamitiene (2009), in which the sample, loaded in a single-well polyacrylamide gel (a standard gel where a single preparative well is cast), is blotted onto a membrane that is then cut into strips of the same width, each containing the same amount of protein. Each strip is then incubated with different antibodies or sera and developed by a colorimetric assay (see Ciufolini et al., 1999; Di Bonito et al., 2002).

In our experiments, single-well 12% SDS-PAGE gels were loaded with one of the following preparations: (i) *E. coli*-purified N protein (7.5 µg); (ii) plant-purified N protein (3.7 µg); (iii) plant extract from pPVX-N symptomatic leaves containing about 3.7 µg of N protein and 2 mg TSP; (iv) plant extract from pPVX201 symptomatic leaves containing about 2 mg TSP.

Samples (iii) and (iv) were prepared as follows: the powder ground from pPVX-N or pPVX201 symptomatic leaves was resuspended and homogenized with an ultraturrax in one volume (w/v) of TN buffer (25 mM Tris-HCl pH 7.2, 150 mM NaCl) containing protease inhibitors. Tissue homogenates were centrifuged at 4 °C, 15,000 × *g*, for 15 min. TSP was calculated by Bradford assay and precipitated with trichloroacetic acid (TCA) at 25% final concentration. After 30 min incubation on ice, samples were centrifuged, and pellets were washed with cold 80% acetone, dried, and re-suspended in SDS loading buffer. Gels were blotted onto PVDF membranes using a semi-dry system (Bio-Rad).

After blocking with 5% MPBS, membranes were dried at RT and cut into 15 strips, each 0.5 cm wide. Each strip of samples (ii) and (iii) contained approximately 250 ng of the N protein (as plant-purified protein or in plant extract, respectively). As controls, strips containing 500 ng of *E. coli*-produced N protein [sample (i)] were prepared, as well as strips of plant extracts without the N protein [extract from pPVX201 symptomatic leaves, sample (iv)]. Before incubation with human sera, one strip from either plant-derived or *E. coli*-expressed N protein was incubated with the mouse anti-N pAb, as previously described (Carattoli et al., 2005), to confirm protein presence and to check that the amount of protein in the strips was sufficient for colorimetric detection.
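The per-strip amounts quoted above follow from dividing the gel load evenly among the 15 strips. The short Python sketch below reproduces that arithmetic; it assumes the protein is distributed uniformly across the blotted membrane and that the 15 strips together cover the whole preparative lane. The function name is illustrative only.

```python
def ng_per_strip(loaded_ug, n_strips=15):
    """Protein per strip (ng) when a single-well load (µg) is split evenly."""
    return loaded_ug * 1000.0 / n_strips


# Loads taken from the text: 7.5 µg E. coli-purified N, 3.7 µg plant-derived N.
ecoli_strip_ng = ng_per_strip(7.5)   # the 500 ng control strips
plant_strip_ng = ng_per_strip(3.7)   # just under 250 ng per strip

# For the crude extracts (about 2 mg TSP loaded), the same division gives the
# total soluble protein carried by each strip, in µg.
tsp_per_strip_ug = 2000.0 / 15
```

By the same reasoning, each crude-extract strip carries roughly 133 µg of TSP alongside the N protein.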

The strips were incubated with pools of SARS sera collected during the SARS outbreak in Hong Kong in 2003 (Chan et al., 2005; 86 SARS patients pooled in seven groups) at 1:100 dilution in 3% MPBS, O/N at 4 °C. Blood samples were collected with informed written consent. The study was approved by the institutional human research ethics committee (The Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee).

As non-SARS controls, strips were incubated with three pools of sera from 30 patients affected by unrelated respiratory diseases (Carattoli et al., 2005). Antigen-antibody complexes were revealed by a rabbit anti-human IgG (H + L) HRP conjugate (Southern Biotechnology Associates, Inc., Cat. No. 6145-05) diluted 1:5000 in 3% MPBS. After 1 h of incubation at RT, the colorimetric reaction on the strips was induced by adding the substrate 3,3′-diaminobenzidine tetrahydrochloride (DAB; Sigma D-5637) and hydrogen peroxide.

# RESULTS

## Expression and Purification of Plant-Derived SARS-CoV N Protein

To produce the recombinant SARS-CoV N protein, *N. benthamiana* plants (4 weeks old) were infected with the pPVX-N DNA plasmid harboring the N gene (**Figure 1A**). As a control, plants were either mock infected or infected with the wild type pPVX201 vector. While mock infected plants showed no symptoms, typical symptoms (mainly chlorotic spots) generally appeared on the inoculated leaves of plants infected with pPVX-N or pPVX201 vectors 4–5 days post inoculation (dpi). The infection spread systemically to apical leaves, where symptoms appeared 7–10 dpi.

FIGURE 1 | Potato Virus X-mediated expression of SARS-CoV N protein in *Nicotiana benthamiana* plants. (A) Schematic representation of the pPVX-N construct used for the expression of SARS-CoV N protein in *N. benthamiana* leaves. CaMV 35S: constitutive 35S promoter from the cauliflower mosaic virus; Nos-ter: transcription terminator of the nopaline synthase gene of *Agrobacterium tumefaciens*; RdRp: PVX replicase gene; M1-3: PVX triple gene block for cell-to-cell movement; CP: PVX coat protein gene; SgP: coat protein sub-genomic promoter; N: SARS-CoV N gene; His6-tag: histidine tag. (B) Immunoblotting of TSP extracted from *N. benthamiana* plants infected with pPVX-N. For each sample, 20 µg TSP were loaded on gel. Lane 1: N protein purified from *Escherichia coli* under native conditions (40 ng); lane 2: pPVX-N inoculated leaves; lane 3: pPVX201 inoculated leaves (negative control); lane 4: pPVX-N symptomatic systemic leaves; lane 5: pPVX201 symptomatic systemic leaves (negative control); lane 6: molecular weight marker (Magic Mark, Invitrogen); lane 7: pPVX-N symptomatic systemic leaves extract stored at −20 °C for 2 months. Immunoblotting was performed with the rabbit anti-N pAb.

To examine whether the N protein accumulated in infected plants, soluble protein extracts were analyzed by ELISA and immunoblotting.

Interestingly, after infection with the pPVX-N plasmid, protein expression corresponded to symptoms in systemic leaves in all plants analyzed (about 100). This result indicates that the construct is stable and that the recombinant virus PVX-N can spread systemically within the inoculated plant. Immunoblotting of soluble protein extracts from plants infected with pPVX-N revealed a single band of about 50 kDa in systemic leaves (**Figure 1B**), suggesting the absence of proteolysis. In contrast, when the N protein is expressed in *E. coli*, additional bands of lower molecular mass are present (**Figure 1B**). These bands probably correspond to protein degradation, in accordance with previous work demonstrating the intrinsic instability and/or autolysis of this protein when expressed in bacteria (Mark et al., 2008). Furthermore, pPVX-N-derived extracts stored at −20 °C for 2 months and then thawed (**Figure 1B**), as well as extracts obtained from freeze-dried pPVX-N-infected leaves (data not shown), showed an intact N protein. Immunoblotting of mammalian cells transfected with the pVAX-N plasmid also revealed the presence of a single band of approximately 50 kDa corresponding to the N protein (**Figure 2A**). Immunofluorescence revealed a cytoplasmic localization of the N protein (**Figure 2B**).

Plant extracts deriving from PVX-N-infected leaves were also used for subsequent inoculations. N protein expression was confirmed at least until the third cycle of re-infection (Supplementary Figure S1), further demonstrating the stability of the construct and of the recombinant virion.

The amount of recombinant N protein expressed in leaves, as measured by TAS-ELISA, was approximately 3–4 µg/g fresh leaf weight, corresponding to 0.2% TSP (**Figure 3**).
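The two ways of expressing the yield (per gram of fresh leaf and as a fraction of TSP) can be related by simple arithmetic. The Python sketch below back-calculates the TSP content per gram of fresh leaf that would make both figures consistent. That TSP content is not reported in the text, so the result is only an implied value under the assumption that both figures describe the same extracts; the function name is illustrative.

```python
def implied_tsp_mg_per_g(antigen_ug_per_g, percent_of_tsp):
    """TSP content (mg per g fresh weight) implied by an antigen yield given
    both as µg/g fresh weight and as a percentage of TSP."""
    tsp_ug_per_g = antigen_ug_per_g / (percent_of_tsp / 100.0)
    return tsp_ug_per_g / 1000.0


# Reported range: 3-4 µg N protein per g fresh leaf weight at 0.2% TSP.
low = implied_tsp_mg_per_g(3.0, 0.2)
high = implied_tsp_mg_per_g(4.0, 0.2)
```

Under that assumption, the reported range corresponds to roughly 1.5 to 2 mg of TSP per gram of fresh leaf.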

FIGURE 2 | Expression of SARS-CoV N protein in mammalian cells. (A) Immunoblotting of total proteins extracted from HEK-293 cells transfected with the pVAX-N plasmid. For each sample, total proteins from 1 × 10<sup>5</sup> cells were loaded on gel. Lane 1: N protein purified from *E. coli* under native conditions (20 ng); lanes 2, 3: HEK-293 cells, 24 h post-transfection with the pVAX-N plasmid, with or without the addition of proteasome inhibitor MG-132, respectively; lane 4: negative control, HEK-293 cells transfected with the pVAX empty vector. Immunoblotting was performed with the rabbit anti-N pAb. (B) Immunofluorescence of HEK-293 cells transfected with the pVAX-N plasmid (N) or with the pVAX empty vector [(−) CTR]. Immunofluorescence was performed with the rabbit anti-N pAb and nuclei were counter-stained with DAPI (100x objective, ZEISS ACHROSTIGMAT 100x/1,25 oil). Scale Bar = 5 µm.

(recorded at 30 min) of the following samples: bar 1: pPVX201 infected systemic leaves (negative control); bar 2: pPVX-N systemic leaves; bars 3–9: purified N protein produced in *E. coli* (0.5, 2, 5, 20, 50, 100, and 150 ng, respectively) diluted in *N. benthamiana* extract. Error bars represent standard deviations of three technical replicates, i.e., three separate extractions from a pool of leaves from 15 plants (pPVX201 or pPVX-N).

(Prestained Marker, Invitrogen); lanes 2–3: first and second elution fractions (purification performed under denaturing conditions); lanes 4, 5, 6: BSA 1, 2, and 5 µg, respectively; lanes 7–8: first and second elution fractions (purification performed under native conditions). (B) Silver-stained SDS-PAGE of the N protein purified from *N. benthamiana* by affinity chromatography. Lane 1: N protein purified from *E. coli* under denaturing conditions (50 ng); lanes 2–3: first and second elution fractions of N protein purification from *N. benthamiana* (purification performed under denaturing conditions).

Although the N protein accumulated mainly in the soluble fraction in all the expression systems used (plant, bacteria and mammalian cells), for purification the best recovery of the N protein from *E. coli* was obtained in denaturing conditions (3 mg of protein/liter of culture under denaturing conditions versus 0.4 mg protein/liter of culture under native conditions, **Figure 4A**). Therefore, we decided to perform N protein purification in denaturing conditions also from plant tissue. We performed a small-scale purification by loading plant extracts derived from freeze-dried leaves (obtained from a pool including primary-infected and re-infected systemic leaves) on a Ni-NTA affinity purification column. In this way, we obtained yields of about 1 µg of purified N protein/g of fresh leaf weight (**Figure 4B**).
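As a rough guide to how much of the expressed protein survives purification, the purified yield can be compared with the expression level measured by TAS-ELISA (3–4 µg/g fresh leaf weight). The Python sketch below does this comparison. It assumes that both values refer to the same fresh-weight basis and to comparable leaf material, which the text does not state explicitly, so the result is an estimate rather than a reported figure.

```python
def recovery_percent(purified_ug_per_g, expressed_ug_per_g):
    """Purified yield as a percentage of the expressed amount (same units)."""
    return purified_ug_per_g * 100.0 / expressed_ug_per_g


# About 1 µg/g purified versus 3-4 µg/g expressed (values from the text).
recovery_low = recovery_percent(1.0, 4.0)    # against the upper expression bound
recovery_high = recovery_percent(1.0, 3.0)   # against the lower expression bound
```

On these assumptions, the small-scale Ni-NTA purification recovered roughly a quarter to a third of the N protein present in the leaves.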

## Antigenicity of Plant-Derived SARS-CoV N Protein

The ELISA and immunoblot analyses gave a first indication of the antigenic features of the plant-derived N protein. In these analyses, the N protein was specifically recognized by rabbit and mouse anti-His6-N hyper-immune sera that had previously been shown to recognize the N protein in SARS-CoV infected cells (Carattoli et al., 2005).

To further characterize the N protein expressed in plants, we analyzed its reactivity with sera from SARS patients, collected during the SARS outbreak in Hong Kong in 2003. These SARS sera were previously screened by an ELISA assay with an *E. coli*-expressed N protein (Di Bonito and Chan, data not shown). Here, to evaluate the reactivity of SARS-positive sera with the plant-expressed N protein, we used a 'multi-strip' western blot assay. In a first experiment, we evaluated the reactivity of the purified N protein with a pool of 5 SARS sera. As shown in **Figure 5A**, both preparations of purified N protein, from *E. coli* and from plants, are recognized by SARS sera as well as by the mouse anti-N pAb (positive control). We then validated the results obtained for the plant-purified N protein by analyzing its reactivity with six other groups of SARS sera deriving from 86 patients and with three groups of sera from 30 patients affected by non-SARS respiratory diseases (**Figure 5B**). In this experiment, we observed that the N protein purified from plants is specifically recognized by all the groups of SARS sera analyzed, while no reactivity was observed with sera from patients affected by other respiratory diseases.

We also tested the reactivity of SARS and non-SARS sera with the unpurified N protein (soluble extract of pPVX-N symptomatic leaves) and with the extract from plants infected with the pPVX201 empty vector. A weak reactivity with SARS sera was observed only for the unpurified N protein, while no reactivity was observed for the same preparation when using non-SARS sera (Supplementary Figure S2). Importantly, the pPVX201 leaf extract did not react with any human sera (Supplementary Figure S2).

To our knowledge, this is the first report of the antigenic properties of a plant-derived N protein as revealed by direct serology using SARS patient sera, suggesting its possible use for the development of SARS diagnostic assays.

## Expression of Plant-Derived SARS-CoV M Protein

We investigated the ability of plant expression systems to cope with the synthesis of M protein. We started our studies on M protein expression in *N. benthamiana* plants (4-week-old) by infection with the pPVX-M plasmid harboring the *M* gene, as described for the N protein. In this case, too, typical symptoms generally appeared 4–5 dpi on inoculated leaves and 7–10 dpi on systemic leaves. However, M protein was detected in only 2 of the 100 plants analyzed (data not shown).

Hence, we investigated the possibility of obtaining the M protein by agroinfiltration. *N. benthamiana* leaves were infiltrated with *A. tumefaciens* suspensions (strains C58C1 and GV3101) harboring the pBI binary vector containing the *M* gene (pBI-M vector, **Figure 6A**). With this simple technology, and without the use of any post-transcriptional gene silencing suppressors, we were able to detect SARS-CoV M protein production in plants. The time-course experiment revealed that the M protein is expressed mainly at 4 and 5 dpi (data not shown). For immunoblotting analysis, the M protein was extracted with an appropriate buffer (GB buffer) and, since boiling was previously reported to cause M protein aggregation and precipitation (Lee et al., 2005), boiling of the samples was avoided. Interestingly, the M protein accumulated almost totally in the soluble fraction of the plant extract (**Figure 6B**). As described in the previous paragraph for the N protein, to better characterize the plant-derived M protein we worked, at the same time, on M protein expression in bacteria and mammalian cells. Due to the toxic properties of the M protein for *E. coli* (Carattoli et al., 2005), we performed colony selection and M protein expression under sub-optimal growth conditions (30 °C). In this way, the M protein, with a mass of about 25 kDa, was purified by Ni-NTA affinity chromatography in denaturing conditions, obtaining yields of about 100 µg/l culture (data not shown). DNA sequence analysis revealed the presence in the *M* gene of three spontaneous point mutations: R13 > K in the N-terminal domain, L36 > F in the first transmembrane domain, and V160 > I in the cytoplasmic domain (MRLV). Like the plant-derived recombinant M protein, the MRLV was also specifically recognized by the mouse anti-M pAb (**Figure 6C**) that had previously been validated by Immunofluorescence Antibody Assay (IFA) in SARS-CoV infected Vero cells (Carattoli et al., 2005). Although we were not able to express the M protein with its original aa sequence in bacteria, the MRLV protein was useful as a standard for characterizing the plant-produced M protein. Immunoblotting revealed that, in contrast to prokaryotic cells, the plant system allowed the expression of the full-length M protein, especially with the *Agrobacterium* C58C1 strain (**Figure 6B**). The M protein yield, calculated by immunoblotting using the quantified MRLV protein purified from *E. coli* as standard, was estimated to be 0.1–0.15% TSP. The mobility of the plant-derived M protein in SDS-PAGE was reduced compared to the mutated bacterial form (**Figure 6C**), suggesting a higher molecular mass of the plant-expressed compared to the *E. coli*-expressed M protein.

in *E. coli* (500 ng); strips 2, 4: purified N protein produced in plant (250 ng). Strips 1 and 2: probed with a pool of 5 SARS patient sera. Strips 3 and 4: probed with the mouse anti-N pAb (positive control). (B) Validation of the results shown in panel A for the N protein produced in plant. Each strip derives from the blotting of a single-well SDS-PAGE gel (shown in C) loaded with 3.7 µg of purified N protein produced in plant. Each strip contains approximately 250 ng of N protein. Strip 1 was probed with the mouse anti-N pAb (positive control). Strips 2–7: probed with SARS patient sera (different pools deriving from several SARS patients). Strips 6–8: probed with pools of sera deriving from patients affected by non-SARS respiratory diseases (negative control). (C) Replica gel of the experiment shown in panel B stained by Coomassie Brilliant Blue for loading control of the purified N protein from plant.

No attempts were made to purify the M protein from plants, because of the low estimated expression levels.

The M protein was also expressed in mammalian cells. Immunofluorescence on M-transfected HEK-293 cells using the mouse anti-M pAb revealed that M protein is primarily localized in the plasma membrane (**Figure 7**). Similar results were previously reported by Tseng and collaborators (Tseng et al., 2010), who described a perinuclear and plasma membrane localization of the M protein expressed in various cell lines. However, we did not observe M protein expression by immunoblotting, either in transfected cells or in the culture medium (not shown), probably because of its low expression level, as we reported in *N. benthamiana* plants by PVX-mediated infection.

# DISCUSSION

The importance of developing effective therapeutic and preventive strategies that can be readily applied to newly emerging pathogens is underscored by the two novel coronaviruses that have emerged in humans in the twenty-first century: severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). Both viruses cause acute respiratory distress syndrome and are associated with high mortality rates. There are no clinically approved vaccines or antiviral drugs available for either of these infections; thus, their development represents a research priority (Graham et al., 2013).

Severe acute respiratory syndrome coronavirus was the first massive infectious disease outbreak and it still has the potential to cause large-scale epidemics in the future. The key to preventing and controlling a future outbreak of SARS is to develop rapid and specific diagnostic methods so that suspected patients can be correctly assessed. Moreover, effective and safe treatment/vaccination will be extremely important in minimizing the damage of a new pandemic. Enormous efforts have been undertaken to these ends. The four major diagnostic methods available for SARS include viral RNA detection by RT-PCR, detection of virus-induced antibodies by immunofluorescence assay (IFA) or by enzyme-linked immunosorbent assay (ELISA) of nucleocapsid protein (N), and inoculation of patient specimens in cell culture (World Health Organization [WHO], 2003 SARS: Laboratory diagnostic tests).

The SARS-CoV N protein, expressed at early stage of infection and triggering a strong antibody response by the host, is considered to be the best diagnostic target (Surjit and Lal, 2008). Plasmon resonance-based biosensors (Huang et al., 2009; Park et al., 2009) and nanowire/carbon nanotube transistors (Ishikawa et al., 2009) have been developed for the detection of SARS-CoV N protein in patient sera. Such sensors offer real-time detection of nanomolar concentrations of the protein. Nevertheless, SARS tests should also have other useful features such as cost-effectiveness and ease of operation. Moreover, combinations of antigens may be necessary to provide a definitive diagnosis of SARS in humans and susceptible animals (Roper and Rehm, 2009).

The production of recombinant N protein has been achieved in a variety of heterologous expression systems. A synthetic gene with optimized codons has been expressed in *E. coli* at high yield (Das and Suresh, 2006), but it was demonstrated that bacterially expressed N protein produces false seropositivity owing to interference of bacterially derived antigens (Leung et al., 2006; Yip et al., 2007; Surjit and Lal, 2008) or cross-reacts with antisera of patients infected with human coronaviruses (HCoV-OC43 and HCoV-229E) (Woo et al., 2004). An interesting study correlated the phosphorylation state of the N protein with its antigenicity and the specificity of antibody recognition (Shin et al., 2007). These data underline the importance of producing the recombinant protein in eukaryotic platforms such as insect cells, yeast, or plants to set up more efficient and specific diagnostic tests. To date, it has been demonstrated that the N protein produced in insect cells may be useful for the development of highly sensitive and specific assays to determine SARS infection (Shin et al., 2007). Later, the N protein was transiently expressed in plants by agroinfiltration, and its antigenicity was demonstrated in mice (Zheng et al., 2009), giving a proof of concept of its use in vaccine formulations. Previously, the S1 domain of SARS-CoV S protein had been stably expressed in tomato and low-nicotine tobacco plants, obtaining about 0.1% TSP (Pogrebnyak et al., 2005). The plant-derived antigen was able to induce systemic and mucosal immune responses in mice.

While previous works have primarily focused on the S and N proteins, there is growing evidence of the potential of the M antigen as both an effective vaccine and a diagnostic candidate. Thus, obtaining the recombinant full-length M protein in a eukaryotic expression system also represents a good way to develop an effective and safe SARS-CoV vaccine. In addition to the knowledge that the M protein elicits a strong humoral response, and that a specific humoral and cellular immune response can be obtained by co-expressing S, M and E (Lu et al., 2007), it has been demonstrated that the M protein also contains T cell epitopes (Liu et al., 2010). The availability of recombinant M protein, in combination with other viral proteins, might overcome the concern about the sensitivity and the specificity of N nucleoprotein-based assays, as described when using the N and the S proteins together (Woo et al., 2004; Haynes et al., 2007; Gimenez et al., 2009). This would help the development of more efficient reagents to detect antibodies in the infected human host.

Here, we propose the use of plants as an alternative system to produce the N and M antigens that could be useful to formulate new vaccines and diagnostic assays against SARS.

Since plant transformation and regeneration of stable transformants require considerable time, we used transient expression systems (PVX and agroinfiltration) to evaluate the ability of the plant expression system to cope with the synthesis of the SARS-CoV M and N proteins.

The N and M full-length genes of the human SARS-CoV Frankfurt I isolate were cloned, without codon optimization, into different expression vectors. For the SARS-CoV N protein, we assessed the successful ectopic expression in *N. benthamiana* plants by pPVX-mediated infection (**Figure 1**). We were able to obtain the N protein in systemic leaves in most primary-infected plants as well as in 100% of re-infected plants. These results demonstrate the stability of the construct, a condition not easily obtained especially when large sequences are inserted in the pPVX-derived expression cassette (the *N* gene is about 1300 bp). In fact, several studies report that the use of 'first generation' plant viral vectors (like the pPVX series) for the expression of proteins in plants can lead to insert elimination by natural selection over replication cycles as early as the first infection passage, with a positive correlation between insert length and elimination rate (Avesani et al., 2007).

Unlike the observed prokaryotic expression pattern, no proteolysis products were detected in immunoblotting by using polyclonal sera in pPVX-N-derived extracts, fresh or stored at −20 °C for 2 months (**Figure 1B**). These data demonstrate the stability of the recombinant N protein when transiently expressed in *N. benthamiana*. The same stability was observed when the N protein was expressed in mammalian cells, even in the absence of proteasome inhibitor (**Figure 2A**).

The purified plant-produced N protein is specifically recognized by sera from Chinese SARS patients of the 2003 outbreak, and not by sera from patients affected by unrelated respiratory diseases (**Figure 5**). This result suggests that the plant-expressed SARS-N protein is suitable for SARS diagnosis.

It is interesting to note that, when using crude plant extracts, SARS human sera reveal a weak band, corresponding to the molecular weight of the N protein, without any cross-reaction with other components of the plant extract (Supplementary Figure S2). Taken together, the results indicate that plant-derived N protein is specifically detected by antibodies of SARS patients, using an assay where the antigen-antibody complex is revealed by a colorimetric method, less sensitive but more specific and suitable for hospital clinical laboratories. It should be noted that the only information available to date about plant-derived SARS-CoV N protein antigenicity was from Zheng et al. (2009), who immunized mice and evaluated the impact on the regulation of cytokines and on the elicited IgG subclasses. Our study is the first demonstration by direct serology that plant-derived N protein is able to reveal human N-specific antibodies present in sera of SARS patients, thus providing an adequate instrument to develop a rapid, low-cost, immune-based diagnostic assay to be used as an alternative to, or in association with, molecular diagnosis.

For the M protein, we obtained a spontaneously mutated MRLV protein in *E. coli*. This made it possible to overcome the toxicity of the wild type protein when expressed in bacteria (Carattoli et al., 2005) and to purify the MRLV protein, which was then used as a positive control in our experiments. As the SARS-CoV M protein is difficult to express in recombinant form, the exact structure and function of the protein are not fully elucidated (Neuman et al., 2011; Tseng et al., 2013; Siu et al., 2014).

Unlike prokaryotic cells, the plant system allowed the expression of the full-length M protein by using *A. tumefaciens* (in particular the strain C58C1), demonstrating for the first time the possibility of expressing the SARS-CoV M protein in plants without the need for codon optimization or any further modifications. In contrast, by PVX-mediated infection we observed very low expression levels for the M protein, and only in 2 plants out of 100 analyzed. We can speculate that the membrane M protein, which already proved to be toxic in *E. coli*, may be difficult to express by using a 'living vector' like PVX due to interference with virus replication and/or expression/assembly of viral components, while it is tolerated by the plant when expressed by agro-infiltration.

Compared to the mutated MRLV protein produced in *E. coli*, the M protein produced in plants shows a reduced electrophoretic mobility, suggesting a higher molecular mass (**Figure 6C**). The reason for this difference remains to be elucidated, but it could be due to the modified residues in the MRLV protein or to the presence of glycosylation in the M protein produced in the eukaryotic system (the native protein is N-glycosylated at the fourth residue).

The plant-produced M protein yields were not sufficient to perform the test with human sera. Thus, for plant-derived M protein characterization, efforts should be made to enhance expression yields. This includes the use of second-generation viral vectors (Gleba et al., 2007) or other improved platforms that have been developed in recent years (Moustafa et al., 2015; Paul et al., 2015; Sack et al., 2015).

# CONCLUSION

Our results add further insights to the characterization of the N and M proteins and provide a proof of principle for using plants as a robust, rapid and flexible production system for protein reagents suitable to face potential recurring SARS-CoV outbreaks.

#### AUTHOR CONTRIBUTIONS

RF, PDB concept and design of the research, analysis and interpretation of data, writing and revising the article, final approval of the version to be published. OCD, SM design of the research, acquisition of data, analysis and interpretation of data, drafting and writing the article. EI acquisition of data, analysis and interpretation of data. DDM writing and revising the article critically for important intellectual content. PKSC design and revising the article critically for important intellectual content and final approval of the version to be published.

#### FUNDING

The work was partially supported by the Italian 'Ministero della Salute,' grant 'Progetto Speciale Lotta alla SARS.'

#### ACKNOWLEDGMENTS

We thank Dr. Alessandra Carattoli from 'Istituto Superiore di Sanità' for coordinating the SARS project. We also thank Orsola Bitti for technical assistance.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00054

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Demurtas, Massa, Illiano, De Martinis, Chan, Di Bonito and Franconi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Global Identification of the Full-Length Transcripts and Alternative Splicing Related to Phenolic Acid Biosynthetic Genes in *Salvia miltiorrhiza*

*Zhichao Xu<sup>1</sup>, Hongmei Luo<sup>1</sup>, Aijia Ji<sup>1</sup>, Xin Zhang<sup>1</sup>, Jingyuan Song<sup>1</sup>\* and Shilin Chen<sup>1,2</sup>\**

*<sup>1</sup> Institute of Medicinal Plant Development – Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China, <sup>2</sup> Institute of Chinese Materia Medica – Chinese Academy of Chinese Medical Science, Beijing, China*

#### *Edited by:*

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Jianbo Xiao, University of Macau, China Laura Bassolino, CREA, Italy Shujuan Zhao, Shanghai University of Traditional Chinese Medicine, China*

#### *\*Correspondence:*

*Jingyuan Song jysong@implad.ac.cn; Shilin Chen slchen@icmm.ac.cn*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 21 September 2015 Accepted: 19 January 2016 Published: 05 February 2016*

#### *Citation:*

*Xu Z, Luo H, Ji A, Zhang X, Song J and Chen S (2016) Global Identification of the Full-Length Transcripts and Alternative Splicing Related to Phenolic Acid Biosynthetic Genes in Salvia miltiorrhiza. Front. Plant Sci. 7:100. doi: 10.3389/fpls.2016.00100*

Salvianolic acids are among the main bioactive components in *Salvia miltiorrhiza*, and their biosynthesis has attracted widespread interest. However, previous studies on the biosynthesis of phenolic acids using next-generation sequencing platforms are limited with regard to the assembly of full-length transcripts. Based on hybrid-seq (next-generation and single-molecule real-time sequencing) of the *S. miltiorrhiza* root transcriptome, we experimentally identified 15 full-length transcripts and four alternative splicing events of enzyme-coding genes involved in the biosynthesis of rosmarinic acid. Moreover, we herein demonstrate that lithospermic acid B accumulates in the phloem and xylem of roots, in agreement with the expression patterns of the identified key genes related to rosmarinic acid biosynthesis. According to co-expression patterns, we predicted that six candidate cytochrome P450s and five candidate laccases participate in the salvianolic acid pathway. Our results provide a valuable resource for further investigation into the synthetic biology of phenolic acids in *S. miltiorrhiza*.

Keywords: *Salvia miltiorrhiza*, hybrid-seq, full-length transcripts, phenolic acid biosynthesis, alternative splicing, cytochrome P450s, laccases

# INTRODUCTION

The alternative splicing events of multiexon genes in multicellular eukaryotes can enhance the functional diversity of the encoded proteins and regulate gene expression through complex posttranscriptional mechanisms (Reddy et al., 2013). Recent alternative splicing analysis based on next-generation sequencing (NGS, Illumina) has revealed that over 60% of multiexon genes undergo alternative splicing events in plants, such as *Arabidopsis thaliana* (Marquez et al., 2012), *Glycine max* (Shen et al., 2014), *Brachypodium distachyon* (Walters et al., 2013), and *Oryza sativa* (Zhang et al., 2010). However, the short-read assembly strategy of NGS limits its capacity to precisely quantify and predict alternative splicing events. In contrast, the long reads from SMRT sequencing (single-molecule, real-time DNA sequencing using Pacific Biosciences RS II, PacBio) have demonstrated their advantage in sequencing full-length transcripts to identify and predict alternative splicing isoforms in human embryonic stem cells (Au et al., 2013; Roberts et al., 2013). Research has also addressed the disadvantage of the high sequencing error rate by correction with high-quality NGS reads (Au et al., 2012; Koren et al., 2012). Our recent study successfully demonstrated the localization of tanshinones to the root periderm and revealed the molecular mechanism of tanshinone biosynthesis using hybrid-seq (next-generation and single-molecule real-time sequencing, NGS and TGS) of the root transcriptome of *Salvia miltiorrhiza* (Xu et al., 2015).

*Salvia miltiorrhiza* Bunge is one of the most commonly used medicinal plants in Traditional Chinese Medicine (TCM), as its dried root or rhizome is of great phytochemical value in the treatment of cardiovascular diseases and inflammation, and as an anti-oxidant, among other uses (Cheng, 2006; Wang et al., 2007; Dong et al., 2011). The main active components of *S. miltiorrhiza* are hydrophilic salvianolic acids (SAs), such as rosmarinic acid (RA) and lithospermic acid B (LAB; Wang et al., 2007), and lipophilic diterpenoid components, such as tanshinones I/IIA, dihydrotanshinone, and cryptotanshinone (Lei et al., 2014). The availability of the nuclear and chloroplast genomes (Qian et al., 2013) and transcriptome (Hua et al., 2011; Luo et al., 2014), along with research related to the molecular regulation (Zhang et al., 2013, 2015; Tan et al., 2014; Li et al., 2015) and biosynthesis of its bioactive components (Guo et al., 2013, 2015; Bloch and Schmidt-Dannert, 2014), strongly favors *S. miltiorrhiza* as a potential model medicinal plant for TCM research.

There are two pathways for RA synthesis, namely, the phenylpropanoid pathway and the tyrosine-derived pathway, and many of the key genes encoding enzymes in *S. miltiorrhiza* have been identified (Di et al., 2013; Hou et al., 2013; Bloch and Schmidt-Dannert, 2014). In the phenylpropanoid pathway, phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H), and 4-coumarate-CoA ligase (4CL) sequentially catalyze the conversion of L-phenylalanine into 4-coumaroyl-CoA. In the tyrosine-derived pathway, tyrosine aminotransferase (TAT) and 4-hydroxyphenylpyruvate reductase (HPPR) sequentially catalyze the conversion of L-tyrosine into 4-hydroxyphenyllactic acid, which is then converted into 3,4-dihydroxyphenyllactic acid by an unknown CYP450 in *S. miltiorrhiza* (Di et al., 2013). Rosmarinic acid synthase (RAS) couples the products of the two pathways to form 4-coumaroyl-3′,4′-dihydroxyphenyllactic acid (4C-DHPL), and SmCYP98A78 (an allelic variant of SmCYP98A14; Chen et al., 2014) has been indicated as the specific hydroxylase that catalyzes the conversion of 4C-DHPL to RA (Di et al., 2013). Finally, oxidative dimerization of hydroxystilbene occurs, and laccase has been proposed to catalyze the oxidative reaction from RA to LAB (Giardina et al., 2010; Di et al., 2013). Although the phenolic acid biosynthetic pathway has in essence been proposed and identified, many homologous genes encoding key enzymes were uncovered by a genome-wide strategy. Indeed, a total of 28 homologous genes of *SmPAL*, *SmC4H*, *Sm4CL*, *SmTAT*, *SmHPPR*, *SmRAS*, and *SmCYP98A78* have been identified by genome annotation (Wang et al., 2015).

In this study, using the hybrid-seq transcriptome of *S. miltiorrhiza* roots, we systematically analyzed the full-length transcripts and alternative splicing events of these 28 gene loci predicted as being related to RA biosynthesis. We then analyzed co-expression patterns and predicted candidate CYP450s and laccases related to the SA pathway. Our experiments not only reveal full-length transcript and alternative splicing data but also provide a reference tool for future studies on the genes involved in the biosynthesis of phenolic acids.

# MATERIALS AND METHODS

#### Plant Resources

*Salvia miltiorrhiza* (line 99-3) plants were cultivated at the Institute of Medicinal Plant Development (IMPLAD), Chinese Academy of Medical Sciences (CAMS) in an open experimental field. Roots, stems, and flowers were collected from 3-year-old plants growing in the field on May 27th, 2014. The roots were separated into three parts (periderm, phloem, and xylem) according to morphology and microstructure. Leaves with and without MeJA treatment (12 h, 200 µM; Sigma-Aldrich, St. Louis, MO, USA) were collected from tissue culture *S. miltiorrhiza* (line 99-3) plantlets grown at 25°C under long-day conditions (16 h light/8 h dark). All of the collected tissues originated from the same clone of *S. miltiorrhiza* (line 99-3).

# Transcriptomic Data

Single-molecule real-time DNA sequencing data from pooled root tissues (periderm, phloem, and xylem) generated on the PacBio RS II platform (Pacific Biosciences of California, USA; Accession, SRX753381) and RNA-seq reads from different root tissues (periderm, phloem, and xylem) generated on the Illumina HiSeq 2500 platform (Illumina, USA) are reported in our recent study (Xu et al., 2015; Accession, SRR1640458). RNA-seq reads for different organs (root, stem, and flower) were generated using the Illumina HiSeq 2000 platform (Illumina, USA; Accession, SRP028388), and Illumina reads from leaves with and without 12 h MeJA treatment were obtained in a previous study (Luo et al., 2014; Accession, SRP051564).

# Bioinformatic Analysis

Single molecule real-time DNA sequencing data were corrected with Illumina short reads using LSC 1.alpha software (Au et al., 2012). Alternative splicing isoforms were analyzed using IDP 0.1.7 software, employing SMRT sequencing reads, Illumina short reads, and genome scaffolds (Au et al., 2013). Differential gene expression in various root tissues, organs and under MeJA treatment was analyzed using Tophat 2.0.12 and Cufflinks 2.2.1 (Trapnell et al., 2012) by mapping the Illumina short reads to *S. miltiorrhiza* genome sequences. Heat maps were constructed using R statistical project (Gentleman et al., 2004).

# Gene Structures and Phylogenetic Analysis

The alternative splicing isoforms found by IDP were viewed using the IGV 2.3.34 software (http://www.broadinstitute.org/software/igv/). The annotated gene sequences were corrected with the SMRT sequencing reads using Apollo software (Lee et al., 2013). Gene structures (e.g., intron, exon, intron phase) were also analyzed with Apollo. The full-length amino acid sequences of candidate CYP450s and laccases from *S. miltiorrhiza* and other species were aligned with MEGA 6 (Tamura et al., 2013). Neighbor-joining trees were then constructed using the bootstrap method with 1,000 replications.

#### UPLC Analysis of LAB Content

The detection methods followed the Pharmacopeia of the People's Republic of China. Periderm, phloem, and xylem samples were ground into powder (with three biological replicates for each sample), and each weighed sample of ground powder (0.2 g) was extracted with 50 mL of 75% methanol. After 1 h of heat reflux extraction, 75% methanol was added to make up for the weight lost, and the sample was filtered through a 0.45-µm syringe filter. In addition, an LAB standard was dissolved in 75% methanol at a concentration of 140 mg/L. Chromatographic separation was performed using an ACQUITY UPLC BEH C18 column (2.1 mm × 100 mm, 1.7 µm) with a mobile phase of 30% methanol, 10% acetonitrile, 1% methanoic acid, and 59% H<sub>2</sub>O in a Waters UPLC system (Waters, USA). The detection wavelength was set to 286 nm.
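The conversion from a UPLC peak area to LAB content per gram of tissue can be illustrated with a minimal sketch. It assumes a single-point external-standard calibration with a linear detector response through the origin, which the text does not specify; the function name and the peak areas are hypothetical, and only the 140 mg/L standard, the 50 mL extraction volume, and the 0.2 g sample mass come from the protocol above.

```python
# Assumption: single-point external-standard quantification (not stated in the source).
STD_CONC_MG_PER_L = 140.0   # LAB standard concentration (from the protocol)
EXTRACT_VOLUME_L = 0.050    # 50 mL of 75% methanol (from the protocol)
SAMPLE_MASS_G = 0.2         # ground powder per extraction (from the protocol)


def lab_content_mg_per_g(sample_area, standard_area):
    """Estimate LAB content (mg per g of powder) from two peak areas.

    sample_area and standard_area are hypothetical integrated peak areas
    measured at 286 nm under identical injection conditions.
    """
    # Concentration of LAB in the extract, scaled from the standard.
    extract_conc = (sample_area / standard_area) * STD_CONC_MG_PER_L
    # Total LAB mass in the extract, divided by the mass of powder extracted.
    return extract_conc * EXTRACT_VOLUME_L / SAMPLE_MASS_G


if __name__ == "__main__":
    # Illustrative areas only; these are not measurements from the study.
    print(round(lab_content_mg_per_g(500.0, 1000.0), 2))
```

A multi-point calibration curve would normally be preferred when the sample concentrations span a wide range; the single-point form is used here only to keep the arithmetic visible.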

#### Gene Expression Analysis by qRT-PCR

Nine RNA samples were isolated: seven from different *S. miltiorrhiza* tissues collected from the experimental field (periderm, phloem, xylem, root, stem, leaf, and flower) and two from leaves of tissue culture plantlets (control and 12-h MeJA treatment). Total RNA (three biological replicates for each sample) was isolated using the RNeasy Plus Mini kit (Qiagen, Germany). Reverse transcription was performed with PrimeScript™ Reverse Transcriptase (TaKaRa, Japan). The qRT-PCR primers were designed with Primer Premier 6 (Supplementary Table S1), and their specificity was verified by PCR. The qRT-PCR analysis was performed in triplicate using SYBR® Premix Ex Taq™ II (TaKaRa, Japan), with *SmActin* as a reference gene, on a 7500 real-time PCR system (ABI, USA). Relative expression levels were calculated from the Ct values using the 2<sup>−ΔΔCT</sup> method (Livak and Schmittgen, 2001). To detect differences in the expression of candidate genes among various tissues, one-way ANOVA was performed using IBM SPSS 20 software (IBM Corporation, USA). *P* < 0.01 was considered highly significant. Gene co-expression analysis of candidate genes was performed using Pearson's correlation test.
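The two calculations named above, the 2<sup>−ΔΔCT</sup> relative expression and Pearson's correlation coefficient, can be sketched as follows. This is an illustrative sketch only: the function names and all Ct values are hypothetical, the choice of calibrator sample is an assumption, and the study itself used SPSS rather than this code.

```python
from statistics import mean


def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Return 2^(-ddCt) for a sample relative to a calibrator sample.

    Each argument is a list of technical-replicate Ct values:
    target and reference gene in the sample, then in the calibrator.
    """
    d_ct_sample = mean(ct_target) - mean(ct_ref)        # dCt of the sample
    d_ct_cal = mean(ct_target_cal) - mean(ct_ref_cal)   # dCt of the calibrator
    dd_ct = d_ct_sample - d_ct_cal
    return 2 ** (-dd_ct)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical Ct values: the target gene crosses threshold two cycles
    # earlier (relative to the reference) in the sample than in the calibrator.
    fold = relative_expression([22.0, 22.0, 22.0], [18.0, 18.0, 18.0],
                               [24.0, 24.0, 24.0], [18.0, 18.0, 18.0])
    print(fold)
    # Hypothetical relative expression of two genes across four tissues.
    print(pearson_r([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.0]))
```

The 2<sup>−ΔΔCT</sup> form assumes amplification efficiencies close to 100% for both the target and the reference gene.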

# RESULTS

# Localization of SA Accumulation in *S. miltiorrhiza* Root

The rhizome or root of *S. miltiorrhiza* is the primary medicinal part of this well-known herb. The hydrophilic phenolic acids in the *S. miltiorrhiza* root are mainly distributed in the phloem and xylem. UPLC identification demonstrated a similar LAB content in the phloem and xylem, which was five times higher than that in the periderm (**Figures 1A,B**). These results provided a potential basis for co-expression analysis of SA biosynthetic genes in the *S. miltiorrhiza* root.

# Isoform Detection and Prediction of RA Biosynthetic Genes

Using the next-generation sequencing platform (Illumina), RNA-seq data (a total of 867,864,885 reads) from *S. miltiorrhiza* periderm, phloem, xylem, root, stem, flower, leaf, and leaf after 12 h of MeJA treatment were collected. Using SMRT sequencing (PacBio RS II platform), full-length cDNA libraries from pooled periderm, phloem, and xylem samples were sequenced for a long-read survey, and 796,011 subreads were employed to identify full-length transcripts and alternative splicing events by hybrid-seq using the IDP (isoform detection and prediction) pipeline.

A total of 28 candidate genes from the phenylpropanoid pathway and tyrosine-derived pathway, related to RA biosynthesis, were selected based on a genome-wide approach; these genes included *SmPALs* (3), *SmC4Hs* (2), *Sm4CLs* (10), *SmTATs* (3), *SmHPPRs* (3), *SmRASs* (6), and *SmCYP98A78* (Supplementary Table S2 and **Figure 1C**). The same approach was previously used to detect tanshinone biosynthetic genes (Xu et al., 2015). Fifteen gene loci were detected as full-length transcripts (Supplementary Figure S1), and their gene structures and intron phases are described in **Figure 2A**. *SmC4H2* might be a duplicated pseudogene of *SmC4H1* with an N-terminal deletion, as *SmC4H2* exhibits 74% homology with *SmC4H1* and is located at a distance of 7.5 kb from *SmC4H1* in the genome. *Sm4CL2*, *Sm4CL-like5*, *Sm4CL-like7*, and *SmTAT1* were identified as expressing alternatively spliced isoforms (**Figure 2B**), and all of the alternatively spliced junctions were characterized as intron retention. *Sm4CL2* and *Sm4CL-like5* each expressed two isoforms, whereas *Sm4CL-like7* and *SmTAT1* each expressed three isoforms (Supplementary Table S5). Among their respective alternative splicing events, *Sm4CL2-iso2*, *Sm4CL-like5-iso2*, and *SmTAT1-iso3* were the dominantly expressed isoforms (Supplementary Table S5), though the three isoforms of *Sm4CL-like7* all exhibited similar expression. We found that all of the intron retentions introduced premature termination codons (PTCs), and the PTC locations in *Sm4CL2-iso1*, *Sm4CL-like5-iso1*, *Sm4CL-like7-iso2*, *Sm4CL-like7-iso3*, *SmTAT1-iso1*, and *SmTAT1-iso2* were in intron 4, intron 5, intron 2, intron 3, intron 4, and intron 4, respectively (Supplementary Figure S2).
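How a retained intron can introduce a premature termination codon is illustrated by the toy sketch below. The sequences are invented for illustration and are not *S. miltiorrhiza* sequences; the helper names are hypothetical, and the check simply scans the reading frame from the start codon, which is a simplification of what an isoform annotation pipeline does.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq):
    """Return the 0-based index of the first in-frame stop codon, or None.

    The reading frame is assumed to start at position 0 (the ATG).
    """
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def has_premature_stop(spliced, retained):
    """True if the intron-retaining transcript terminates earlier than the spliced one."""
    ref_stop = first_in_frame_stop(spliced)
    alt_stop = first_in_frame_stop(retained)
    return alt_stop is not None and (ref_stop is None or alt_stop < ref_stop)


# Toy sequences (hypothetical): two exons and one intron carrying an in-frame TAG.
exon1 = "ATGGCTGAA"        # ATG GCT GAA
intron = "GTAAGTTAGCAG"    # GTA AGT TAG CAG
exon2 = "CTGTTTGGATAA"     # CTG TTT GGA TAA

spliced = exon1 + exon2            # fully spliced isoform
retained = exon1 + intron + exon2  # intron-retention isoform

if __name__ == "__main__":
    print(first_in_frame_stop(spliced))   # stop codon of the normal ORF
    print(first_in_frame_stop(retained))  # stop codon inside the retained intron
    print(has_premature_stop(spliced, retained))
```

In this toy case the stop codon lies inside the retained intron itself, matching the pattern reported for the isoforms above; a retained intron can also shift the frame so that the PTC appears in a downstream exon.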

# Expression Profiles of Candidate RA Biosynthetic Genes

In this study, analysis of differentially expressed genes in the three root tissues showed that *SmPAL1*, *SmPAL3*, *SmC4H1*, *Sm4CL3*, *Sm4CL-like1*, *Sm4CL-like4*, *SmTAT1*, *SmHPPR3*, *SmRAS*, and *SmCYP98A78* exhibited low expression in the periderm and high expression in the phloem and xylem, in accord with the distribution of LAB (**Figure 1D**). In addition, the transcript levels of *SmPAL1*, *SmC4H1*, *Sm4CL3*, *Sm4CL-like1*, *SmTAT1*, *SmHPPR3*, *SmRAS*, and *SmCYP98A78* were significantly up-regulated after 12 h of MeJA treatment (Supplementary Table S2); however, the expression of *Sm4CL2*, *Sm4CL-like4*, *Sm4CL-like6*, and *SmHPPR2* was down-regulated after MeJA treatment. *SmTAT3*, *SmHCT2*, *SmHCT3*, and *SmHCT4* were identified as silenced genes. *SmHCT1* exhibited remarkably specific expression in the root xylem, yet *SmHCT5* showed only slight expression in the stem. Phylogenetic trees for 18 hydroxycinnamoyltransferase (HCT family) amino acid sequences, including hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferases (HCS/QTs), RASs, and hydroxycinnamoyl/benzoyltransferases (HCBTs), in different species revealed clustering of five unidentified HCTs from *S. miltiorrhiza* with RASs, rather than with HCS/QTs (Supplementary Figure S3).

#### Co-expression Analysis and Isoform Identification of Candidate CYP450s

Our RNA-seq results showed opposite expression patterns for *SmCYP76AH1* and *SmCYP98A78*, in accord with the different distribution of tanshinones and phenolic acids in the periderm, phloem, and xylem. Moreover, six CYP450s were selected as candidate CYP450s related to RA biosynthesis based on the criteria of a phloem/periderm FPKM ratio greater than 1.5 and a xylem/periderm FPKM ratio greater than 1.5. The selected CYP450s included *SmCYP707A102*, *SmCYP749A39*, *SmCYP714C2*, *SmCYP92A73*, *SmCYP98A75*, and *SmCYP98A76* (Supplementary Table S3). A comprehensive evaluation of the eight RNA-seq datasets and the qRT-PCR analyses of these CYP450s, together with *SmCYP98A78* and *SmC4H1* (*SmCYP73A120*), indicated that the expression level of the candidate CYP450s was significantly up-regulated by MeJA, with the exception of *SmCYP714C2*, which was not expressed in leaves (**Figures 3C** and **4**). Furthermore, Pearson's correlation analysis of the qRT-PCR results showed highly significant co-expression of *SmCYP98A75*, *SmCYP98A76*, *SmCYP98A78*, and *SmC4H1* (*P* < 0.01).
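The selection criterion described above (both the phloem/periderm and the xylem/periderm FPKM ratios greater than 1.5) can be sketched as a simple filter. The gene names and FPKM values below are hypothetical placeholders rather than data from Supplementary Table S3, and skipping genes with zero periderm FPKM is an assumption made only to keep the ratio defined.

```python
def select_candidates(fpkm, min_ratio=1.5):
    """Return genes whose phloem and xylem FPKM both exceed min_ratio x periderm FPKM.

    fpkm maps a gene name to a dict with 'periderm', 'phloem' and 'xylem' values.
    """
    selected = []
    for gene, values in fpkm.items():
        periderm = values["periderm"]
        if periderm <= 0:
            # Ratio undefined; such genes would need separate handling (assumption).
            continue
        phloem_ratio = values["phloem"] / periderm
        xylem_ratio = values["xylem"] / periderm
        if phloem_ratio > min_ratio and xylem_ratio > min_ratio:
            selected.append(gene)
    return selected


if __name__ == "__main__":
    # Hypothetical FPKM values for three placeholder genes.
    example = {
        "geneA": {"periderm": 10.0, "phloem": 20.0, "xylem": 30.0},  # passes both ratios
        "geneB": {"periderm": 10.0, "phloem": 12.0, "xylem": 40.0},  # fails phloem ratio
        "geneC": {"periderm": 0.0, "phloem": 5.0, "xylem": 5.0},     # skipped
    }
    print(select_candidates(example))
```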

All six CYP450s were identified as full-length transcripts in the PacBio transcriptome, and their gene structures and intron phases are described in **Figures 3A,B**. In addition, gene loci *SmCYP749A39*, *SmCYP98A75*, and *SmCYP98A76* were found to undergo alternative splicing events, with each expressing two gene isoforms. All alternative splicing events of these candidate CYP450s were classified as intron retention. For their respective alternative splicing events, *SmCYP749A39-iso1*, *SmCYP98A75-iso1*, and *SmCYP98A76-iso1* were found to be the dominantly expressed isoforms (Supplementary Table S5), and intron retention for *SmCYP749A39-iso2*, *SmCYP98A75-iso2*, and *SmCYP98A76-iso2* introduced PTCs in exon 3, intron 2, and intron 2, respectively (Supplementary Figure S4).

To better understand the putative functions of these candidate CYP450s, we constructed a phylogenetic tree with 31 full-length amino acid sequences from various species, including some functionally identified CYP98As. CYP92A73 clustered into one branch with CYP76AH1, which was identified as catalyzing the miltiradiene to ferruginol step in tanshinone biosynthesis (**Figure 3D**). These two CYP450s were found to be neighbors of C4H from *S. miltiorrhiza* and *A. thaliana* (**Figure 3D**). The other predicted CYP450s, SmCYP707A102, SmCYP749A39, and SmCYP714C2, were distant from the CYP98A subfamily (**Figure 3D**).

# Co-expression Analysis and Isoform Identification of Candidate Laccases

Eighty laccases were identified through genome-wide analysis in *S. miltiorrhiza*, with five identified based on the criterion of higher expression in the phloem and xylem than in the periderm: *SmLAC1*, *SmLAC2*, *SmLAC3*, *SmLAC4*, and *SmLAC5* (**Figure 5C**, Supplementary Table S4). Furthermore, RNA-seq and qRT-PCR analyses indicated that MeJA up-regulated the expression of *SmLAC1*, *SmLAC2* and *SmLAC5* (**Figures 5C** and **6**). According to the qRT-PCR data, *SmLAC5* was significantly co-expressed (*P* < 0.05) with *SmCYP98A78* and *SmC4H1*.

All five laccases were identified as full-length transcripts in the PacBio transcriptome, and their gene structures and intron phases are described in **Figures 5A,B**. In addition, the *SmLAC1*, *SmLAC3*, *SmLAC4*, and *SmLAC5* gene loci were found to undergo alternative splicing events, with *SmLAC1*, *SmLAC3*, and *SmLAC5* each expressing two isoforms and *SmLAC4* three isoforms. All of the alternative splicing events in these selected laccases were classified as intron retention. *SmLAC1-iso1*, *SmLAC3-iso2*, *SmLAC4-iso3*, and *SmLAC5-iso1* were the dominantly expressed isoforms among their respective alternative splicing events (Supplementary Table S5), and the intron retention in *SmLAC1-iso2*, *SmLAC3-iso1*, *SmLAC4-iso1*, *SmLAC4-iso2*, and *SmLAC5-iso2* introduced PTCs in intron 2, intron 4, intron 1, intron 4, and intron 5, respectively (Supplementary Figure S5).

To predict the functions of the candidate laccases, a phylogenetic tree was constructed using 32 amino acid sequences from *Populus trichocarpa*, *Picea abies*, *Oryza sativa*, and *S. miltiorrhiza*. SmLAC3, SmLAC4, and SmLAC5 were classified into different branches with other laccases that have been described as closely correlated with lignin biosynthesis in other species (**Figure 5D**).

# DISCUSSION

In this study, we analyzed full-length transcripts and alternative splicing events related to phenolic acid biosynthesis in different root tissues of *S. miltiorrhiza* by combining NGS and TGS technologies. Previous studies have only cloned a small number of full-length genes, such as *SmPAL1*, *SmC4H1*, *SmTAT1*, *SmHPPR1*, *SmRAS*, and *SmCYP98A78*, and identified their functions (Huang et al., 2008; Song and Wang, 2009, 2011; Xiao et al., 2009a, 2011; Di et al., 2013). Despite predicted locations and functions based on genome annotation, other full-length homologous genes and their functions have not yet been identified (Wang et al., 2015). Among the 28 homologous genes identified as being involved in RA biosynthesis, the ability to detect 68% of full-length transcripts (15 full-length transcripts/22 expressed genes) and 27% of alternative splicing events at gene loci (4/15) indicates a significant advantage of hybrid sequencing in such discovery (Supplementary Table S2). Indeed, the availability of full-length transcripts will allow for establishing a metabolic engineering strategy with the aim of modulating the phenolic acid content, and the identification of alternative splicing events is beneficial for understanding the molecular mechanisms of phenolic acid biosynthesis in *S. miltiorrhiza*.

In line with our interest in phenolic acid biosynthesis, we found that not only the distribution of LAB but also the major expression of phenolic acid biosynthetic genes in the root occurred in the phloem and xylem (**Figure 1**). This agreement between phytochemical assay and gene expression in the root provided a basis for co-expression analysis. In addition, MeJA was found to dramatically promote the accumulation of phenolic acids and the expression of key genes (Xiao et al., 2009b). Although many genes related to RA biosynthesis have been cloned and identified in other species, 28 homologous genes based on genome-wide identification generated more candidates to assist in fully elucidating the RA biosynthetic pathway in *S. miltiorrhiza*. *4CL1* and *4CL2* of the phenylpropanoid pathway have been cloned and their functions identified *in vitro* (Zhao et al., 2006); however, the 4CL catalyzing the conversion of 4-coumaric acid to 4-coumaroyl-CoA *in vivo* remains unknown. *Sm4CL3*, *Sm4CL-like1*, and *Sm4CL-like4* are most likely to be involved in the synthesis of RA. The overexpression of *SmHPPR1* in the tyrosine-derived pathway resulted in the accumulation of 4-hydroxyphenylpyruvic acid, the substrate of SmHPPR (Xiao et al., 2011). However, *SmHPPR3*, rather than *SmHPPR1*, might participate in RA biosynthesis. An additional step from 4-hydroxyphenyllactic acid to 3,4-dihydroxyphenyllactic acid was found in *S. miltiorrhiza* using a <sup>13</sup>C tracer. As this step is likely to be catalyzed by an unknown CYP450 (Di et al., 2013), we selected six CYP450s that were more highly expressed in the phloem and xylem than in the periderm (Supplementary Table S3). According to phylogenetic tree and qRT-PCR analyses, CYP98A75 and CYP98A76 likely participate in RA biosynthesis rather than as 4-coumaroylshikimate/quinate 3-hydroxylases in quinic acid and shikimic acid biosynthesis (Chen et al., 2014). A previous study proposed that a laccase was potentially involved in the oxidative dimerization of RA to synthesize LAB (Di et al., 2013). To explore the reactions that convert RA to LAB, five laccases were identified as exhibiting higher expression in the phloem and xylem than in the periderm (Supplementary Table S4). Furthermore, *SmLAC5* was considered to be the best candidate for LAB synthesis. Further studies of these candidate CYPs and laccases using transgenic methods and biochemical reactions may accurately elucidate the mechanism of phenolic acid biosynthesis.

The complexity of alternative splicing events plays a potentially important regulatory role in SA biosynthesis, and many studies focusing on alternative splicing events in *Arabidopsis* have been reported (Filichkin et al., 2010; Marquez et al., 2012). In contrast to humans, the most common type of alternative splicing event in plants appears to be intron retention (Au et al., 2013; Reddy et al., 2013). In fact, all of the identified alternative splicing events in *S. miltiorrhiza* SA biosynthesis showed intron retention. A recent study reported that most of the intron retention isoforms in *Arabidopsis* are predicted to be targets of nonsense-mediated decay (NMD) to regulate mRNA stability. Except for *Sm4CL-like7*, the gene loci undergoing alternative splicing related to SA biosynthesis in *S. miltiorrhiza* each expressed one predominant isoform (Supplementary Table S5). Low-expression isoforms have been described as splicing errors, which commonly trigger NMD to maintain mRNA stability (Filichkin et al., 2010; Zhang et al., 2010; Marquez et al., 2012; Reddy et al., 2013; Walters et al., 2013; Shen et al., 2014). In addition, the highly expressed isoforms *Sm4CL-like7-iso2* and *Sm4CL-like7-iso3*, which contain PTCs downstream of the splice junctions, might be subjected to NMD to eliminate incomplete transcripts (Supplementary Figure S2). Another prediction about these two PTC isoforms of *Sm4CL-like7* is that small interfering peptides lacking functional domains could form nonfunctional dimers that compete with and negatively regulate functional proteins. Our results clearly detected and predicted alternative splicing events related to SA biosynthesis, though the actual functions of the alternative splicing isoforms remain unknown. Thus, the systematic identification of co-expression, full-length transcripts and alternative splicing events related to the biosynthesis of lipophilic diterpenoid pigments (Xu et al., 2015) and hydrophilic phenolic acids in various root tissues of *S. miltiorrhiza* could better resolve the biology of the synthesis of such natural products.

In summary, we localized SA metabolism in the medicinal plant *S. miltiorrhiza* to the root phloem and xylem. We then identified full-length transcripts, encoding isoforms as well as alternative splicing events in SA biosynthesis and systematically analyzed six candidate CYP450s and five candidate laccases related to SA biosynthesis in *S. miltiorrhiza* using hybrid sequencing. Furthermore, our study provides a model for analyzing the full-length transcriptome and the biosynthesis of active constituents in other medicinal plants.

#### AUTHOR CONTRIBUTIONS

ZX, JS, and SC designed and coordinated the study. ZX, HL, XZ, and AJ performed experiments and analyzed the data. ZX, JS, HL, and SC wrote the manuscript.

#### ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (Grant no. 81573398), the Major Scientific and Technological Special Project for 'Significant New Drugs Creation' (Grant no. 2014ZX09304307001), the Program for Innovative Research Team at the Institute of Medicinal Plant Development (Grant no. IT1304), and the National Science-technology Support Plan of China (Grant no. 2012BAI29B01).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00100



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Xu, Luo, Ji, Zhang, Song and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Efficient Rutin and Quercetin Biosynthesis through Flavonoids-Related Gene Expression in *Fagopyrum tataricum* Gaertn. Hairy Root Cultures with UV-B Irradiation

*Xuan Huang†, Jingwen Yao†, Yangyang Zhao, Dengfeng Xie, Xue Jiang and Ziqin Xu\**

*Provincial Key Laboratory of Biotechnology of Shaanxi, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Science, Northwest University, Xi'an, China*

#### *Edited by:*

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Heiko Rischer, VTT Technical Research Centre of Finland, Finland Silvia Massa, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

*\*Correspondence:*

*Ziqin Xu ziqinxu@nwu.edu.cn †These authors are co-first authors.*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 18 September 2015 Accepted: 14 January 2016 Published: 04 February 2016*

#### *Citation:*

*Huang X, Yao J, Zhao Y, Xie D, Jiang X and Xu Z (2016) Efficient Rutin and Quercetin Biosynthesis through Flavonoids-Related Gene Expression in Fagopyrum tataricum Gaertn. Hairy Root Cultures with UV-B Irradiation. Front. Plant Sci. 7:63. doi: 10.3389/fpls.2016.00063*

Transformed hairy roots were efficiently induced from seedlings of *Fagopyrum tataricum* Gaertn. by infection with *Agrobacterium rhizogenes*. The hairy roots displayed active elongation with high root branching in 1/2 MS medium without growth regulators. The stable introduction of the *rol*B and *aux*1 genes of *A. rhizogenes* WT strain 15834 into *F. tataricum* plants was confirmed by PCR analysis. In addition, the absence of the *vir*D gene confirmed that the hairy roots were bacteria-free. After six different media and different sources of concentration were tested, culturing the TB7 hairy root line in 1/2 MS liquid medium supplemented with 30 g l<sup>−1</sup> sucrose for 20 days resulted in maximal biomass accumulation (13.5 g l<sup>−1</sup> fresh weight, 1.78 g l<sup>−1</sup> dry weight) and rutin content (0.85 mg g<sup>−1</sup>). The suspension culture of hairy roots led to a 45-fold biomass increase and a 4.11-fold rutin content increase in comparison with the suspension culture of non-transformed roots. The transformation frequency was enhanced by preculturing for 2 days followed by infection for 20 min. UV-B stress treatment of hairy roots resulted in a striking increase in rutin and quercetin production. Furthermore, the hairy root lines TB3, TB7, and TB28 were chosen to study the specific effects of UV-B on flavonoid accumulation and flavonoid biosynthetic gene expression by qRT-PCR. This study demonstrates that UV-B radiation is an effective elicitor that dramatically changes the transcript abundance of *FtPAL*, *FtCHI*, *FtCHS*, *FtF3H*, and *FtFLS-1* in *F. tataricum* hairy roots.

Keywords: tartary buckwheat, genetic transformation, hairy root, rutin, UV-B, flavonoid biosynthetic genes

# INTRODUCTION

As a significant food and medicinal species, tartary buckwheat (*Fagopyrum tataricum* Gaertn.; family Polygonaceae) is grown and used in the mountainous regions of Southwest China (Sichuan province), Northern India, Bhutan, and Nepal (Fabjan et al., 2003). The plant contains numerous functional components, including flavonoids, phenolic compounds, phytosterols, fagopyrins, d-chiro-inositol, and thiamin-binding proteins, which contribute to its antioxidant, hypocholesterolemic, and antidiabetic effects (Krkošková and Mrázová, 2005; Jiang et al., 2007; Tomotake et al., 2007; Yang and Ren, 2008; Yao et al., 2008; Qin et al., 2013).

The major functional components of *F. tataricum*, such as rutin, quercetin, orientin, vitexin, and kaempferol, have been shown to be flavonoids. Compared with *Fagopyrum esculentum* (common buckwheat), *F. tataricum* is a richer dietary source of rutin and quercetin, with higher contents of flavonoids and other phenolic compounds. As a secondary plant metabolite, rutin has been reported to counteract the increase in capillary fragility associated with hemorrhagic disease, reduce high blood pressure (Abeywardena and Head, 2001), decrease blood vessel permeability (with a consequent antiedemic effect), lower the risk of arteriosclerosis (Wojcicki et al., 1995), and display antioxidant activity (Watanabe, 1998; Park et al., 2000; Holasova et al., 2001; Krkošková and Mrázová, 2005). Compared with common buckwheat, the rutin content of tartary buckwheat was ∼3.2-fold higher in flowers, ∼3.1-fold higher in stems, and ∼65-fold higher in seeds (Park et al., 2004). Research on tartary buckwheat has increased in recent years because of these remarkable health benefits.

Flavonoids are a class of plant secondary metabolites with many important functions. They constitute a diverse group of aromatic compounds derived from phenylalanine and malonyl-coenzyme A (malonyl-CoA). Phenylalanine ammonia lyase (PAL) catalyzes the conversion of phenylalanine to *trans*-cinnamate, which is hydroxylated by cinnamate 4-hydroxylase (C4H) and then activated by 4-coumarate:CoA ligase (4CL) for condensation with malonyl-CoA. Chalcones, the major intermediates of the flavonoid biosynthetic pathway, are produced by chalcone synthase (CHS), which condenses three molecules of malonyl-CoA with a single molecule of 4-coumaroyl-CoA to form either tetrahydroxychalcone or trihydroxychalcone. Chalcones are converted to the (2S)-flavanone naringenin by chalcone isomerase (CHI) in a ring-closing step that forms the heterocyclic C-ring. From these central intermediates, the pathway diverges into several side branches, each producing a different class of flavonoids through the combined actions of functionalizing enzymes that hydroxylate, reduce, alkylate, oxidize, and glycosylate the phenylpropanoid core structure (Kaneko et al., 2003; Fowler and Koffas, 2009; Santos et al., 2011). Flavanone 3-hydroxylase (F3H) catalyzes the stereospecific 3-hydroxylation of (2S)-flavanones to dihydroflavonols. Dihydroflavonols are then converted to flavonols and their glycosides through flavonoid 3′-hydroxylase (F3′H) and flavonol synthase (FLS). These intermediates are further modified by a variety of hydroxylases, methyltransferases, reductases, and glycosyltransferases to form diverse flavonoids (e.g., quercetin and rutin) and isoflavonoids. For the biosynthesis of anthocyanins, dihydroflavonol reductase (DFR) catalyzes the stereospecific conversion of dihydroflavonols into the respective flavan-3,4-diols (leucoanthocyanins) through NADPH-dependent reduction at the 4-carbonyl. The leucoanthocyanins are further converted to anthocyanidins by anthocyanidin synthase (ANS; Winkel-Shirley, 2001; Pandey and Sohng, 2013).
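The enzyme order described above can be sketched as a small directed graph. The following Python snippet is only an illustration: the compound labels are simplified class names, the malonyl-CoA co-substrate of CHS and the F3′H side reaction are omitted, and the function name `enzymes_between` is hypothetical.

```python
from collections import deque

# Each step is (enzyme, substrate, product). Labels are simplified compound
# classes taken from the pathway description, not an exhaustive reaction list.
STEPS = [
    ("PAL", "phenylalanine", "cinnamate"),
    ("C4H", "cinnamate", "4-coumarate"),
    ("4CL", "4-coumarate", "4-coumaroyl-CoA"),
    ("CHS", "4-coumaroyl-CoA", "chalcone"),
    ("CHI", "chalcone", "flavanone"),
    ("F3H", "flavanone", "dihydroflavonol"),
    ("FLS", "dihydroflavonol", "flavonol"),
    ("DFR", "dihydroflavonol", "leucoanthocyanin"),
    ("ANS", "leucoanthocyanin", "anthocyanidin"),
]


def enzymes_between(start, goal, steps=STEPS):
    """Breadth-first search for the enzyme sequence leading from start to goal.

    Returns a list of enzyme names, or None if goal is not reachable.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, path = queue.popleft()
        if compound == goal:
            return path
        for enzyme, substrate, product in steps:
            if substrate == compound and product not in seen:
                seen.add(product)
                queue.append((product, path + [enzyme]))
    return None


if __name__ == "__main__":
    print(enzymes_between("phenylalanine", "flavonol"))
    print(enzymes_between("phenylalanine", "anthocyanidin"))
```

The two printed routes share every step up to dihydroflavonol, which is the branch point between the flavonol (FLS) and anthocyanin (DFR, ANS) branches.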

Hairy root cultures, established by transforming plants with *Agrobacterium rhizogenes*, typically show increased production of secondary metabolites. They are genetically and biochemically stable, grow rapidly, and can synthesize useful natural compounds at levels comparable to those of wild-type (WT) roots (Guillon et al., 2006a,b). Hairy root cultures of many plant species have been widely studied for the production of secondary metabolites used as pharmaceuticals, cosmetics, and food additives (Crane et al., 2006; Georgiev et al., 2007; Thiruvengadam et al., 2014). Biotechnological approaches based on hairy root culture have greatly enhanced the production of rutin by common buckwheat (Lee et al., 2007; Kim et al., 2010) and of phenolic compounds by tartary buckwheat (Kim et al., 2009; Thwe et al., 2013).

UV-B radiation has been shown to be an efficient abiotic stress for stimulating secondary metabolite accumulation in plant cell and tissue culture (Binder et al., 2009; Hao et al., 2009). According to previous work, the flavonoids of *F. esculentum* sprouts are produced as protective substances against UV-B radiation (Ožbolt et al., 2008; Tsurunaga et al., 2013). To the best of our knowledge, the effect of UV-B on the accumulation of functional metabolites in hairy root cultures of *F. tataricum* has not been reported previously.

To elucidate the role of UV-B as an abiotic elicitor regulating the synthesis and yield of flavonoids and other secondary metabolites, an efficient protocol for stable genetic transformation of *F. tataricum* hairy roots is needed. We therefore established such a protocol and applied it to study the induced biosynthesis and accumulation of flavonoids in hairy roots of *F. tataricum*. We also investigated the effects of UV-B light and of various sucrose concentrations in the liquid culture medium on flavonoid production. Finally, the expression of flavonoid biosynthetic genes was examined by quantitative real-time PCR alongside changes in flavonoid content, in order to relate gene expression to flavonoid biosynthesis.

# MATERIALS AND METHODS

#### Plant Material and Cultivation

Dehulled seeds of *F. tataricum* G. were surface-sterilized with 70% (v/v) ethanol for 1 min and 0.1% (v/v) mercuric chloride for 10 min, and then rinsed four times in sterilized water. The treated seeds were sown onto 1/2 MS medium (Murashige and Skoog, 1962) solidified with 0.8% (w/v) agar. Before agar addition, the medium was adjusted to pH 5.8; it was then sterilized by autoclaving at 121 °C for 20 min.

Germinating seeds were cultured at 25 ± 2 °C in a growth chamber under a 16-h photoperiod with a flux rate of 35 μmol s<sup>−1</sup> m<sup>−2</sup>. After 7 days, the hypocotyls and cotyledons of the seedlings were cut into 0.5 cm × 0.5 cm pieces on a clean bench and transferred into Petri dishes, each containing 20 ml MS medium. The cut explants were precultured under the same conditions for 1–3 days before inoculation.

Wild tartary buckwheat was planted in a test field at Northwest University (Xi'an, China) during the summers of 2012 and 2013.

#### Preparation of *Agrobacterium rhizogenes*

*A. rhizogenes* WT strain 15834 was used for hairy root induction. The bacteria were started from glycerol stock and grown overnight at 28 °C on YEB medium (Van Larebeke et al., 1977) solidified with 1.5% (w/v) agar and containing 250 mg/L penicillin. Single colonies were grown at 28 °C with shaking (180 rpm) in 20 ml YEB liquid medium with 250 mg/L penicillin for selection. The *A. rhizogenes* suspension culture was grown overnight until it reached a density of OD600 = 0.5. Cells were collected by centrifugation (4000 rpm, 5 min) and resuspended in 1/2 MS liquid medium supplemented with 30 g/L sucrose and 200 mmol/L acetosyringone. Cell suspensions at a density of OD600 = 0.6 were used for inoculation.

#### *A. rhizogenes*-Mediated Transformation

Hypocotyls and cotyledons from 7-day-old plants were cut into ∼0.5 cm pieces. Excised explants were dipped into *A. rhizogenes* 15834 suspension in liquid inoculation medium for 10, 12, 15, or 20 min, blotted dry on sterile filter paper, and incubated in the dark at 25 °C on agar-solidified MS medium. As a control, a few explants were placed in 1/2 MS liquid medium and cultured in the same way. After 1–3 days of coculture, explants were transferred onto solidified MS medium supplemented with 500 mg l<sup>−1</sup> cefotaxime sodium. Explants that produced hairy roots (usually within 2 weeks after infection) were selected for further study. Roots (length 1.5–2.0 cm) that developed on the explants were excised aseptically, transferred onto MS medium supplemented with 400 mg l<sup>−1</sup> cefotaxime in 9-cm Petri dishes, incubated under the conditions described in Section "Plant Material and Cultivation," and grown for 14 days.

Next, 0.3 g fresh weight (FW) of roots was transferred into 250-ml Erlenmeyer flasks containing 50 ml of MS, 1/2 MS, N6, 1/2 N6, B5, or 1/2 B5 liquid medium without growth regulators. Cultures were incubated on a shaker (100 rpm) as above, and roots were subcultured every 14 days. The cefotaxime concentration was gradually reduced to zero in the MS liquid medium. Roots were kept at 25 ± 2 °C under standard cool white fluorescent tubes with a flux rate of 35 μmol s<sup>−1</sup> m<sup>−2</sup> and a 16-h photoperiod. Experiments were conducted in duplicate with three flasks per culture condition. Hairy roots grew very well on medium without growth regulators, whereas normal roots could hardly grow on the same medium; roots excised from *in vitro* germinated seedlings were nevertheless cultured in MS liquid medium without growth regulators as a control. Transformation efficiency was calculated for each treatment as (number of explants producing hairy roots / total number of explants) × 100%.
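As a minimal sketch of the calculation defined above, the following Python function computes transformation efficiency from explant counts. The function name is hypothetical; the example counts (168 responding explants out of 847) are the overall figures reported in the Results.

```python
def transformation_efficiency(responding_explants, total_explants):
    """Percentage of explants that produced hairy roots."""
    if total_explants <= 0:
        raise ValueError("total_explants must be positive")
    if not 0 <= responding_explants <= total_explants:
        raise ValueError("responding_explants must lie between 0 and total_explants")
    return responding_explants / total_explants * 100


if __name__ == "__main__":
    # Overall counts reported in the Results section.
    print(f"{transformation_efficiency(168, 847):.2f}%")
```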

#### Genomic DNA Extraction and PCR Analysis

Genomic DNA was extracted from hairy roots and WT roots (from seedlings grown in 1/2 MS medium as a control) of *F. tataricum* by the CTAB procedure (Doyle and Doyle, 1990). The Ri plasmid of *A. rhizogenes* was extracted from strain 15834 by the SDS/alkaline lysis method (Petit et al., 1983) and used as a positive control. Integration of the T-DNA responsible for hairy root formation was confirmed by PCR analysis using *rol*B-, *aux*1-, and *vir*D-specific primers. The primer sequences were, for *rol*B, forward 5′-GAT ATA TGC CAA ATT TAC ACT AG-3′ and reverse 5′-GTT AAC AAA GTA GGA AAC AGG-3′ (expected PCR product 564 bp); for *aux*1, forward 5′-TTC GAA GGA AGC TTG TCA GAA-3′ and reverse 5′-CTT AAA TCC GTG TGA CCA TAG-3′ (expected PCR product 350 bp); and for *vir*D, forward 5′-ATG TCG CAA GGC AGT AAG CCC A-3′ and reverse 5′-GCA GTC TTT CAG CAG GAC GAG CAA-3′ (expected PCR product 438 bp). The PCR mixture consisted of 5 μl 10× PCR buffer (Takara Biotech; Japan), 2.5 μl of 100 nM dNTPs (Takara), 1 μl primer, 1 μl Taq polymerase (Takara), and 1 μl plant genomic DNA (or 1 μl plasmid DNA), in a final volume of 50 μl. The amplification conditions were: initial denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 1 min, primer annealing for 55 s at 52 °C (*rol*B), 55 °C (*aux*1), or 56 °C (*vir*D), and extension at 72 °C for 1 min; and a final extension at 72 °C for 10 min. PCR products were checked by agarose gel electrophoresis (100 v/h) with a DL5000 ladder marker (Takara), detected by ethidium bromide staining, and photographed with a gel documentation system (Bio-Rad; Hercules, CA, USA).
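The amplification conditions above differ between targets only in annealing temperature. A minimal Python sketch of the three programs as data is given below; the `Step` class and function names are hypothetical, and the computed duration counts programmed hold times only (ramp times between temperatures depend on the instrument and are not included).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Gene-specific annealing temperatures as stated in the amplification conditions.
ANNEAL_TEMP_C = {"rolB": 52, "aux1": 55, "virD": 56}


def pcr_program(gene, cycles=30):
    """Build the list of hold steps for one target gene."""
    cycle = [
        Step("denaturation", 94, 60),
        Step("annealing", ANNEAL_TEMP_C[gene], 55),
        Step("extension", 72, 60),
    ]
    return (
        [Step("initial denaturation", 94, 300)]
        + cycle * cycles
        + [Step("final extension", 72, 600)]
    )


def hold_time_minutes(program):
    """Total programmed hold time in minutes (excludes ramping)."""
    return sum(step.seconds for step in program) / 60


if __name__ == "__main__":
    for gene in ANNEAL_TEMP_C:
        program = pcr_program(gene)
        print(gene, len(program), "steps,", hold_time_minutes(program), "min")
```

Because only the annealing temperature changes, the three programs have the same length and hold time, so the targets would need separate runs or a gradient block rather than separate timings.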

#### Total RNA Extraction and Expression Analysis of Flavonoid Biosynthetic Genes by qRT-PCR

Total RNA was isolated from wild and transgenic hairy roots of *F. tataricum* with the RNeasy Plant Mini Kit (Qiagen; Valencia, CA, USA). RNA integrity was checked on a 1.2% ethidium bromide-stained RNA gel, and purity was assessed from the 260/280 nm absorbance ratio measured with a NanoVue Plus Spectrophotometer (GE Healthcare Bio-Sciences Corp., USA). cDNA was synthesized from 1 μg of DNA-free total RNA by reverse transcription with the PrimeScript<sup>®</sup> 1st Strand cDNA Synthesis Kit (Takara). The resulting cDNA was used as the template for real-time PCR analysis.

Quantitative real-time PCR was performed to analyze the transcript levels of flavonoid biosynthesis genes in a BIO-RAD CFX96 Real-time PCR system (Bio-Rad Laboratories, Hercules, CA, USA). The gene-specific primer sets were designed as described by Li et al. (2010). Real-time PCR was carried out in a 20 μl reaction volume including 0.5 μl of each primer, 5 μl of template cDNA, and 10 μl of SYBR Green (SYBR<sup>®</sup> *Premix Ex Taq*™, Takara). The thermal cycling program followed Thwe et al. (2013). The histone H3 gene was used as the reference gene (Timotijevic et al., 2010). Fluorescence intensity data were acquired during the extension step. Transcript levels were determined using a standard curve. Identical PCR conditions were used for all targets. Differences between samples were evaluated from three replicates of each sample.
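Standard-curve quantification with a reference gene can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: it assumes a linear standard curve of the form Ct = slope × log10(quantity) + intercept for each gene, and the slope, intercept, and Ct values in the example are hypothetical because the text does not report them.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(quantity) + intercept."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(ct_target, ct_reference, curve_target, curve_reference):
    """Target quantity divided by reference-gene (e.g., histone H3) quantity.

    Each curve is a (slope, intercept) tuple from that gene's dilution series.
    """
    target = quantity_from_ct(ct_target, *curve_target)
    reference = quantity_from_ct(ct_reference, *curve_reference)
    return target / reference


if __name__ == "__main__":
    # Hypothetical curve parameters and Ct values, for illustration only.
    target_curve = (-3.5, 35.0)
    reference_curve = (-3.5, 35.0)
    print(normalized_expression(24.5, 28.0, target_curve, reference_curve))
```

Dividing the normalized expression of a UV-B-treated sample by that of its untreated control would then give a fold induction of the kind reported in the Results.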

#### Measurement of Flavonoid Content (Rutin and Quercetin) by High-Performance Liquid Chromatography (HPLC)

Harvested hairy roots (1 g) were frozen in liquid N2, ground to a fine powder with a mortar and pestle, and extracted twice in methanol (50 ml) for 24 h at 5 °C. Extracts were vacuum-dried at 80 °C and dissolved in 10 ml methanol. The solution was filtered through a poly filter (pore size 0.45 μm) and diluted twofold with methanol. Rutin and quercetin in the extracts were analyzed by HPLC (Waters 2695 Separations Module and Waters 2996 Photodiode Array Detector) on a C18 column (Hypersil ODS, 250 mm × 4.6 mm) at 30 °C. The mobile phase consisted of methanol (solvent A) and 1% (v/v) glacial acetic acid (solvent B) at a flow rate of 1 mL min<sup>−1</sup>. The solvent gradient ran from 40% solvent A/60% solvent B to 65% solvent A/35% solvent B over 35 min. The detection wavelengths for rutin and quercetin were 257 and 370 nm, respectively, and the injection volume was 20 μl. Rutin and quercetin were identified by comparing their retention times and spectral characteristics with those of authentic standards obtained from the Institute for Identification of Pharmaceutical and Biological Products (Beijing, China), and were quantified against these standards. Quantitative values were calculated from the calibration curves (Supplementary Material, S1 and S2). All samples were run in triplicate (an HPLC chromatogram of a hairy root line is shown in Supplementary Material, S3).
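The gradient and the back-calculation from peak area to tissue content can be sketched in Python as below. Two assumptions are made loudly: the gradient is treated as linear between its stated end points, and the calibration line (peak area = slope × concentration + intercept) uses hypothetical parameters, since the real curves are in the Supplementary Material. The default extract volume (10 ml), dilution (twofold), and sample mass (1 g) are taken from the method described above.

```python
def percent_solvent_a(t_min, start=40.0, end=65.0, duration_min=35.0):
    """Methanol (solvent A) percentage at time t, assuming a linear ramp."""
    t = min(max(t_min, 0.0), duration_min)  # clamp to the programmed gradient
    return start + (end - start) * t / duration_min


def content_mg_per_g(peak_area, slope, intercept,
                     extract_ml=10.0, dilution=2.0, sample_g=1.0):
    """Analyte content of the tissue from one injection.

    slope and intercept describe a calibration line in which peak area is a
    linear function of concentration in mg/ml (hypothetical units).
    """
    conc_injected = (peak_area - intercept) / slope   # mg/ml in diluted extract
    return conc_injected * dilution * extract_ml / sample_g


if __name__ == "__main__":
    for t in (0, 17.5, 35):
        print(t, "min:", percent_solvent_a(t), "% A")
    # Hypothetical calibration parameters and peak area, for illustration only.
    print(content_mg_per_g(peak_area=42.5, slope=1000.0, intercept=0.0))
```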

## Growth Kinetics of Cultured Transformed Roots

Hairy roots (0.3 g FW) were inoculated into 250-ml Erlenmeyer flasks containing 50 ml basal liquid medium supplemented with sucrose. Biomass accumulation and flavonoid production were optimized by evaluating growth kinetics at various time points (4, 8, 12, 16, 20, and 24 days). Six media (MS, 1/2 MS, N6, 1/2 N6, B5, and 1/2 B5) and five sucrose concentrations (10, 20, 30, 40, and 50 g l<sup>−1</sup>) were evaluated to maximize root biomass growth. Flasks were cultured with shaking (100 rpm) at 25 ± 2 °C with a flux rate of 35 μmol s<sup>−1</sup> m<sup>−2</sup> and a 16-h photoperiod. Hairy root biomass (FW and DW) and flavonoid production were measured during the 24 days of culture.

#### UV-B Light Stress Treatment

After hairy roots had grown to 2 g (FW) in 1/2 MS liquid medium, they were exposed to UV light (UV-B 313 lamps; Q-Panel; Cleveland, OH, USA) for 30 min. The lamps were wrapped in cellulose diacetate filters to block the UV-C range (wavelengths < 280 nm). The maximal radiation peak was at a wavelength of 302 nm. The UV-B light intensity on the sample surface was 1.26 μW/cm<sup>2</sup>, and the total energy supply was 0.34 J/cm<sup>2</sup>. The treatment continued for 3 days, and the UV-B treatment experiments were repeated at least three times. Rutin and quercetin contents were analyzed as in Section "Measurement of Flavonoid Content (Rutin and Quercetin) by High-Performance Liquid Chromatography (HPLC)." Non-transformed roots and wild plants were subjected to UV-B light stress as controls. All parts of the wild plants were examined by HPLC. The expression of flavonoid biosynthesis genes was also measured after the hairy roots were treated with UV-B irradiation; changes in transcript abundance were analyzed by qRT-PCR as described in Section "Total RNA Extraction and Expression Analysis of Flavonoid Biosynthetic Genes by qRT-PCR."

## Statistical Analysis

All data analyses were performed using the Origin software program, V. 8.0. Values were expressed as mean ± SE. One-way ANOVA was applied in statistical analysis. Differences between means were evaluated by Duncan's multiple range test, and were considered to be significant for *P <* 0.05.
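The first two steps of this analysis (mean ± SE and the one-way ANOVA F statistic) can be sketched in pure Python as below. The study itself used Origin; this is only an illustration, the triplicate values are made up, and Duncan's multiple range test is not implemented here.

```python
from math import sqrt
from statistics import mean, stdev


def mean_se(values):
    """Return (mean, standard error of the mean) for one treatment group."""
    return mean(values), stdev(values) / sqrt(len(values))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Made-up triplicate measurements for three hypothetical treatments.
    groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    for g in groups:
        m, se = mean_se(g)
        print(f"{m:.2f} ± {se:.2f}")
    print(one_way_anova_f(groups))
```

The F statistic would then be compared with the critical value for (df_between, df_within) at *P* = 0.05 before any pairwise comparison of means.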

# RESULTS

#### Establishment of Hairy Root Induction

An efficient and reliable transformation system is highly desirable for the whole hairy root culture process. Based on preliminary experiments, explant age and co-cultivation time were investigated systematically in this work. In total, 168 of 847 *F. tataricum* explants produced hairy roots, a transformation frequency of 19.83%.

#### Effects of Preculture Time on Transformation Frequency of *F. tataricum*

Transformation frequency varied with preculture time (**Figure 1A**). Compared with controls, transformation frequency was improved by 1–2 days of preculture and reached its maximum after 2 days, when the induction rate reached 10.6 ± 0.6%. With longer preculture, transformation frequency decreased rapidly. Five to twelve hairy roots per stem were induced from hypocotyl cut ends or petioles within 2 weeks (**Figure 1B**), whereas no hairy roots were observed on uninfected explants. Two to five hairy roots per explant were induced from cotyledons of tartary buckwheat with *A. rhizogenes* strain 15834 (**Figure 1C**).

#### Effect of *A. rhizogenes* Infection Time on Transformation Frequency

Excised explants were dipped into *A. rhizogenes* in liquid inoculation medium for various lengths of time. The optimal infection time was about 20 min, which gave an induction rate of 14.3 ± 0.5% (**Figure 1D**) and combined the greatest transformation efficiency with the least damage to the explants.

## Confirmation of Genetic Transformation of Hairy Roots by PCR

To evaluate the genetic status of the selected hairy roots, PCR analysis was performed targeting the *rol*B, *aux*1, and *vir*D genes. The *rol*B gene (located on the pRi TL-DNA segment) and the *aux*1 gene (located on the pRi TR-DNA segment) are diagnostic for T-DNA integration into the host genome. The *vir*D gene (located outside the pRi T-DNA segment) was used to check for any remaining *Agrobacteria* in hairy roots of *F. tataricum*. Roots of aseptic plantlets and the *A. rhizogenes* 15834 Ri plasmid were used as negative and positive controls, respectively. In the representative results shown in **Figure 2**, the presence of both *rol*B and *aux*1 indicated that the established hairy root lines had successfully integrated the pRi T-DNA of *A. rhizogenes* 15834, and the absence of *vir*D confirmed that these hairy roots were bacteria-free (the remaining results are shown in Supplementary Material, S4).

From the 356 hairy root lines obtained, we selected 60 lines that grew quickly in medium without plant growth regulators for PCR analysis. Both *rol*B and *aux*1 were present and *vir*D was absent in 20 of these lines, showing that these cultures were free of *Agrobacterium* contamination and that both T-DNA fragments were integrated into their genomes.

## Effects of Various Media and Sucrose Concentration on Biomass Accumulation

We tested six media (MS, 1/2 MS, B5, 1/2 B5, N6, and 1/2 N6) for hairy root culture (**Table 1**). Hairy roots (TB7 line) were cultured for 20 days in each medium, and FW and DW were then measured. Of the six media, 1/2 MS was the best for biomass accumulation (**Table 1**), and hairy roots cultured in 1/2 MS liquid medium showed the most rapid proliferation (**Figure 3**). The highest biomass accumulation (13.5 g l<sup>−1</sup> FW; 1.78 g l<sup>−1</sup> DW) was recorded for 1/2 MS medium, in which a thick root mass filled the bottle within 20 days. The growth cycles were generally similar in 1/2 MS solid and liquid media, although hairy root growth was slower in the solid medium. Biomass accumulation was similar in the B5 and N6 media (including 1/2 B5 and 1/2 N6): FW and DW were ∼50% lower than in 1/2 MS liquid medium, and rutin content was also lower.

Because hairy roots grew very well in 1/2 MS and MS media, we tested various sucrose concentrations (10–50 g l<sup>−1</sup>) in 1/2 MS medium and examined their effects on hairy root growth. As the sucrose concentration increased, hairy roots changed from white to brown and grew more slowly. Hairy root growth reached its maximum at 30 g l<sup>−1</sup> sucrose and decreased markedly at concentrations above or below this value.

#### Kinetics of *F. tataricum* Hairy Root Growth and Flavonoids Accumulation

We measured the flavonoid yield of the 20 hairy root lines that had been confirmed by PCR analysis. Line TB7 had the highest rutin and quercetin content (data not shown) and was therefore chosen for kinetic studies of biomass growth and flavonoid accumulation in hairy root cultures of *F. tataricum* (**Figure 4**). Hairy roots (initial FW 0.3 g) were cultured for 30 days in 1/2 MS medium containing 30 g l<sup>−1</sup> sucrose. Root growth occurred primarily during the first 4–5 days, and then leveled off during days 20–24. Hairy roots were initially white or pale yellow but became brown and slow-growing after day 24, and therefore required subculturing every 24 days. The maximal rutin yield (0.85 mg g<sup>−1</sup>) was obtained on day 20 of culture, when FW and DW also reached their maxima of 13.5 and 1.78 g, respectively. Both biomass and rutin content declined rapidly after 20 days.

TABLE 1 | Effects of different media and sucrose concentrations on biomass production of hairy root cultures in 1/2 MS medium.

*Roots (0.3 g FW per flask) were cultured for 20 days in 250-ml Erlenmeyer flasks containing 50 ml medium. Data shown are mean* ± *SE of three replicates. Each experiment was performed in triplicate. Means with common letters are not significantly different at P < 0.05 according to Duncan's multiple range test.*

## Effect of UV-B Light Stress Treatment on Flavonoids Production

We evaluated the response of wild tartary buckwheat plants and hairy roots to UV-B light stress with regard to possible enhancement of rutin and quercetin production (**Table 2**). The TB7 hairy root line was chosen to study the specific effects of UV-B on flavonoid accumulation. The rutin and quercetin contents of stems, leaves, flowers, and WT (non-transformed) roots were compared with those of hairy roots. Following UV-B stress treatment, the rutin content of hairy roots was strikingly higher than that of non-transformed roots. The rutin content of treated hairy roots increased 5.18-fold (from 0.93 to 4.82 mg g<sup>−1</sup>). Hairy roots responded more strongly to UV-B stress treatment than non-transformed roots, flowers, or stems. The relative order of the increase in rutin content under UV-B stress was leaves (9.35-fold) > hairy roots (5.18-fold) > stems (3.57-fold) > non-transformed roots (2.95-fold) > flowers (2.66-fold). Quercetin could not be detected in non-transformed roots before UV-B treatment. It was, however, present in the TB7 hairy root line, where it increased on exposure to UV-B (from 0.02 to 0.04 mg g<sup>−1</sup>). Quercetin was also detected in leaves, flowers, and stems of *F. tataricum*, and its yield increased substantially after UV-B irradiation.
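The fold increases quoted above are simple ratios of treated to untreated content. A short Python sketch (function name hypothetical) reproduces the hairy root values from the figures given in the text and makes explicit why no fold change can be stated when the untreated content is not detectable.

```python
def fold_change(before, after):
    """Ratio of content after treatment to content before treatment.

    Raises ValueError when the baseline is zero or not detected, because a
    fold change is undefined in that case.
    """
    if before is None or before <= 0:
        raise ValueError("fold change is undefined without a positive baseline")
    return after / before


if __name__ == "__main__":
    # Hairy root (TB7) contents in mg per g, as quoted in the text.
    print(f"rutin: {fold_change(0.93, 4.82):.2f}-fold")
    print(f"quercetin: {fold_change(0.02, 0.04):.2f}-fold")
```

For non-transformed roots, where quercetin was not detected before treatment, the function raises an error rather than returning a misleadingly large ratio.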

#### Expression of Flavonoid Biosynthetic Genes in Hairy Roots of *F. tataricum* with UV-B Irradiation

To investigate flavonoid biosynthesis in *F. tataricum*, the expression levels of biosynthetic genes in hairy roots were examined by qRT-PCR (**Figure 5A**). The expression levels of *FtPAL*, *FtC4H*, *Ft4CL*, *FtCHS*, *FtCHI*, *FtF3H*, *FtF3′H-1*, *FtF3′H-2*, *FtFLS-1*, *FtFLS-2*, *FtDFR*, and *FtANS* are shown. Transcripts of all of these genes were detected in the three hairy root lines (TB3, TB7, and TB28), and expression levels were higher in TB7 than in TB3 and TB28 for all genes except *FtF3′H-1*, *FtF3′H-2*, *FtFLS-2*, and *FtANS*. In particular, the expression levels of *FtPAL*, *FtC4H*, *FtCHI*, *FtF3H*, and *FtFLS-1* were significantly higher in TB7 than in the TB3 and TB28 lines.

To understand the role of UV-B in regulating flavonoid biosynthesis, the transcript abundance of genes involved in the flavonoid biosynthetic pathway was analyzed by qRT-PCR (**Figure 5B**). The hairy root lines TB3, TB7, and TB28 were again used to study the specific effects of UV-B exposure on gene expression. Expression of all the genes increased in the three hairy root lines, and expression levels were higher in TB7 than in TB3 and TB28 for all genes except *FtF3′H-1*, *FtF3′H-2*, *FtDFR*, and *FtANS*. The key regulatory genes of flavonoid biosynthesis, including *FtPAL*, *FtCHS*, *FtCHI*, *FtF3H*, and *FtFLS-1*, responded drastically to UV-B exposure. *FtFLS-1* showed the highest transcript abundance under UV-B exposure, 30–40-fold higher than without UV-B treatment. Significant UV-B induction was also observed for *FtCHI* and *FtCHS*, whose transcript abundance was 20–30-fold higher than without UV-B treatment. In contrast, the transcript abundance of *FtF3′H-1*, *FtF3′H-2*, *FtFLS-2*, *FtDFR*, and *FtANS* was only slightly enhanced in response to UV-B exposure.

As shown in **Figure 5**, genes of flavonoid biosynthesis in tartary buckwheat have been almost entirely elucidated.

# DISCUSSION

*Agrobacterium rhizogenes* is a useful tool for gene transfer and has been widely applied in many plant species. *A. rhizogenes* strain 15834 is often used to induce hairy root formation. Kim et al. (2010) successfully obtained hairy roots of *F. esculentum* with *A. rhizogenes* 15834. *Taraxacum platycarpum* (Lee et al., 2004) and *Panax ginseng* (Yang and Choi, 2000) have also been successfully transformed with *A. rhizogenes* 15834. Kim et al. (2009) and Park et al. (2011) reported obtaining hairy roots after inoculating sterile young stems of *F. tataricum* with *A. rhizogenes* strain R1000.

Transformation efficiency was affected by bacterial growth stage, infection time, pre-treatment of explants, and light and temperature conditions. Preculture of explants in MS medium for 2 days prior to transformation enhanced transformation efficiency. The optimal infection time was ∼20 min. Plant tissues might be injured if the infection time was too long, and the suitable infection time varies with plant species and type of explant. Chen et al. (2008) showed that a longer preculture time might reduce explant viability and result in injury. On the other hand, explants might wither more easily in the absence of preculture.

*After exposure to UV-B light for 30 min for 3 days, rutin and quercetin contents of hairy roots were analyzed by HPLC. Data shown are mean* ± *SE of three replicates. Each experiment was performed in triplicate. Means with different letters are significantly different at P < 0.05 according to Duncan's multiple range test.*

We obtained the maximal transformation frequency of *F. tataricum* by *Agrobacterium* only after suitable preculture. However, this conclusion may not apply to all plant species. Kim et al. (2004) used *Agrobacterium* to infect *Perilla frutescens* explants and found that the transformation frequency was much higher for non-precultured than for precultured explants.

The growth rate of hairy roots was greatly affected by the culture medium and other culture conditions. In the present study, culturing in 1/2 MS liquid medium led to the most rapid proliferation of hairy roots, with biomass (FW) increasing ∼45-fold in 20 days. Hairy root growth was exponential from day 0 to day 20 and then entered a stationary phase during days 20–25. These findings indicate that hairy root liquid cultures of tartary buckwheat are potentially useful for large-scale biomass production. Park et al. (2011) also reported that the dry weight of hairy roots increased 25-fold in MS liquid medium over 21 days. In *Polygonum multiflorum* Thunb. hairy roots, biomass (initially 0.5 g FW) increased 9.5-fold after culture in hormone-free MS liquid medium for 20 days (Thiruvengadam et al., 2014).
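Taking the description of exponential growth from day 0 to day 20 at face value, the ∼45-fold increase implies a specific growth rate and doubling time that can be estimated as below. This is a back-of-the-envelope sketch, not a fit to the measured growth curve: it assumes strictly exponential growth between the two points and uses the 0.3 g inoculum and 13.5 g day-20 FW on a per-flask basis, as implied by the 45-fold figure.

```python
from math import log


def specific_growth_rate(initial_mass, final_mass, days):
    """Exponential growth rate mu (per day), from X(t) = X0 * exp(mu * t)."""
    if initial_mass <= 0 or final_mass <= 0 or days <= 0:
        raise ValueError("masses and duration must be positive")
    return log(final_mass / initial_mass) / days


def doubling_time(mu):
    """Time in days for biomass to double at growth rate mu."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return log(2) / mu


if __name__ == "__main__":
    mu = specific_growth_rate(0.3, 13.5, 20)   # FW values quoted in the text
    print(f"mu = {mu:.3f} per day, doubling time = {doubling_time(mu):.2f} days")
```

Under these assumptions the culture doubles roughly every three and a half days; a regression of ln(FW) against time over the sampled days would give a more reliable estimate.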

As the major carbon source, sucrose was essential for *F. tataricum* hairy root growth, which responded strongly to sucrose concentration in the present study. The optimal sucrose concentration was 30 g l<sup>−1</sup>, at which biomass accumulation and rutin content were maximal; hairy root growth was strikingly lower at sucrose concentrations above or below 30 g l<sup>−1</sup>. Higher concentrations also altered root morphology (e.g., root calluses, inhibition of lateral branching), probably because of osmotic stress (Hamill et al., 1987). Yu et al. (1996) found that the sucrose level affected hairy root production in *Solanum aviculare*. The levels of secondary metabolites produced by *in vitro* cultures can vary dramatically, and most previous studies have focused on the nutrient composition of the medium to optimize the accumulation of metabolites in cultured cells (Rao and Ravishankar, 2002).

Because the biosynthesis of many secondary metabolites is usually considered a common defense response of plants to biotic and abiotic stresses, their accumulation can be stimulated by biotic and abiotic elicitors (Zhao et al., 2014a). Elicitation, the treatment of plant tissue cultures with elicitors, is therefore one of the most effective strategies for enhancing secondary metabolite production in plant tissue cultures. The most common and effective elicitors used in previous studies include heavy metal ions and UV radiation (abiotic), components of microbial cells, especially poly- and oligosaccharides (biotic), and signaling molecules in plant defense responses, such as salicylic acid (SA) and methyl jasmonate (MJ; Chen and Chen, 2000; Broeckling et al., 2005; Prakash and Srivastava, 2008; Smetanska, 2008; Ionkova, 2009; Zhao et al., 2010).

Flavonoids are produced in plants as protective substances against UV-B radiation. Several studies have described the production of flavonoids by buckwheat sprouts in response to UV-B irradiation, an effective abiotic elicitor (Kreft et al., 2002; Eguchi and Sato, 2009). Rutin (sometimes called vitamin P) displays strong antioxidant activity, which could alleviate the damage from UV-B stress. Tsurunaga et al. (2013) found that the rutin content and radical scavenging activity of buckwheat sprouts were enhanced under various levels of UV-B radiation. In the present study, the rutin and quercetin contents of hairy roots and all parts of tartary buckwheat increased under UV-B stress. The maximal increase in rutin content (from 3.19 to 29.79 mg g<sup>−1</sup>, 9.35-fold) was observed in leaves. Interestingly, the next-highest increase in rutin content (from 0.93 to 4.82 mg g<sup>−1</sup>, 5.18-fold) was observed in hairy roots. This phenomenon might result from the insertion site of T-DNA during transformation; the underlying mechanism requires further investigation. In a previous study of buckwheat, Kim et al. (2010) found that rutin content was ∼2.4-fold higher in hairy roots than in WT roots. These findings are consistent with those of transformation studies on other plants, which suggested that *Agrobacterium* transfection might greatly enhance rutin content (Fu et al., 2006). Previous work also indicated that biotic elicitors, e.g., yeast polysaccharide (Zhao et al., 2014a) and exogenous fungal mycelia (Zhao et al., 2014b), can enhance rutin and quercetin production in *F. tataricum* hairy roots.

To understand the role of UV-B in regulating flavonoid biosynthesis, the transcript abundance of genes encoding key enzymes of the flavonoid biosynthetic pathway was analyzed by qRT-PCR. Previous studies have extensively described the UV-B induction of the expression of these genes (Frohnmeyer et al., 1992; Kubasek et al., 1992; Christie and Jenkins, 1996; Kliebenstein et al., 2002; Stracke et al., 2010). Flavonoid synthesis in plants is induced by the perception of UV-B light by photoreceptors and by the expression of phenylpropanoid biosynthesis genes (Rizzini et al., 2011). These works considered the key flavonoid biosynthetic enzymes PAL, C4H, CHI, CHS, FLS, F3H, etc. to be excellent markers of the UV-B signaling response. Amongst these genes activated by UV light, PAL plays an important role in the first step of the flavonoid biosynthesis pathway. According to Hahlbrock et al. (2003), the PAL promoter contains Box-P, and Box-P-binding factor 1 was induced by UV stimulation and up-regulated PAL expression upon binding to Box-P. In UV-irradiated rice seedlings, PAL expression was reported to reach its maximum at 12 h, increasing the flavonoid content (Jin et al., 2000). Hao et al. (2009) also reported that PAL activation in *Ginkgo biloba* callus could be induced by UV-B and that flavonoid biosynthesis could be stimulated. In *Arabidopsis*, the UV-B-mediated induction of *CHS* expression was dependent on UVR8, COP1, and HY5, proteins belonging to the UV-B signaling pathway (Kliebenstein et al., 2002; Stracke et al., 2010). HY5 is widely known to bind directly to the *CHS* promoter. However, this binding was not sufficient for *CHS* transcriptional activation, whereas the overexpression of HY5 fused with an activation domain was sufficient to induce *CHS* expression, indicating that a presently unknown UV-B-activated transcription factor must be involved as well (Stracke et al., 2010). In the present study, UV-B irradiation stimulated the expression of *PAL* in hairy roots of tartary buckwheat, which showed significant expression after the irradiation (**Figure 5B**). Box-P is also conserved in the promoter regions of other phenylpropanoid biosynthesis genes and UV-responsive genes. For example, AtMYB4 in *Arabidopsis* represses the expression of the *C4H* gene, whereas UV irradiation cancels this repression and induces the expression of *C4H* (Jin et al., 2000). UV-B irradiation could trigger flavonol and anthocyanin biosynthesis in grapevine berries, as shown by the up-regulation of key biosynthetic genes (*FLS1* and *UFGT*) and an increased anthocyanin concentration (Martinez-Lüscher et al., 2014). These studies provide useful evidence for the effect of UV-B exposure on flavonoid biosynthesis.

This study demonstrated that UV-B treatment dramatically changed the transcript abundance of *FtPAL*, *FtCHI*, *FtCHS*, *FtF3H*, and *FtFLS-1* in *F. tataricum* hairy roots. Furthermore, the expression of *FtF3′H-1*, *FtF3′H-2*, and *FtFLS-2* was enhanced slightly after the UV-B treatment. A single irradiation with UV-B increased the production of rutin and quercetin, and the maximum production correlated with the expression of flavonoid biosynthesis enzyme genes. In previous work, Thwe et al. (2013) investigated various genes in the phenylpropanoid biosynthetic pathway to analyze the *in vitro* production of anthocyanin and phenolic compounds in hairy root cultures derived from two cultivars of tartary buckwheat. The results showed that phenylpropanoid biosynthetic pathway genes were expressed in hairy roots, and rutin and anthocyanin were identified in hairy roots of tartary buckwheat. That work provides useful information on the molecular and physiological dynamics that correlate with phenylpropanoid biosynthetic gene expression and phenolic compound content in *F. tataricum*. Based on these works, flavonoid production in *F. tataricum* appears to be achieved through complex regulation of genes involved in the flavonoid biosynthetic pathway.

### CONCLUSION

We have established an efficient protocol for *A. rhizogenes*-mediated genetic transformation of tartary buckwheat (*F. tataricum*), and applied PCR analysis to the detection of hairy roots. Hairy roots cultured in 1/2 MS liquid medium supplemented with 30 g l<sup>−1</sup> sucrose grew faster than normal roots under standard liquid culture conditions, and had a higher rutin content. We also evaluated the effects of UV-B radiation on hairy roots and all other plant organs. We found that the rutin and quercetin content was significantly higher in hairy roots than in WT (non-transformed) roots. The expression of flavonoid biosynthetic genes under UV-B treatment was examined by quantitative real-time PCR. The results showed that the expression of key regulated genes in the flavonoid biosynthetic pathway increased sharply. Further studies using dose-dependent UV-B irradiation could determine the possible effects on tartary buckwheat flavonoids and anthocyanins and clarify the signal transduction pathway involved in their regulation by UV-B.

#### AUTHOR CONTRIBUTIONS

XH completed the *A. rhizogenes*-mediated transformation experiment, the determination of the rutin content of hairy roots and the qRT-PCR experiment; JY completed the experiment on the effect of UV-B light stress treatment on rutin production in hairy roots; YZ assisted with the *A. rhizogenes*-mediated transformation experiment; DX assisted with the rutin content determination experiment; and QZ guided the whole research as the corresponding author.

#### ACKNOWLEDGMENTS

This study was supported by grants from the National Natural Science Foundation of China (31300223), National Science Foundation for Fostering Talents in Basic Research of the National Natural Science Foundation of China (J1210063), Natural Science Foundation of Shaanxi Province (2012JQ3003), Specialized Research Fund for the Doctoral Program of Higher Education (20126101120019), Opening Foundation of Key Laboratory of Resource Biology and Biotechnology in Western China (Northwest University), Ministry of Education (ZS12013), Key Scientific Research Project of Provincial Education Department of Shaanxi (15JS110), Provincial Training Programs of Innovation and Entrepreneurship for Undergraduate (1030), and Scientific Research Foundation for Returned Overseas Chinese Scholars, State Education Ministry. In addition, the authors are grateful to Dr. S. Anderson for English editing of the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00063




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer (Silvia Massa) and Handling Editor declared their shared affiliation, and the Handling Editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2016 Huang, Yao, Zhao, Xie, Jiang and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# *Cannabis sativa*: The Plant of the Thousand and One Molecules

#### *Christelle M. Andre\*, Jean-Francois Hausman and Gea Guerriero*

*Environmental Research and Innovation, Luxembourg Institute of Science and Technology, Esch-sur-Alzette, Luxembourg*

*Cannabis sativa* L. is an important herbaceous species originating from Central Asia, which has been used in folk medicine and as a source of textile fiber since the dawn of time. This fast-growing plant has recently seen a resurgence of interest because of its multi-purpose applications: it is indeed a treasure trove of phytochemicals and a rich source of both cellulosic and woody fibers. The pharmaceutical and construction sectors are equally interested in this plant, since its metabolites show potent bioactivities on human health and its outer and inner stem tissues can be used to make bioplastics and concrete-like material, respectively. In this review, the rich spectrum of hemp phytochemicals is discussed with a special emphasis on molecules of industrial interest, including cannabinoids, terpenes and phenolic compounds, and their biosynthetic routes. Cannabinoids represent the most studied group of compounds, mainly due to their wide range of pharmaceutical effects in humans, including psychotropic activities. The therapeutic and commercial interests of some terpenes and phenolic compounds, in particular stilbenoids and lignans, are also highlighted in view of the most recent literature data. Biotechnological avenues to enhance the production and bioactivity of hemp secondary metabolites are proposed by discussing the power of plant genetic engineering and tissue culture. In particular, two systems are reviewed, i.e., cell suspension and hairy root cultures. Additionally, an entire section is devoted to hemp trichomes, in the light of their importance as phytochemical factories. Finally, prospects on the benefits linked to the use of *-omics* technologies, such as metabolomics and transcriptomics, to speed up the identification and the large-scale production of lead agents from bioengineered *Cannabis* cell culture are presented.

Keywords: fibers, hemp, *Cannabis*, cellulose, lignin, cannabinoids, terpenes, lignans

#### INTRODUCTION

The current climatic and economic scenario pushes toward the use of sustainable resources to reduce our dependence on petrochemicals and to minimize the impact on the environment. Plants are precious natural resources, because they can supply both phytochemicals and lignocellulosic biomass. In this review, we focus on hemp (*Cannabis sativa* L.), since it is a source of fibers, oil and molecules and as such it is an emblematic example of a multi-purpose crop. We treat the aspects related to the use of hemp biomass and, more extensively, those linked to its wide variety of phytochemicals.

#### *Edited by:*

*Eugenio Benvenuto, ENEA, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Biswapriya Biswavas Misra, University of Florida, USA Felix Stehle, Technical University of Dortmund, Germany*

> *\*Correspondence: Christelle M. Andre christelle.andre@list.lu*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 27 October 2015 Accepted: 08 January 2016 Published: 04 February 2016*

#### *Citation:*

*Andre CM, Hausman J-F and Guerriero G (2016) Cannabis sativa: The Plant of the Thousand and One Molecules. Front. Plant Sci. 7:19. doi: 10.3389/fpls.2016.00019*

Known since ancient times for its medicinal and textile uses (Russo et al., 2008; Skoglund et al., 2013), hemp is currently witnessing a revival because of its rich repertoire of phytochemicals, its fibers and its agricultural features, namely good resistance to drought and pests, a well-developed root system that prevents soil erosion, and a lower water requirement than other crops, e.g., cotton. This shows the great versatility of this fiber crop and encourages future studies focused on both *Cannabis* (bio)chemistry and genetic engineering. Hemp varieties producing oil, biomass or even both are currently cultivated, and the availability of the hemp genome sequence greatly helps molecular studies on this important crop (van Bakel et al., 2011). In addition, the scientific community is very much interested in harnessing *Cannabis* pharmacological power: for example, microorganisms are being engineered to produce Δ<sup>9</sup>-tetrahydrocannabinolic acid (THCA) and cannabidiolic acid (CBDA) (Taura et al., 2007a; Zirpel et al., 2015).

The ultimate aim of this review is to discuss the potential of hemp for industry and to highlight its importance for the bio-economy. More specifically, we: (i) describe the use of hemp biomass (i.e., the fibers), (ii) discuss hemp molecules of industrial interest (namely cannabinoids, terpenes and phenolic compounds), (iii) describe the potential of hemp trichomes as pharma-factories and (iv) discuss the potential of genetic engineering, by describing the use of plant cell suspension and hairy root cultures.

## HEMP STEM: A SOURCE OF FIBERS WITH ANTIBACTERIAL PROPERTIES

Plant lignocellulosic biomass is an abundant renewable resource, which can provide biopolymers, fibers, chemicals and energy (Guerriero et al., 2014, 2015, 2016). Trees are important for the provision of wood; however, fast-growing herbaceous species, like textile hemp (which has a THC content <0.3%; Weiblen et al., 2015), can also provide high biomass quantities in a short time. The stem of this fiber crop supplies both cellulosic and woody fibers: the core is indeed lignified, while the cortex harbors long cellulose-rich fibers, known as bast fibers (**Figure 1**) (Guerriero et al., 2013).

This heterogeneous cell wall composition makes hemp stem an interesting model to study secondary cell wall biosynthesis, in particular the molecular events underlying the deposition of cortical gelatinous bast fibers and core woody fibers.

*Cannabis* woody fibers (a.k.a. "hurds" or "shivs") are used for animal bedding because of their high absorption capacity and for the creation of a concrete-like material.

Hemp bast fibers are used in the biocomposite sector as a substitute for glass fibers. The automotive industry is particularly keen on using hemp bast fibers to produce bioplastics: this material is stronger than polypropylene plastic and lighter in weight (Marsh, 2003).

Beyond the applications in the construction and automotive industries, hemp fibers are also attractive in the light of their natural antibacterial property. Hemp bast fibers have indeed been described as antibacterial (Hao et al., 2014; Khan et al., 2015) and their use for the manufacture of an antibacterial finishing agent (Bao et al., 2014), surgical devices (Gu, 2006) or functionalized textiles (Cassano et al., 2013) has been reported. This property is linked to the chemical composition of hemp bast fibers: both free and esterified sterols and triterpenes have been identified, including β-sitosterol and β-amyrin (Gutiérrez and del Río, 2005). These compounds possess known antibacterial properties (Kiprono et al., 2000; Ibrahim, 2012). Hemp bast fibers were also found to contain cannabinoids (2% of the total metabolite extract) (Bouloc et al., 2013 and references therein). More recently, hemp hurd powder showed antibacterial properties against *Escherichia coli* (Khan et al., 2015). Since the hurd has a higher lignin content than the bast fibers, its antibacterial property may be linked to lignin-related compounds such as phenolic compounds, as well as alkaloids and cannabinoids (Appendino et al., 2008; Khan et al., 2015).

## HEMP PHYTOCHEMICALS: THEIR PRODUCTION PATHWAYS AND MYRIAD OF BIOLOGICAL ACTIVITIES

Numerous chemicals are produced in hemp through the secondary metabolism. They include cannabinoids, terpenes and phenolic compounds (Flores-Sanchez and Verpoorte, 2008) and will be further described in the next sections. Although the pharmacological properties of cannabinoids have been studied extensively and cannabinoids are the most recognized hemp bioactives, the other components have no reason to envy them, as they have also been associated with potent health-promoting properties. Research on *Cannabis* phytochemicals, as well as the widespread therapeutic use of *Cannabis* products, has been limited for various reasons, including illegality of cultivation (due to its psychoactivity and potential for inducing dependence), variability of active components, and low abundance of some of them *in planta*. Further attention is now being drawn toward non-THC *Cannabis* active components, which may act synergistically and contribute to the pharmacological power and entourage effects of *Cannabis*-based medicinal extracts (Russo, 2011).

#### Phytocannabinoids

Phytocannabinoids represent a group of C21 or C22 (for the carboxylated forms) terpenophenolic compounds predominantly produced in *Cannabis*. They have also been reported in plants from the *Radula* and *Helichrysum* genera (Appendino et al., 2008), but our knowledge of non-*Cannabis* sources of cannabinoids is still in its infancy (Gertsch et al., 2010). More than 90 different cannabinoids have been reported in the literature, although some of these are breakdown products (ElSohly and Slade, 2005; Brenneisen, 2007; Radwan et al., 2009; Fischedick et al., 2010), and they are generally classified into 10 subclasses (Brenneisen, 2007). In this review, we will focus on the most abundant compounds found in the drug- and fiber-type *Cannabis*. The predominant compounds are THCA, CBDA and cannabinolic acid (CBNA), followed by cannabigerolic acid (CBGA), cannabichromenic acid (CBCA) and cannabinodiolic acid (CBNDA) (ElSohly and Slade, 2005). THCA is the major cannabinoid in the drug-type *Cannabis*, while CBDA predominates in fiber-type hemps. CBCA has been reported to dominate in the cannabinoid fraction of young plants and to decline with maturation (Meijer et al., 2009). The phytocannabinoid acids are non-enzymatically decarboxylated into their corresponding neutral forms, a process that occurs both within the plant and, to a much larger extent, upon heating after harvesting (Flores-Sanchez and Verpoorte, 2008). Phytocannabinoids accumulate in the secretory cavity of the glandular trichomes, which largely occur in female flowers and in most aerial parts of the plants, as further described in the next section. They have also been detected in low quantity in other parts of the plants, including the seeds (Ross et al., 2000), roots (Stout et al., 2012) and the pollen (Ross et al., 2005), to an extent depending on the drug- or fiber-type of *Cannabis*, as described in **Table 1**. More generally, the concentration of these compounds depends on tissue type (**Table 1**), age, variety, growth conditions (nutrition, humidity, light level), harvest time and storage conditions (Khan et al., 2014). The level of phytocannabinoids in hempseeds, and thereby in hempseed oil, should be very low, as the kernel contains only trace amounts of THC or CBD (Leizer et al., 2000; Ross et al., 2000). However, higher THC concentrations are found on the outside surface of the seed coat, possibly as the result of contamination with plant leaves or flowers (Ross et al., 2000). Recently, significant amounts of cannabinoids, and particularly of THC, were found in five out of 11 hempseed oil samples available on the Croatian market, suggesting that contamination is due both to improper processing procedures and to the illegal use of drug-type hemp (with a (THC + CBN)/CBD ratio >1) for nutritional purposes (Petrović et al., 2015).
Cannabinoids in the leaves have been shown to decrease with age and along the stem axis, with the highest levels observed in the leaves of the uppermost nodes (Pacifico et al., 2008). Data on cannabinoid contents in the stem are scarce in the literature. An analysis performed on the dust obtained from the top section of the stem of fiber-type hemp (low percentage of bast fibers) revealed a low THC and CBD content (0.04 and 1.3% on average, respectively) (Cappelletto et al., 2001). Kortekaas et al. (1995) analyzed the cannabinoid content of hemp black liquor. The sum of the THC and CBD fractions (the distinct amounts of each were not reported) in hemp stem wood and bark extractives was 2 and 1%, respectively, which represented 0.003 and 0.0005% of the total fiber content.
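The ratio criterion quoted above for drug-type hemp, (THC + CBN)/CBD > 1, can be written as a small helper. This is only a sketch: the function names and the second sample are hypothetical, CBN is assumed to be zero for the stem dust sample because it is not reported, and actual chemotype assignment may rely on criteria not covered here.

```python
def thc_cbn_to_cbd_ratio(thc: float, cbn: float, cbd: float) -> float:
    """Return (THC + CBN) / CBD for contents given in the same unit (e.g. % w/w)."""
    if cbd <= 0:
        raise ValueError("CBD content must be positive to compute the ratio")
    return (thc + cbn) / cbd


def is_drug_type(thc: float, cbn: float, cbd: float) -> bool:
    """Apply the ratio > 1 criterion quoted in the text."""
    return thc_cbn_to_cbd_ratio(thc, cbn, cbd) > 1


# Stem dust of fiber-type hemp from the text: 0.04% THC, 1.3% CBD (CBN assumed 0).
print(is_drug_type(thc=0.04, cbn=0.0, cbd=1.3))

# Hypothetical high-THC sample; these values are invented for illustration only.
print(is_drug_type(thc=15.0, cbn=0.5, cbd=0.2))
```

With these inputs, the fiber-type stem dust falls well below the threshold, whereas the invented high-THC sample exceeds it.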

## Biosynthetic Pathway Leading to Phytocannabinoids

The biosynthesis of cannabinoids in *C. sativa* has only recently been elucidated. The precursors of cannabinoids originate from two distinct biosynthetic pathways: the polyketide pathway, giving rise to olivetolic acid (OLA), and the plastidial 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway, leading to the synthesis of geranyl diphosphate (GPP) (Sirikantaramas et al., 2007) (**Figure 2**). OLA is formed from hexanoyl-CoA, derived from the short-chain fatty acid hexanoate (Stout et al., 2012), by aldol condensation with three molecules of malonyl-CoA. This reaction is catalyzed by a recently discovered polyketide synthase (PKS) enzyme and an olivetolic acid cyclase (OAC) (Gagne et al., 2012). The geranylpyrophosphate:olivetolate geranyltransferase catalyzes the alkylation of OLA with GPP, leading to the formation of CBGA, the central precursor of various cannabinoids (Fellermeier and Zenk, 1998). Three oxidocyclases are then responsible for the diversity of cannabinoids: the THCA synthase (THCAS) converts CBGA to THCA, while CBDA synthase (CBDAS) forms CBDA and CBCA synthase (CBCAS) produces CBCA (Sirikantaramas et al., 2004, 2005; Taura et al., 2007b). Propyl cannabinoids (cannabinoids with a C3 side-chain, instead of a C5 side-chain), such as tetrahydrocannabivarinic acid (THCVA), synthesized from a divarinolic acid precursor, have also been reported in *Cannabis* (Flores-Sanchez and Verpoorte, 2008).
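The pathway described above can be summarized as a small directed graph. The sketch below is merely one way of encoding the steps named in the text (the dictionary layout and the `upstream` helper are illustrative assumptions); it lists, for each product, its substrates and the enzyme(s) mentioned, and collects all precursors of a given cannabinoid.

```python
# product -> (substrates, enzyme(s)), using only the steps named in the text
PATHWAY = {
    "OLA": (("hexanoyl-CoA", "malonyl-CoA"), "PKS + OAC"),
    "CBGA": (("OLA", "GPP"), "geranylpyrophosphate:olivetolate geranyltransferase"),
    "THCA": (("CBGA",), "THCAS"),
    "CBDA": (("CBGA",), "CBDAS"),
    "CBCA": (("CBGA",), "CBCAS"),
}


def upstream(compound: str, pathway: dict = PATHWAY) -> set:
    """Return every precursor from which `compound` is derived in this map."""
    seen = set()
    stack = [compound]
    while stack:
        current = stack.pop()
        substrates, _enzyme = pathway.get(current, ((), None))
        for substrate in substrates:
            if substrate not in seen:
                seen.add(substrate)
                stack.append(substrate)
    return seen


print(sorted(upstream("THCA")))
```

Because THCA, CBDA and CBCA all derive from CBGA, the three end products share the same set of upstream precursors in this representation.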

#### Health Benefits Linked to Cannabinoids

The pharmacology of phytocannabinoids has previously been reviewed elsewhere (Pacher et al., 2006; Russo, 2011; Hill et al., 2012; Giacoppo et al., 2014; Burstein, 2015) and a brief summary and update will be presented hereafter.

Most of the biological properties related to cannabinoids rely on their interactions with the endocannabinoid system in humans. The endocannabinoid system includes two G protein-coupled cannabinoid receptors, CB1 and CB2, as well as two endogenous ligands, anandamide and 2-arachidonylglycerol. Endocannabinoids are thought to modulate or play a regulatory role in a variety of physiological processes, including appetite, pain sensation, mood, memory, inflammation, insulin sensitivity and fat and energy metabolism (De Petrocellis et al., 2011; Di Marzo and Piscitelli, 2015). THC, the psychoactive decarboxylated form of THCA, is a partial agonist of both CB1 and CB2 receptors, but has a higher affinity for the CB1 receptor, which appears to mediate its psychoactive properties. In addition to being present in the central nervous system and throughout the brain, CB1 receptors are also found in immune cells and in gastrointestinal, reproductive, adrenal, heart, lung and bladder tissues, where cannabinoids can therefore also exert their activities. CB2 receptors are thought to have immunomodulatory effects and to regulate cytokine activity. THC nevertheless has more molecular targets than just CB1 and CB2 receptors, and exhibits potent anti-inflammatory, anti-cancer, analgesic, muscle relaxant, neuro-antioxidative (De Petrocellis et al., 2011), and antispasmodic activities (Pacher et al., 2006). However, THC has also been associated with a number of side effects, including anxiety, cholinergic deficits, and immunosuppression (Russo, 2011). CBDA is the most prevalent phytocannabinoid in fiber-type hemp, and the second most important in the drug chemotypes. CBD (the decarboxylation product of CBDA) presents a large array of pharmacological properties, as recently reviewed in Burstein (2015), which have been downplayed for many years compared to those of THC. Yet CBD acts as an important entourage compound, as it is able to reduce the side effects of THC (Englund et al., 2012) and may thereby increase the safety of *Cannabis*-based extracts. CBD itself has been shown in *in vitro* and animal studies to possess, among others, anti-anxiety, anti-nausea, anti-arthritic, anti-psychotic, anti-inflammatory, and immunomodulatory properties (Burstein, 2015). CBD is a very promising cannabinoid, as it has also shown potential as a therapeutic agent in preclinical models of central nervous system diseases such as epilepsy, neurodegenerative diseases, schizophrenia, multiple sclerosis, affective disorders and the central modulation of feeding behavior (Hill et al., 2012). Interestingly, CBD also presents strong anti-fungal and antibacterial properties and, more interestingly, powerful activity against methicillin-resistant *Staphylococcus aureus* (MRSA) (Appendino et al., 2008). After THC and CBD, CBC is the third most prevalent phytocannabinoid. CBC notably presents anti-inflammatory (Delong et al., 2010), sedative, analgesic (Davis and Hatoum, 1983), anti-bacterial and antifungal properties (ElSohly et al., 1982). CBC is also a potent inhibitor of the uptake of anandamide, an endogenous ligand of CB receptors (De Petrocellis et al., 2011). CBN is a degradation product of THC and is mostly found in aged *Cannabis*. CBN has a twofold lower affinity for CB1 receptors and a threefold higher affinity for the CB2 receptors, as compared to THC. It thus affects cells of the immune system more than the central nervous system, as reviewed in McPartland and Russo (2001). Current cannabinoid-based therapeutic treatments are limited to special cases, i.e., spasticity associated with multiple sclerosis in adult patients, nausea/vomiting linked to cancer therapies, and appetite stimulation in HIV-positive patients (Giacoppo et al., 2014; Lynch and Ware, 2015). Borrelli et al. (2013), after highlighting the beneficial effects of CBG on murine colitis, suggest that this cannabinoid should also be considered for clinical experimentation in patients affected by inflammatory bowel disease.

#### Adverse Health Effects of Cannabinoids

As mentioned earlier, the recreational and medical use of *Cannabis*, as well as of THC and other synthetic cannabinoids, has also been associated with numerous side effects. Two recent reviews (Volkow et al., 2014; van Amsterdam et al., 2015) notably reported the adverse health effects linked to the use of natural *Cannabis* and synthetic cannabinoids, respectively. When adjusted for confounders such as cigarette smoking, the impact of short- and long-term use appears to be similar for both types of consumption and is directly linked to the level of THC or its synthetic analog. The THC content of recreational *Cannabis* has indeed drastically increased in the last 30 years (from 3% in the 1980s to almost 20% now, as reported in **Table 1**), with very low levels of the other cannabinoids such as CBD. Effects of short-term use include memory and cognitive deficits, impaired motor coordination, and psychosis. Long-term use of THC has been associated with an increased risk of addiction, cognitive impairment, altered brain development when initial use occurred early in adolescence, and an increased risk of chronic psychotic disorders, including schizophrenia. The protective role that CBD could play in alleviating these negative effects is now well established and documented (Iseger and Bossong, 2015).

#### Terpenes

Terpenes form the largest group of phytochemicals, with more than 100 molecules identified in *Cannabis* (Rothschild et al., 2005; Brenneisen, 2007). Terpenes are responsible for the odor and flavor of the different *Cannabis* strains. They have therefore likely contributed to the selection of *Cannabis* narcotic strains under human domestication (Small, 2015). Terpenes are classified in diverse families according to the number of repeating units of 5-carbon building blocks (isoprene units), such as monoterpenes with 10 carbons, sesquiterpenes with 15 carbons, and triterpenes derived from a 30-carbon skeleton. Terpene yield and distribution in the plant vary according to numerous parameters, such as processes for obtaining essential oil, environmental conditions, or maturity of the plant (Meier and Mediavilla, 1998; Brenneisen, 2007). Mono- and sesquiterpenes have been detected in flowers, roots, and leaves of *Cannabis*, with the secretory glandular hairs as the main production site. Monoterpenes generally dominate the volatile terpene profile (from 3.1 to 28.3 mg g<sup>−1</sup> of flower dry weight; Fischedick et al., 2010) and mainly include D-limonene, β-myrcene, α- and β-pinene, terpinolene and linalool. Sesquiterpenes, β-caryophyllene and α-humulene in particular, also occur to a large extent in *Cannabis* extracts (from 0.5 to 10.1 mg g<sup>−1</sup> of flower dry weight; Fischedick et al., 2010). Triterpenes have also been detected in hemp roots, as friedelin and epifriedelanol (Slatkin et al., 1971), in hemp fibers as β-amyrin (Gutiérrez and del Río, 2005) and in hempseed oil as cycloartenol, β-amyrin, and dammaradienol (Paz et al., 2014).

Terpenes, along with cannabinoids, have successfully been used as chemotaxonomic markers in *Cannabis,* as they are both considered as the main physiologically active secondary metabolites (Fischedick et al., 2010; Elzinga et al., 2015). When grown in standardized conditions, a significant and positive correlation was found between the level of terpenes and cannabinoids (Fischedick et al., 2010). This may be explained by the fact that mono- and sesquiterpenes are synthesized in the same glandular trichomes in which the cannabinoids are produced (Meier and Mediavilla, 1998). This association was, however, not confirmed on a larger panel of samples coming from different origins (Elzinga et al., 2015).

## Biosynthetic Pathways Leading to the Different Classes of Terpenes

Two different biosynthetic pathways contribute, in their early steps, to the synthesis of plant-derived terpenes (**Figure 2**). Whereas the cytosolic mevalonic acid (MVA) pathway is involved in the biosynthesis of sesqui-, and tri-terpenes, the plastid-localized MEP pathway contributes to the synthesis of mono-, di-, and tetraterpenes (Bouvier et al., 2005). MVA and MEP are produced through various and distinct steps, from two molecules of acetyl-coenzyme A and from pyruvate and D-glyceraldehyde-3-phosphate, respectively. They are further converted to isopentenyl diphosphate (IPP) and isomerised to dimethylallyl diphosphate (DMAPP), the end point of the MVA and MEP pathways. In the cytosol, two molecules of IPP (C5) and one molecule of DMAPP (C5) are condensed to produce farnesyl diphosphate (FPP, C15) by farnesyl diphosphate synthase (FPS). FPP serves as a precursor for sesquiterpenes (C15), which are formed by terpene synthases and can be decorated by other various enzymes. Two FPP molecules are condensed by squalene synthase (SQS) at the endoplasmic reticulum to produce squalene (C30), the precursor for triterpenes and sterols, which are generated by oxidosqualene cyclases (OSC) and are modified by various tailoring enzymes. In the plastid, one molecule of IPP and one molecule of DMAPP are condensed to form GPP (C10) by GPP synthase (GPS). GPP is the immediate precursor for monoterpenes (Kempinski et al., 2015).
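The carbon bookkeeping behind these condensation steps can be checked with a few lines of arithmetic. The sketch below is illustrative only: it tracks carbon counts and ignores diphosphate groups and all other atoms, and the `condense` helper and the class table are hypothetical names introduced here.

```python
ISOPRENE_UNIT_CARBONS = 5  # IPP and DMAPP are both C5 units


def condense(*carbon_counts: int) -> int:
    """Carbon count of a condensation product (carbon atoms are conserved)."""
    return sum(carbon_counts)


IPP = DMAPP = ISOPRENE_UNIT_CARBONS
GPP = condense(IPP, DMAPP)        # plastid: precursor of monoterpenes
FPP = condense(IPP, IPP, DMAPP)   # cytosol: precursor of sesquiterpenes
SQUALENE = condense(FPP, FPP)     # precursor of triterpenes and sterols

# Number of isoprene units per terpene class
TERPENE_CLASSES = {
    "monoterpene": 2,
    "sesquiterpene": 3,
    "diterpene": 4,
    "triterpene": 6,
    "tetraterpene": 8,
}

for name, units in TERPENE_CLASSES.items():
    print(f"{name}: C{units * ISOPRENE_UNIT_CARBONS}")
print(f"GPP: C{GPP}, FPP: C{FPP}, squalene: C{SQUALENE}")
```

The totals match the labels used in the text: GPP is C10, FPP is C15 and squalene is C30.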

# Health Benefits Associated with Terpenes

Terpenes are lipophilic compounds that easily cross membranes, and the blood-brain barrier in particular (Fukumoto et al., 2006). They present a wide array of pharmacological properties, which have recently been described in several reviews (Russo, 2011; Singh and Sharma, 2015). The biological activities of D-limonene, also commonly found in *Citrus* essential oils, have been well described in the literature. It notably exhibits potent anti-cancer, anxiolytic and immunostimulating properties in humans (Komori et al., 1995). β-Myrcene, a terpene commonly found in hop, is recognized as a potent anti-inflammatory, analgesic, and anxiolytic component (Cleemput et al., 2009). α-Pinene is an acetylcholinesterase inhibitor, and may thereby aid memory abilities (Kennedy et al., 2011), which could counteract the memory deficits induced by THC. Linalool, commonly found in *Lavandula angustifolia,* possesses similar properties to the ones described for its monoterpene counterparts, i.e., analgesic, anti-anxiety, anti-inflammatory, and anticonvulsant (Russo, 2011). β-Caryophyllene, a well-known active principle of black pepper and Copaiba balsam, possesses potent anti-inflammatory and gastric cytoprotective activities (Singh and Sharma, 2015). Interestingly, it selectively binds to the CB2 receptor and could therefore technically be considered as a phytocannabinoid (Gertsch et al., 2008). Pentacyclic triterpenes such as β-amyrin and cycloartenol have been shown to possess numerous biological activities including anti-bacterial, antifungal, anti-inflammatory and anti-cancer properties (Vázquez et al., 2012; Moses et al., 2013). These triterpenes are key contributors to the pharmacological properties of numerous medicinal herbs (Kirby et al., 2008; Yadav et al., 2010; Sawai and Saito, 2011).

# Phenolic Compounds

Phenolic compounds, also known as phenylpropanoids, constitute one of the most widely distributed groups of secondary metabolites in the plant kingdom. They present more than 10,000 different structures, including phenolic acids, such as benzoic and hydroxycinnamic acids, flavonoids such as flavonols and flavones, stilbenes and lignans (Andre et al., 2010). In *Cannabis*, about 20 flavonoids have been identified, mainly belonging to the flavone and flavonol subclasses (Flores-Sanchez and Verpoorte, 2008). These include the *O*-glycoside versions of the aglycones apigenin, luteolin, kaempferol and quercetin, as well as cannflavin A and cannflavin B, which are methylated isoprenoid flavones that are unique to *Cannabis* (**Figure 2**) (Ross et al., 2005). Phenolic amides and lignanamides have also been described in *Cannabis* fruits and roots (Sakakibara et al., 1992; Lesma et al., 2014). The lignanamides belong to the lignan class of compounds and include cannabisin-like compounds (of the types A, B, C, D, E, F, and G) and grossamide (Flores-Sanchez and Verpoorte, 2008). Similar compounds, such as cannabisin D, have been described in *Cannabis* leaves, where they were strongly induced upon UV-C treatment (Marti et al., 2014). Notable amounts of lignans were recently found in the hydrophilic extract of hemp seeds. The hemp seed lignan profile was shown to be dominated by syringaresinol and medioresinol, followed by secoisolariciresinol, lariciresinol, and pinoresinol (Smeds et al., 2012). Hemp seeds contain, however, about 20 times less total lignans (32 mg of total lignans per 100 g of dry weight) than flax seeds, a well-known source of lignans. Interestingly, the lignan content of hulled hemp seeds represents only 1% of the content in whole seed (Smeds et al., 2012). Nineteen stilbenes have been isolated in *Cannabis* with characteristic structural backbones such as spirans, phenanthrenes and bibenzyls (Flores-Sanchez and Verpoorte, 2008).
They include molecules such as cannabistilbene I, IIa and IIb, as well as dihydroresveratrol. Interestingly, bibenzyl stilbenes, including the putative 3-*O*-methylbatatasin, were strongly induced in *Cannabis* leaves by UV radiation (Marti et al., 2014).
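To put the lignan figures quoted above into perspective, the short Python sketch below derives the values they imply for flax seed and for hulled hemp seed. The variable names are hypothetical, and because the text only states "about 20 times", the flax value is a rough implied estimate rather than a measured figure.

```python
# Figures quoted in the text (Smeds et al., 2012).
HEMP_WHOLE_MG_PER_100G = 32.0  # total lignans in whole hemp seed (dry weight)
FLAX_FOLD_HIGHER = 20          # "about 20 times": an approximation only
HULLED_FRACTION = 0.01         # hulled seed retains about 1% of whole-seed lignans

# Implied estimates, derived purely from the numbers above.
flax_estimate = HEMP_WHOLE_MG_PER_100G * FLAX_FOLD_HIGHER
hulled_estimate = HEMP_WHOLE_MG_PER_100G * HULLED_FRACTION

print(f"Implied flax seed content: ~{flax_estimate:.0f} mg/100 g")
print(f"Implied hulled hemp seed content: ~{hulled_estimate:.2f} mg/100 g")
```

The second figure illustrates why the hull matters: if the 1% ratio holds, almost all of the seed lignans are removed with the hull.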

# Biosynthetic Pathway Leading to the Different Classes of Phenolic Compounds

Phenolic compounds are produced through the phenylpropanoid pathway in the cytoplasm and are subsequently transported to the vacuole or deposited in the cell wall (**Figure 2**). Routes to the major classes of phenolic compounds involve (i) the core phenylpropanoid pathway from phenylalanine to an activated (hydroxy) cinnamic acid derivative (p-coumaroyl CoA), via the actions of phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H, a cytochrome P450) and 4-coumarate-CoA ligase (4CL), as well as specific branch pathways for the formation of (ii) simple esters, lignins and lignans, (iii) flavonoids, (iv) coumarins, and (v) stilbenes (Andre et al., 2009; Naoumika et al., 2010; Docimo et al., 2013) (**Figure 2**). Although the flavonoid pathway has been extensively studied in several plants, there are no specific data on the biosynthesis of flavonoids in *Cannabis*. Generally, lignans such as secoisolariciresinol are produced *in planta* by stereoselective coupling of coniferyl alcohol moieties, via two distinct dirigent proteins, giving rise to (+) or (−) pinoresinol. Each pinoresinol can then be further enantiospecifically reduced to lariciresinol and secoisolariciresinol (Dalisay et al., 2015). The key molecular events associated with the biosynthesis of lignanamides are still unknown. The structure of these molecules suggests, however, a condensation of the precursors tyramine and CoA-esters of coumaric, caffeic, and coniferic acid (Flores-Sanchez and Verpoorte, 2008), followed by an oxidative coupling reaction catalyzed by a dirigent protein, as described for lignans. The flavonoid pathway is initiated by condensation of p-coumaroyl CoA with three molecules of malonyl-CoA, which yields naringenin chalcone (**Figure 2**). Naringenin chalcone is rapidly isomerized by the enzyme chalcone isomerase (CHI) to form naringenin, the branch point between flavonols on the one hand and flavones on the other.
Flavanone 3-hydroxylase (F3H) may subsequently hydroxylate naringenin to produce the dihydroflavonol, dihydrokaempferol, which can be further hydroxylated by flavonoid 3′-hydroxylase (F3′H) to form dihydroquercetin. Dihydrokaempferol and dihydroquercetin are substrates of flavonol synthase (FLS), which catalyzes the production of the flavonols kaempferol and quercetin, respectively. Naringenin may alternatively be converted to apigenin, by a reaction catalyzed by a flavone synthase (FNS). Apigenin can be further hydroxylated by a flavonoid 3′-hydroxylase (F3′H) to form luteolin, which is likely the precursor of the diverse cannflavins (Flores-Sanchez and Verpoorte, 2008).
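The branching of the flavonoid pathway described above can be summarised as a small directed graph. The Python sketch below is an illustrative representation only: the data structure and function names are hypothetical, the edges are limited to the steps named in this section, and the first enzyme (chalcone synthase, CHS) is an assumption, since the text does not name the enzyme that condenses p-coumaroyl CoA with malonyl-CoA.

```python
from collections import deque

# Edges: substrate -> list of (product, enzyme), following the text.
# "CHS" (chalcone synthase) is not named in the text and is assumed here.
PATHWAY = {
    "p-coumaroyl-CoA": [("naringenin chalcone", "CHS")],
    "naringenin chalcone": [("naringenin", "CHI")],
    "naringenin": [("dihydrokaempferol", "F3H"), ("apigenin", "FNS")],
    "dihydrokaempferol": [("dihydroquercetin", "F3'H"), ("kaempferol", "FLS")],
    "dihydroquercetin": [("quercetin", "FLS")],
    "apigenin": [("luteolin", "F3'H")],
}


def enzyme_route(start, target):
    """Breadth-first search; returns the enzymes on the shortest route, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, enzymes = queue.popleft()
        if compound == target:
            return enzymes
        for product, enzyme in PATHWAY.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, enzymes + [enzyme]))
    return None


for end_product in ("kaempferol", "quercetin", "apigenin", "luteolin"):
    route = enzyme_route("p-coumaroyl-CoA", end_product)
    print(f"{end_product}: {' > '.join(route)}")
```

Laid out this way, naringenin is visibly the only node with two outgoing branches leading to different subclasses: F3H commits it to the flavonol branch, FNS to the flavone branch.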

# Health Benefits Associated with Phenolic Compounds

In plants, phenolic compounds may act as antioxidants under certain physiological conditions and, thereby, protect plants against oxidative stress. In humans, a correlation has been shown between dietary phenolic compound intake and a reduced incidence of chronic diseases such as cancers, cardiovascular and neurodegenerative diseases (Arts and Hollman, 2005), but these positive health effects may not be entirely explained by the antioxidant properties of phenolic compounds, as these compounds are poorly bioavailable. Phenolic compounds may induce the up-regulation of endogenous antioxidant enzymes *in vivo*, due to their ability to act as pro-oxidants and generate Reactive Oxygen Species (ROS) (Halliwell et al., 2005). They may also exert their action through non-specific protein binding interactions (Gertsch et al., 2010). The flavones and flavonols found in *Cannabis* exert a wide range of biological effects, including properties shared by terpenes and cannabinoids. They present anti-inflammatory, anti-cancer and neuro-protective properties, as reviewed in Andre et al. (2010). In addition, apigenin has been shown to possess anxiolytic (Murti et al., 2012) and oestrogenic properties (Wang and Kurzer, 1998). The *Cannabis*-specific cannflavins A and B are potent anti-inflammatory compounds, via inhibition of prostaglandin E2 and 5-lipoxygenase (Werz et al., 2014). Health-related studies concerning lignanamides are scarce, but have shown *in vitro* anti-inflammatory (Sun et al., 2014) and cytotoxic activities (Cui-Ying et al., 2002). Lignans in general show a wide array of health-promoting properties including antioxidant, antiviral, antidiabetic, antitumorigenic and anti-obesity activities. Interestingly, secoisolariciresinol, lariciresinol and pinoresinol are converted by the anaerobic intestinal microflora into enterolignans, which are mammalian oestrogen precursors (phytooestrogens) (Wang et al., 2010).
Due to the structural similarity of enterolignans with mammalian oestrogens, these compounds are potentially interesting for combating some hormone-dependent cancers. The mechanisms of action of the lignans are, however, complex, with multiple targets involved (Sainvitu et al., 2012).

# SYNERGISTIC AND ANTAGONISTIC EFFECTS BETWEEN PHYTOCHEMICALS

It is now well accepted that the health benefits of fruits, vegetables and other plant foods are due to the synergy or interactions between the different bioactive compounds or other nutrients present in the whole foods, and not to the action of a sole compound (Liu, 2013). Similarly, *Cannabis*-based therapeutics exert their pharmacological effects in humans via synergistic or antagonistic interactions between the various phytochemicals described above. These interactions may occur through various mechanisms including: (i) bioavailability, (ii) interference with cellular transport processes, (iii) activation of pro-drugs or deactivation of active compounds to inactive metabolites, (iv) action of synergistic partners at different points of the same signaling cascade (multi-target effects) or (v) inhibition of binding to target proteins (Efferth and Koch, 2011). A good example is the stronger muscle-antispastic effect of a *Cannabis* extract compared to pure THC, which represents an important finding for the treatment of multiple sclerosis (Wagner and Ulrich-Merzenich, 2009). Non-THC cannabinoids have been shown to positively influence the side effects induced by THC, for instance through their anti-anxiety activities. CBD may also reduce the induced cognitive and memory deficits in subjects smoking *Cannabis* (Wright et al., 2013). CBD affects the pharmacokinetics of THC through different mechanisms: (i) by fluidizing the membranes and therefore increasing the penetration of THC in muscle cells, and (ii) by inhibiting the P450-mediated hepatic drug metabolism, which is involved in the degradation and elimination of the molecule (Klein et al., 2011). Terpenes may also alter the pharmacokinetics of THC by increasing the blood-brain barrier permeability. This characteristic has notably been used to patent a transdermal patch, which delivers cannabinoids into the bloodstream by using a terpene as a permeation agent (Smith, 2015).
Terpenes may also modulate the affinity of THC for the CB1 receptor and interact with neurotransmitter receptors, which may support contributions of terpenes on cannabinoid-mediated analgesic and psychotic effects (McPartland and Russo, 2001; Russo, 2011). In view of the potential of phytocannabinoid-terpene synergy, it has been suggested to tailor novel therapeutic treatments such as CBD-terpene extracts to be used against acne, MRSA, depression, anxiety, insomnia, dementia and addiction (Russo, 2011).

Flavonoids may also modulate the pharmacokinetics of THC, via inhibition of the hepatic P450 enzymes (3A11 and 3A4) (McPartland and Russo, 2001; Russo, 2011).

Finally, there is an example of predator-targeted synergy between terpenes and phytocannabinoids in the *Cannabis* plant itself: on the one hand, the specific mixture of monoterpenes and sesquiterpenes determines the viscosity, and thereby the stickiness, of the *Cannabis* exudations necessary to trap insects; on the other, the phytocannabinoid acids act as potent insecticidal molecules (Sirikantaramas et al., 2005; Russo, 2011).

## *CANNABIS* TRICHOMES: SMALL FACTORIES OF PHYTOCHEMICALS

Trichomes are epidermal protuberances covering the leaves, bracts and stems of plants and some of them, like the glandular trichomes, are capable of secreting (or storing) secondary metabolites as a defense mechanism. Several papers have focused on the characterization of these specialized structures using *-omics* (Wang et al., 2009a; Schilmiller et al., 2010; McDowell et al., 2011; Jin et al., 2014), because their integrated study can favor the development of technologies harnessing their rich biochemical potential (Schilmiller et al., 2008). An *-omics* database (TrichOME; available at: http://www.planttrichome.org/) enabling comparative analyses in plant trichomes has also been created, allowing researchers to mine data on metabolites, genes and expression profiles (Dai et al., 2010). Additionally, several procedures (in some instances supported by a video demonstration; e.g., Nayidu et al., 2014) for the isolation of trichomes from the leaves of different plant species are available (e.g., Marks et al., 2008; Balcke et al., 2014).

Hemp has different types of trichomes (**Figures 3A–F**) which belong to two categories, i.e., glandular and non-glandular (Happyana et al., 2013). Capitate sessile, capitate stalked and bulbous hemp trichomes are secretory structures (**Figures 3C–F**).

In *Cannabis* THCA is accumulated in the heads (glands) of both capitate-stalked and capitate sessile trichomes, but in the former the content is higher (Mahlberg and Kim, 2004). Notably, in the textile variety, the cannabinoids CBDA and CBCA occur at high concentrations instead of THCA, while the reverse is true for drug strains (Mahlberg and Kim, 2004).

Studies on hemp have demonstrated that THCA is synthesized in the storage cavity and that the enzyme responsible for THCA production, i.e., THCAS, follows a sorting pathway from the secretory cells to the storage cavity (Sirikantaramas et al., 2005). The accumulation in the storage cavity is due to the cytotoxicity of cannabinoids: they indeed induce cell death via apoptosis when supplied for 24 h to both hemp and tobacco cell suspension cultures (Sirikantaramas et al., 2005). Heterologous expression of THCAS fused to GFP in tobacco leads to fluorescence of the trichome heads, thereby confirming the localization of the enzyme in the storage cavity (Sirikantaramas et al., 2005).

Depending on their color, hemp glandular trichomes show different secretory phases (Mahlberg and Kim, 2004): the mature secreting gland appears translucent (at this stage the cannabinoid content is the highest), while aging glands are yellow and senescing brown.

According to the current model cannabinoids are produced via terpenes secreted by plastids present in the disk cells and phenols stored in their vacuole (Mahlberg and Kim, 2004): analyses using the electron microscope have shown that oily secretions (most likely terpenes) round in shape are secreted from the plastids (which have the appearance of reticulate bodies). Subsequently vesicles are released into the cavity together with fibrillar matrix originating from the cell walls of the disk cells. The fibrillar matrix is transported to the subcuticular cell wall and contributes to its thickening via yet unidentified mechanisms (Mahlberg and Kim, 2004).

FIGURE 3 | Hemp trichome types. (A) Unicellular non-glandular trichome; (B) cystolithic trichomes; (C) capitate sessile trichome; (D) capitate-stalked trichome; (E) simple bulbous trichome; (F) complex bulbous trichome. Images kindly provided by Dr. David J. Potter.

Besides cannabinoids, *Cannabis* trichomes produce other secondary metabolites, namely terpenes (see previous paragraph on *Cannabis* phytochemicals), which are responsible for the typical plant aroma (Russo, 2011). Among the *Cannabis* terpenes of low abundance is nerolidol (0.09% of the total terpene content, Ross and ElSohly, 1996), which, interestingly, has antimalarial and anti-leishmanial effects (reviewed by Russo, 2011). Given the pharmacological importance of these compounds, it would be interesting to devise engineering strategies aiming at either boosting the secondary metabolism, or increasing the density of trichomes in *Cannabis*. Among the possible genetic engineering approaches, it is worth mentioning two examples recently reported in *Artemisia annua*. We will discuss only these two examples here, as further discussion on how to scale up the production of cannabinoids is presented later in this review.

It has been recently shown that the transformation of *A. annua* with the *rolB* and *rolC* genes of *Agrobacterium rhizogenes* led to plants with an increased content of artemisinin (Dilshad et al., 2015). The *rol* genes are known for their stimulatory action on plant secondary metabolism (Bulgakov, 2008). The study on *A. annua* showed that *rolB* and *rolC* trigger different effects, with *rolB* showing enhanced production with respect to *rolC*. An additional study on *A. annua* has shown that the expression of a β-glucosidase from *Trichoderma reesei* increases glandular trichome density and artemisinin production (Singh et al., 2015). The hydrolytic enzyme favors the release of active plant growth regulators from the conjugates stored in the plastids, thereby favoring trichome formation, as well as biomass production and leaf area (Singh et al., 2015). It would be interesting to devise an engineering strategy aimed at increasing the density of trichomes in *Cannabis,* by adopting a similar strategy. *–Omics* studies on *Cannabis* trichomes will help identify important genes, among which transcription factors (involved in trichome formation), which can be likewise used for engineering approaches.

# *CANNABIS* BIOTECHNOLOGY: CHALLENGES AND PROSPECTS

*Cannabis* is a precious plant with multiple applications, hence the possibility of engineering it genetically to produce useful compounds/raw products is highly valuable. In this section of the review we will: (i) discuss the progress made in *Cannabis in vitro* propagation together with the biotechnological prospects of *Cannabis* genetic engineering, by highlighting the challenges and benefits, (ii) describe the hairy root culture system as a tool for the scalable production of cannabinoids and (iii) discuss the advantages of the *Cannabis* cell suspension culture system.

# *Cannabis In Vitro* Propagation and Transformation

The cultivation of *Cannabis* is severely regulated in many countries; therefore alternative *in vitro* growth techniques are receiving a lot of attention. The *in vitro* cultivation of *Cannabis* is also an advantageous way to preserve cultivars/clones (Lata et al., 2009a) with specific metabolite signatures.

Methods to multiply *C. sativa* plants *in vitro* via stimulation of axillary buds on nodal segments, or induction of adventitious buds in the shoot tips have been described (Lata et al., 2009a; Wang et al., 2009b). It was shown that micro-propagated plants are genetically stable; therefore the method is appropriate and useful for the clonal multiplication of this important crop (Lata et al., 2010).

A protocol has also been developed for the propagation of hemp via the synthetic seed technology. According to this procedure, axillary buds or nodal segments are encapsulated in calcium alginate beads (Lata et al., 2009b, 2011), which can then be stored and subsequently used for clonal propagation of the plant. This system was shown to allow the successful growth of homogeneous and genetically stable *Cannabis* plants even after 6 months of storage (Lata et al., 2011).

To set up a successful *Cannabis* transformation protocol, the mastery of *in vitro* culture techniques is necessary: whether the strategy adopts plant explants or undifferentiated calli as starting material, the regeneration of the whole plant is a mandatory step. Organ regeneration, in particular shoots, can be quite cumbersome and therefore the screening of different plant growth regulator concentrations and combinations has to be carried out to find the right culture medium composition.

*Cannabis sativa* is notoriously recalcitrant to transformation, because the regeneration efficiencies are quite low and dependent upon the cultivar, tissue, plant age and growth regulator combination (Slusarkiewicz-Jarzina et al., 2005). As an example, although successful transformation of hemp calli via *Agrobacterium tumefaciens* was reported by Feeney and Punja (2003), the undifferentiated cells failed to regenerate shoots. The cells were transformed with phosphomannose isomerase and colorimetric assays showed successful expression of the transgene.

Nevertheless some success in hemp regeneration was reported and shown to be linked to the choice of specific plant growth regulators. For example the addition of thidiazuron (TDZ), which has cytokinin-like activity, was shown to increase the development of shoots in hemp explants (Lata et al., 2009a) and in leaf-derived calli of a high yielding THCA clone (Lata et al., 2010). The herbicide DICAMBA was also reported to favor the regeneration of hemp shoots from calli (Slusarkiewicz-Jarzina et al., 2005).

Transformation protocols using plant explants (thereby avoiding the passage to undifferentiated cells) have been described for several important crops (e.g., cotton, Zapata et al., 1999; jute, Saha et al., 2014). Notably, successful transformation of hemp plants was reported by MacKinnon et al. (2001): the protocol uses shoot tip explants and relies on the regeneration potential of the shoot apical meristem after infection with *A. tumefaciens*. Additionally, a patent application was filed describing *Cannabis* transformation using 1–2 cm hypocotyl explants and the plant growth regulators zeatin and 6-benzylaminopurine (BAP) for shoot regeneration (Sirkowski, 2012).

## Hairy Root Cultures for the Production of Cannabinoids

An additional system offering interesting applications for the industrial production of compounds showing pharmaceutical effects in humans is the hairy root system, a type of *Agrobacterium*-transformed plant tissue culture used to study plant metabolic processes. Transformation of hemp and subsequent establishment of hairy root culture has been described by Wahby et al. (2013) using both *A. rhizogenes* and *A. tumefaciens*. In this study hypocotyls were found to be the most responsive tissue for infection. The hairy root system is very interesting for the production of secondary metabolites in medicinal plants (Jiao et al., 2014; Patra and Srivastava, 2014; Wawrosch et al., 2014; Gai et al., 2015; Tian, 2015) or to engineer model plants to secrete industrially valuable metabolites. For example, in tobacco transgenic hairy roots the production of THCA was successfully obtained by expressing hemp THCAS (Sirikantaramas et al., 2007). The hairy root system is characterized by hormone-independent high growth rate and by the same metabolic potential as the original organ (Pistelli et al., 2010). A protocol for the establishment of hairy roots from *Cannabis* callus cultures has also been described (Farag and Kayser, 2015). In this study calli were grown on full-strength B5 medium supplemented with 4 mg/L 1-Naphthaleneacetic acid (NAA) and their potential of cannabinoid production was evaluated. The authors found that after 28 days of cultivation in the dark, a peak could be observed in the accumulation of cannabinoids in culture media supplemented with different concentrations of indole-3-acetic acid (IAA). However, the yield remained below 2 μg/g of dry weight, thereby showing that further optimizations are still required in this field. The induction of rhizogenesis in undifferentiated *Cannabis* cells is important, because it can be performed on calli overexpressing key transcription factors and/or genes involved in the cannabinoid pathway.

The production of cannabinoids in hemp hairy root cultures can then be further implemented with adsorbents to avoid toxicity issues (a more detailed discussion concerning possible ways to avoid toxicity is presented in the section dedicated to heterologous plant hosts). Alternatively, inducible promoters can be used, for instance the glucocorticoid-inducible promoter, which was already shown to be effective in inducing a controlled, reversible and dosage-dependent expression of GFP in *Catharanthus roseus* hairy roots (Hughes et al., 2002).

#### *Cannabis* Cell Suspension Cultures for the Production of Cannabinoids

Plant cell suspension cultures offer important advantages, as they can be transformed and then cultivated in bioreactors for the production of useful metabolites (Weathers et al., 2010; Bortesi et al., 2012; Liu et al., 2012; Han et al., 2014). *Cannabis* callus cultures are not able to produce any cannabinoids, irrespective of the chemotypes (drug-, hybrid-, or fiber-type) used as mother plants or growth regulators used in the culture medium (Pacifico et al., 2008). The transformation of hemp cell suspension cultures with genes involved in specific metabolic pathways can offer the possibility of enhancing the production of important classes of metabolites such as cannabinoids, but also of others with potential pharmacological use. In this section we will discuss potential biotechnological approaches to boost the production of cannabinoids in *Cannabis* cell suspension cultures.

The increased production of cannabinoids in *Cannabis* cell suspension cultures can be achieved via the expression of transcription factors involved in *Cannabis* gland biochemistry (**Figure 4**). Transcription factors represent a powerful tool in plant metabolic engineering, because of their "cascade" mechanism of action: if master regulators involved in cannabinoid biosynthesis are identified in *C. sativa* trichomes, they could be expressed constitutively or inducibly in *Cannabis* cell suspension cultures. It is important to mention here that two transcription factors belonging to the MYB family were already shown to be preferentially expressed in *Cannabis* glands (Marks et al., 2009) and therefore represent ideal candidates to express. These genes show homology with *Arabidopsis thaliana* MYB112 and MYB12, which are known to be involved in the tolerance to oxidative stress and flavonol biosynthesis, respectively (Marks et al., 2009 and references therein). The expression of these transcription factors in an inducible manner is a strategy worth being tested for the production of cannabinoids. The inducible expression will limit the negative effects caused by the toxicity of the accumulating cannabinoids during the growth of the transformed plant cells (as more thoroughly described in the next section).

In addition to the genetic engineering approach, plant cell suspension cultures can be elicited to boost the production of secondary metabolites. The literature is rich in examples concerning the increased expression of secondary metabolites in plant cells elicited with different factors (reviewed recently by Ncube and Van Staden, 2015). Both biotic and abiotic stress factors can indeed be used to re-direct the plant metabolism: nutrients, light, temperature, fungal elicitors are among the most common factors manipulated.

In hemp suspension cells, elicitation with biotic and abiotic elicitors did not induce an increase in cannabinoids (Flores-Sanchez et al., 2009); however, jasmonic acid was shown to elicit the production of the antioxidant tyrosol (Pec et al., 2010).

It is here worth mentioning the effect of a so far neglected element, silicon (Si). Despite being a non-essential element for plant growth, Si is known to increase plant vigor and to alleviate the effects of exogenous stresses (Epstein, 2009). Very recently Si was shown to alleviate the effects of salt stress and to induce the production of chlorogenic acid in *Lonicera japonica* (Gengmao et al., 2015). Given the stimulatory effects that Si has on plant metabolism, it is interesting to further investigate, from a molecular perspective, the effects of Si supplementation on *Cannabis* secondary metabolite production. Cyclodextrins have also been used in plant cell suspension cultures to enhance the production of various non-polar metabolites such as stilbenes (Yang et al., 2015), phytosterols (Sabater-Jara and Pedreño, 2013) or triterpenes (Goossens et al., 2015). Cyclodextrins are cyclic oligosaccharides consisting of five or more α-D-glucopyranose residues. They are known to form inclusion complexes with lipophilic compounds, including cannabinoids (Hazekamp and Verpoorte, 2006), in their hydrophobic cavity, thereby improving metabolite solubility in an aqueous environment. In addition, cyclodextrins, thanks to their chemical structure similar to that of the alkyl-derived oligosaccharides released from plant cell wall when a fungal infection occurs, act as elicitors of secondary metabolite production (Sabater-Jara and Pedreño, 2013).

It would therefore be worth investigating the effect of cyclodextrins on the production of the non-polar cannabinoids in hemp suspension cell cultures.

# CANNABINOID PRODUCTION IN HETEROLOGOUS PLANT HOSTS: HOW IT CAN BE ACHIEVED AND WHAT SHOULD BE TAKEN INTO ACCOUNT

The expression of genes involved in the cannabinoid biosynthetic pathway in cell suspension cultures of plants other than *Cannabis* represents an interesting alternative for the scalable production of cannabinoids (**Figure 4**). For example synthetic biology could be used to recreate the cannabinoid biosynthetic pathway in heterologous plant cells via the expression of THCAS, together with the upstream enzymes involved in the synthesis of CBG, i.e., the tetraketide synthase (the type III PKS), the aromatic prenyltransferase and the OAC (Gagne et al., 2012). In this respect tobacco bright yellow 2 (BY-2) cells are very interesting expression hosts, given their wide use in plant biotechnology as "workhorse" for the production of recombinant proteins (e.g., Reuter et al., 2014).

The biomimetic production of cannabinoids in heterologous plant hosts is challenging; however, one strategy that is worth taking into account concerns the use of synthetic "metabolons" (Singleton et al., 2014). A "metabolon" is the association of enzymes which carry out a series of sequential reactions in a given pathway. Examples for the occurrence of metabolons exist in plants for pathways involving, e.g., the synthesis of phenylpropanoids (Chen et al., 2014) and the cyanogenic glycoside dhurrin (Nielsen et al., 2008). Entire metabolic pathways can be engineered via the use of synthetic metabolons enabling the association of enzymes in close proximity: this allows a more efficient shunting of intermediates at the active site of enzymes acting in chain (Singleton et al., 2014). One possible way to assemble a synthetic metabolon is via the use of a scaffolding protein enabling the association of the enzymes (Singleton et al., 2014; Pröschel et al., 2015). In the specific case of cannabinoid production, the creation of a synthetic metabolon comprising for instance the type III PKS and OAC (Gagne et al., 2012), together with the aromatic prenyltransferase and the THCAS, can be achieved via the use of (i) dockerin-cohesin modules, (ii) the SH3-, PDZ- and GBD-binding domains of metazoan signaling proteins, or (iii) the SpyTag/SpyCatcher domains (recently reviewed by Pröschel et al., 2015).

The assembly of multimodular constructs for expression in plants is no longer an insurmountable challenge, thanks to the development of methods like the Gateway-mediated cloning (reviewed by Dafny-Yelin and Tzfira, 2007), Golden Gate (Binder et al., 2014), GoldenBraid (Sarrion-Perdigones et al., 2011), to name a few.

When cannabinoids are produced in heterologous plant hosts, toxicity effects have to be taken into account, as it was shown that THCA and CBGA cause cell death via apoptosis in cells of *Cannabis* and tobacco BY-2 (Sirikantaramas et al., 2005). For plant cell suspension cultures cultivated in bioreactors, the *in situ* product removal via a two-phase culture system might be useful to favor the accumulation of the toxic metabolites produced in sites which are separated from the cells (Cai et al., 2012) (**Figure 4**). The use of adsorbents in the culture medium can not only sequester the toxic compounds, but also stimulate the secondary metabolite biosynthesis (Cai et al., 2012 and references therein).

One additional approach that can be used to avoid product toxicity in plant cell suspension cultures is artificial compartmentalization (**Figure 4**). This approach has recently been proposed in *A. annua* cell cultures for the production of artemisinin (Di Sansebastiano et al., 2015). The authors induced the formation of an artificial compartment (generated by membranes deriving from endocytosis and the endoplasmic reticulum-vacuole trafficking) via the expression of a truncated SNARE protein, AtSYP51. The creation of an artificial compartment could be used for the production of cannabinoids, because it can trap and stabilize the toxic secondary metabolites until extraction is performed, in a manner analogous to what was discussed for artemisinin.

# PERSPECTIVES AND CONCLUSION

Hemp is a unique versatile plant, which can provide high biomass quantities in a short time. Hemp stem is used as a source of woody and bast fibers for the construction and automotive industries, while hemp seeds are used as a source of dietary oil and hemp leaves and flowers as a source of bioactive components.

To date, more than 540 phytochemicals have been described in hemp (Gould, 2015), and their pharmacological properties appear to go much beyond psychotic effects, with the capacity to address needs like the relief of chemotherapy-derived nausea and anorexia, and symptomatic mitigation of multiple sclerosis.

Continuously discovering new prototypes of drugs is of tremendous importance to meet tomorrow's challenges in terms of public health (Atanasov et al., 2015). Nature has already provided a large source of new molecules and new skeletons. A recent review reporting the new drugs available on the market during the last 30 years showed that more than 35% of these new drugs have a direct natural origin. This percentage rises to over 60% if we take into account all the drugs whose structure is inspired by a natural pharmacophore (Newman and Cragg, 2012). *Cannabis* presents a colossal potential for enlarging the library of bioactive metabolites. Compounds can be obtained from hemp trichomes, cell suspension cultures, hairy root systems, or via the biotransformation of THCA or CBDA using fungal, bacterial, or plant cells (Akhtar et al., 2015).

Our increasing knowledge of the key molecular components triggering the diverse phytochemical pathways *in planta* (**Figure 2**) may also make it possible, through a genetic engineering approach, to further increase the production of specific cannabinoids, terpenes, or phenolic compounds, or to reconstruct the pathway in heterologous systems using a synthetic biology approach. Apart from the importance of studies focused on improving *Cannabis* genetic transformation, it is necessary to know more about the regulatory mechanisms involved in secondary metabolite production in *C. sativa*. For example, enzymological and structural studies will help devise protein engineering approaches to improve the catalytic functions of key enzymes (Taura et al., 2007a). However, further studies would still be needed to elucidate other key genes involved in biosynthetic pathways of, for instance, less-abundant cannabinoid derivatives. For that purpose, the combination of metabolomics with genome-based functional characterizations of gene products would provide an accelerated path to discovering novel biosynthetic pathways to specialized metabolites. Indeed, the functions of numerous genes have been identified and characterized through the correlation of gene expression and metabolite accumulation (Sumner et al., 2015). Classical approaches have focused on the spatial and temporal distribution of the targeted phytochemicals and on the plant transcriptome, as influenced by the developmental stage and environmental stresses. Given the current resurgence of interest in *Cannabis* phytochemicals, the results of such studies will soon be available.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

CA was involved in the review writing, J-FH was involved in manuscript refinement, and GG initiated the idea of the review and was involved in the manuscript writing.

#### ACKNOWLEDGMENTS

The authors gratefully acknowledge the support of the Fonds National de la Recherche, Luxembourg (Project CANCAN C13/SR/5774202). Laurent Solinhac is gratefully acknowledged for providing the longitudinal cross section image of hemp stem appearing in **Figure 1**. The authors are grateful to Dr David J. Potter (GW Pharmaceuticals Ltd, Salisbury, Wiltshire, UK) for providing the trichome pictures appearing in **Figure 3**.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Andre, Hausman and Guerriero. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Gene Inactivation by CRISPR-Cas9 in *Nicotiana tabacum* BY-2 Suspension Cells

*Sébastien Mercx, Jérémie Tollet, Bertrand Magy, Catherine Navarre and Marc Boutry\**

*Institut des Sciences de la Vie, Université Catholique de Louvain, Louvain-la-Neuve, Belgium*

Plant suspension cells are interesting hosts for the heterologous production of pharmacological proteins such as antibodies. They have the advantage of facilitating containment and the application of good manufacturing practices. Furthermore, antibodies can be secreted to the extracellular medium, which makes the purification steps much simpler. However, improvements are still to be made regarding the quality and the production yield. For instance, the inactivation of proteases and the humanization of glycosylation are both important targets which require either gene silencing or gene inactivation. For this purpose, CRISPR-Cas9 is a very promising technique which has been used recently in a series of plant species, but not yet in plant suspension cells. Here, we sought to use the CRISPR-Cas9 system for gene inactivation in *Nicotiana tabacum* BY-2 suspension cells. We transformed a transgenic line expressing a red fluorescent protein (mCherry) with a binary vector containing genes coding for Cas9 and three guide RNAs targeting *mCherry* restriction sites, as well as a bialaphos resistance (*bar*) gene for selection. To demonstrate gene inactivation in the transgenic lines, the *mCherry* gene was PCR-amplified and analyzed by electrophoresis. Seven out of 20 transformants displayed a shortened fragment, indicating that a deletion occurred between two target sites. We also analyzed the transformants by restriction fragment length polymorphism and observed that the three targeted restriction sites were hit. DNA sequencing of the PCR fragments confirmed either deletion between two target sites or single nucleotide deletion. We therefore conclude that CRISPR-Cas9 can be used in *N. tabacum* BY-2 cells.

*Edited by: Kazuhito Fujiyama, Osaka University, Japan*

#### *Reviewed by:*

*Ezequiel Matias Lentz, ETH Zurich, Switzerland Basavaprabhu L. Patil, Indian Council of Agricultural research–National Research Centre on Plant Biotechnology, India*

> *\*Correspondence: Marc Boutry marc.boutry@uclouvain.be*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 15 October 2015 Accepted: 11 January 2016 Published: 01 February 2016*

#### *Citation:*

*Mercx S, Tollet J, Magy B, Navarre C and Boutry M (2016) Gene Inactivation by CRISPR-Cas9 in Nicotiana tabacum BY-2 Suspension Cells. Front. Plant Sci. 7:40. doi: 10.3389/fpls.2016.00040*

Keywords: CRISPR, Cas9, plants, suspension cells, gene inactivation, gene targeting

# INTRODUCTION

*Nicotiana tabacum* cv. Bright yellow 2 (BY-2) suspension cells have been used as a model in a wide range of cellular and physiological studies, such as cell cycle regulation or the effects of environmental conditions and stresses on plant physiology (Koukalova et al., 1989; Shaul et al., 1996; Fiserova et al., 2006). Moreover, BY-2 cells have been shown to be able to produce recombinant proteins, and thus represent an alternative host in the molecular farming field (De Muynck et al., 2010; Schillberg et al., 2013). Typically, cell cultures are grown in contained bioreactors and thus share with animal and microbial cultures the advantage of tight process control. Recombinant proteins, such as antibodies, can be secreted to the extracellular medium, allowing for a purification step much simpler than if they were retained in the cell. The downstream processing costs are thus lower compared to the whole plant system. However, improvements still have to be made regarding the quality and the production rate. For instance, the humanization of glycosylation is required for glycosylated proteins used in therapy. Inactivation of extracellular proteases might be crucial to avoid degradation of recombinant proteins. The possibility of modifying the expression of genes is also an important tool for more basic projects, e.g., aimed at deciphering molecular aspects of a plant cell. These applied or basic targets can be best achieved by genetic engineering tools resulting in either gene silencing or gene inactivation.

Gene silencing by RNA interference has been widely used in plants as well as in plant cells. This approach suffers from the fact that gene silencing is rarely complete and might not be stable over time. From this point of view, gene inactivation by mutation or deletion is more effective but, except for the collections of knock-out lines in *Arabidopsis*, has rarely been implemented in plants because there was no simple method available. Recently, the CRISPR (clustered regularly interspaced short palindromic repeat)/Cas9 (CRISPR-associated) system has been successfully used in a wide range of plant species (Bortesi and Fischer, 2015) but not in plant suspension cells. The system is based on a short guide RNA (sgRNA) which associates with the Cas9 endonuclease to create a double-stranded break (DSB) in the target genomic DNA. As a consequence, mutations are generated through either error-prone non-homologous end-joining (NHEJ) or homology-directed repair (HDR) of the intended cleavage site. NHEJ has been used to generate mutagenic insertions/deletions often leading to gene inactivation. In this report, we tested the CRISPR/Cas9 system in *N. tabacum* BY-2 cells. As a proof of concept, we demonstrate the feasibility of using this system to inactivate a reporter gene (*mCherry*) that had been introduced in the genome.
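The link between NHEJ-derived insertions/deletions and gene inactivation can be illustrated with a minimal sketch. It assumes only the general rule that a net length change not divisible by three shifts the reading frame of a coding sequence; the function name and the sequence lengths are hypothetical, and in-frame changes may of course still impair a protein.

```python
def classify_indel(ref_len, mutant_len):
    """Classify an indel in a coding sequence by its net length change (bp)."""
    delta = mutant_len - ref_len
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # A net change that is not a multiple of 3 shifts the reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{abs(delta)} bp {kind}, {frame}"


# Hypothetical 600 bp coding sequence, used only for illustration.
print(classify_indel(600, 599))  # single-base deletion
print(classify_indel(600, 597))  # 3 bp deletion
```

Under this rule, a single-nucleotide deletion at a target site would be expected to shift the frame of the downstream coding sequence.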

# MATERIALS AND METHODS

#### Plant Cell Cultures

*Nicotiana tabacum* cv. BY-2 (Nagata et al., 1992) suspension cells were grown in the dark at 25°C with agitation on a rotary shaker (90 rpm) in liquid MS medium [4.4 g/L Murashige and Skoog salts (MP BIOMEDICALS, Solon, OH), 30 g/L sucrose, 0.2 g/L KH2PO4, 2.5 mg/L thiamine, 50 mg/ml myo-inositol, and 0.2 mg/L 2,4-D, pH 5.8 (KOH)].

Cultures were grown in 50 mL of medium in a 250 mL Erlenmeyer flask and a 5% inoculum was transferred each week into fresh medium. Transformed cells were grown on solid medium supplemented with 20 µg/mL of bialaphos.

#### Generation of the SC6 Transgenic Line

A cDNA coding for the monomeric fluorescent protein mCherry (Shaner et al., 2004), controlled by the double-enhanced cauliflower mosaic virus 35S promoter, was inserted into the pPZP-RCS2-nptII-HIgG2-LoBM2 vector (Magy et al., 2014), which contains an expression cassette for the monoclonal antibody Lo-BM2 (**Figure 1**). The binary vector was transferred into *Agrobacterium tumefaciens* LBA4404virG (van der Fits et al., 2000) by electroporation. *A. tumefaciens* was cultured for 16 h in 50 ml of culture medium supplemented with antibiotics (20 µg/ml rifampicin, 40 µg/ml gentamicin, 50 µg/ml kanamycin) as well as 100 µM acetosyringone. After harvesting by centrifugation at 3,500 *g*, the bacteria were resuspended at an O.D. of 1 (600 nm) in MS medium supplemented with 100 µM acetosyringone and 10 mM MgSO4. The suspension was incubated for 3 h at room temperature with shaking, and then 40 ml were mixed with 4 ml of a 6-day-old BY-2 cell culture and poured onto solid MS medium containing 100 µM acetosyringone. After 2 days at 25°C, the cells were washed, cultured in liquid MS medium for 3 days, and plated on solid MS medium supplemented with 500 µg/ml cefotaxime, 400 µg/ml carbenicillin, and 100 µg/ml kanamycin. A line displaying mCherry fluorescence (SC6) was chosen as a target for further work.
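The resuspension step above (bringing the bacteria to an O.D. of 1 at 600 nm) amounts to a simple proportionality, sketched below. The sketch assumes that optical density scales linearly with cell density and that all cells are recovered in the pellet; the function name and the example reading of 1.6 are hypothetical and not taken from the study.

```python
def resuspension_volume(od_measured, culture_volume_ml, od_target=1.0):
    """Volume (ml) in which to resuspend the pellet from `culture_volume_ml`
    of culture so that the suspension reaches `od_target`.

    Assumes OD600 is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if od_measured <= 0 or od_target <= 0:
        raise ValueError("optical densities must be positive")
    return od_measured * culture_volume_ml / od_target


# Hypothetical example: a 50 ml overnight culture that reads OD600 = 1.6.
print(resuspension_volume(1.6, 50))
```

In practice the linear range of the spectrophotometer would need to be respected, typically by diluting dense cultures before reading.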

#### Cas9 and sgRNA Plasmid Construction and Plant Cell Transformation

pFGC-pcoCas9 was a gift from Jen Sheen (Addgene plasmid # 52256). The three sgRNAs were constructed by overlapping PCR, inserted into the pGEM-T-easy vector (Promega), sequenced, and then introduced into either the AscI (sgRNA1), PacI (sgRNA2), or SbfI (sgRNA3) cloning sites of the pFGC-pcoCas9 binary vector (Li et al., 2013). The vector was transferred into *A. tumefaciens* LBA4404virG (van der Fits et al., 2000) by electroporation. Transformation of *N. tabacum* BY-2 cells was performed as indicated above except that the selection was on 20 µg/ml bialaphos.

#### RNA Extraction and RT-PCR

BY-2 cells (100 mg) were collected 3 days after co-cultivation with *A. tumefaciens*, frozen in liquid nitrogen, and ground to a powder. RNA was then extracted with the Spectrum™ Plant Total RNA Kit (Sigma–Aldrich). cDNA was synthesized by mixing 0.2 µg RNA, 10 µM oligo dT, 10 µM oligo gRNA-mCherry-RT-R (hybridizing to sgRNAs), and H2O up to 15 µl. After incubation for 5 min at 70°C, the sample was placed on ice and 5 µl 5x M-MLV buffer (Promega), 1.25 µl dNTPs (10 mM), 25 U RNase inhibitor, 200 U M-MLV RT (Promega), and 16.75 µl H2O were added. The sample was incubated for 1 h at 42°C. PCR was performed according to the GoTaq® DNA Polymerase protocol (Promega).

#### Analysis of Genome Modification

Genomic DNA was extracted from stable transformants obtained from the bialaphos selection and from non-transformed SC6 cells. PCR was performed using primers flanking *mCherry* (**Table 1**) and the amplified fragments were electrophoresed on an ethidium bromide-stained agarose gel (2%). For the restriction fragment length polymorphism (RFLP) analysis, the PCR product was digested with the corresponding enzymes chosen for each target site and electrophoresed on an ethidium bromide-stained agarose gel (2%). For further characterization the bands were purified and sequenced.

[…] copies of the CaMV 35S enhancer; *HC*, heavy chain; *LC*, light chain; T, nopaline synthase polyadenylation sequence (tNOS); aaDa, resistance gene to the aminoglycosides spectinomycin and streptomycin.

# TABLE 1 | Primers used in this study.

#### RESULTS

#### Design of the CRISPR/Cas9 System Targeting mCherry in Nicotiana tabacum BY-2 Cells

To test the potential of CRISPR/Cas9 to generate a gene knockout in *N. tabacum* BY-2 cells we obtained a transgenic line (SC6) containing a reporter gene expressing the mCherry fluorescent protein. This line was obtained after transformation of BY-2 cells with a vector previously used to express an antibody (Magy et al., 2014) in which we inserted the *mCherry* gene (**Figure 1**). A transgenic line, SC6, was chosen for its high mCherry expression. Three regions of *mCherry* were targeted by three sgRNAs (**Figure 2A**). We selected the target sites for the presence of a restriction site to facilitate the identification of mutations by an RFLP assay. We expected short INDELs at different target sites but also deletions between two target sites if a break occurs at two sites simultaneously (**Figure 2B**). We constructed the pFGC-Cas9-sgRNA1-2-3 binary vector containing an *Arabidopsis* codon-optimized version of Cas9 controlled by the 35S-PPDK promoter, a *bar* gene for selection, and three sgRNAs (controlled by the U6 promoter) targeting *mCherry* (**Figure 2C**). The sgRNAs of each target site were generated by overlapping PCR and cloned into the pFGC-Cas9 vector. The SC6 cell line was transformed with *A. tumefaciens* carrying pFGC-Cas9-sgRNA1-2-3. Three days after transformation, we sought to determine whether the genes for Cas9 and the sgRNAs were expressed. At that stage, no selection had been applied and transient expression was checked in the whole cell suspension by RT-PCR analysis. Transcripts for both *cas9* and sgRNA were identified (**Figure 2D**). Afterward, the cell suspension was spread on a selection medium containing bialaphos to isolate transformed calli.
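The selection criterion described above (target sites that contain a restriction site, so that mutations can be detected by RFLP) can be sketched as a small sequence scan. This is an illustrative sketch only, not the authors' procedure: it assumes an SpCas9-type 20-nt protospacer followed by an NGG PAM with a blunt cut 3 bp upstream of the PAM, scans only the strand given, and uses a hypothetical toy sequence. ApaLI (GTGCAC) is the enzyme named in the text for target 1; the other enzymes in the table are arbitrary examples.

```python
import re

# Recognition sequences of a few restriction enzymes (all palindromic).
ENZYMES = {"ApaLI": "GTGCAC", "PstI": "CTGCAG", "EcoRI": "GAATTC"}


def find_rflp_targets(seq, enzymes=ENZYMES):
    """Return (start, protospacer, enzyme) for every 20-nt + NGG target on the
    given strand whose predicted cut site lies inside a restriction site."""
    seq = seq.upper()
    hits = []
    # The lookahead lets overlapping candidate targets all be reported.
    for m in re.finditer(r"(?=([ACGT]{20})[ACGT]GG)", seq):
        start = m.start()
        cut = start + 17  # break between protospacer bases 17 and 18
        for name, site in enzymes.items():
            for s in re.finditer(f"(?={site})", seq):
                # Require the break to fall strictly inside the site, so that
                # an indel at the break would destroy the recognition sequence.
                if s.start() < cut < s.start() + len(site):
                    hits.append((start, m.group(1), name))
    return hits


# Hypothetical toy sequence: an ApaLI site spanning the predicted cut site.
toy = "ACGTACGTACGTACGTGCACTGG"
print(find_rflp_targets(toy))
```

A mutated target would then yield a PCR product that resists digestion by the corresponding enzyme, which is the readout used in the RFLP assay.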

integrated in the vector downstream of the U6 transcription promoter. A *bar* gene permits the selection of transformants. (D) Expression analysis by RT-PCR of *Cas9* and the sgRNAs 3 days post-transformation with *A. tumefaciens* carrying the pFGC-Cas9-sgRNA1-2-3 vector. As a control, WT BY-2 cells were transformed with a vector (pPZP-mCherry) expressing mCherry.

# CRISPR/Cas9 Induces INDELs and Fragment Deletion in *mCherry*

Random transformants were transferred twice onto fresh bialaphos selection medium and then checked for mCherry fluorescence. Loss of fluorescence was observed in 19 out of 21 transformants. This loss was partial (chimeras) for 15 lines and complete for the other four (**Figures 3A,B**). To confirm that the loss of fluorescence was due to mutations in *mCherry*, genomic DNA was extracted and the target region was amplified by PCR (**Figure 3C**). Seven out of 20 lines tested showed, in addition to the undeleted fragment, shortened fragments, which correspond to deletions between two target sites. To determine whether INDELs occurred within the full-size fragments, RFLP analysis was performed directly on the PCR products of the first nine clones (**Figure 3D**). Three out of nine lines exhibited a band partially resistant to ApaLI digestion, indicating the creation of short INDELs in target 1. Five out of the nine lines were mutated in target 2, and five in target 3. CRISPR-Cas9 together with the three sgRNAs thus created mutations in all the lines except for line 15, in which a loss of fluorescence was not observed. However, the lines were not homogeneous for the mutations. Instead, for most of them, a mix of INDELs at two or three targets was identified. To further characterize mutations, four PCR fragments (three with a shorter size and one with a full size) were cloned and sequenced (**Figure 4**). The full-size fragment displayed a single base deletion in target 3. One shortened fragment resulted from a deletion between targets 2 and 3, and the last two fragments, from a deletion between targets 1 and 3.

#### FIGURE 3 | Analysis of genome editing at the *mCherry* locus.

Cells of the SC6 line were transformed with pFGC-Cas9-sgRNA1-2-3 targeting *mCherry*. Transformants appeared after 3 weeks on bialaphos selection medium and were transferred twice onto new selection medium. (A) Picture of the calli under visible light (left) and fluorescence of mCherry (right). (B) Close-up of four calli: lines 15 (homogeneous mCherry fluorescence), 18 (no mCherry fluorescence), and 6 and 20 (heterogeneous mCherry fluorescence). (C) Genome editing in transformed calli was monitored by PCR amplification of *mCherry*. Deletion between two target sites occurred in seven out of 20 transformants (lines 1, 2, 8, 9, 17, 18, 19). pPZP: amplification of *mCherry* directly from the plasmid pPZP-mCherry. (D) Genotyping of nine transformants by RFLP analysis. The PCR fragments from lines 1 to 9 displayed in (C) were subjected to digestion by the indicated restriction enzymes.
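The reasoning behind the PCR screen (a deletion between two target sites gives a predictably shorter amplicon) can be made explicit with a short calculation. The sketch below assumes precise rejoining of the two blunt ends with no additional bases gained or lost; the cut-site coordinates and the amplicon length are hypothetical, since the actual positions depend on the primers in Table 1 and are not given in the text.

```python
from itertools import combinations


def deletion_amplicons(amplicon_len, cut_sites):
    """Map each pair of cut sites to the amplicon length expected when the
    intervening sequence is deleted and the ends are rejoined precisely."""
    ordered = sorted(cut_sites.items(), key=lambda kv: kv[1])
    sizes = {}
    for (name_a, pos_a), (name_b, pos_b) in combinations(ordered, 2):
        sizes[(name_a, name_b)] = amplicon_len - (pos_b - pos_a)
    return sizes


# Hypothetical coordinates (bp from the start of the PCR product).
cuts = {"target1": 150, "target2": 400, "target3": 650}
for pair, size in deletion_amplicons(900, cuts).items():
    print(pair, size)
```

Comparing the observed band sizes with such predictions indicates which pair of targets was cut, which sequencing of the cloned fragments can then confirm.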

#### DISCUSSION

In the present study, we showed that CRISPR/Cas9 can be used as a powerful tool for engineering the *N. tabacum* BY-2 genome. Previous studies showed the possibility to induce mutations in the genome of various plants or plant protoplasts (Jiang et al., 2013; Li et al., 2013; Gao et al., 2015), but plant suspension cells had not yet been used as a target.

Since the efficiency of CRISPR/Cas9 in *N. tabacum* BY-2 cells was unknown, we targeted the marker gene *mCherry*, the expression of which could be monitored by fluorescence. Selection of non-fluorescent transformants proved unnecessary, however: among the 21 randomly chosen transformants, four had completely lost their mCherry fluorescence. Fifteen chimeric calli were also observed, which indicates that Cas9 had not yet provoked any genomic break in some of their cells. One explanation is that the initial transformant was actually a mix of two transformation events, one of which did not lead to the expression of Cas9 and/or sgRNAs, either because of incomplete transfer of the T-DNA or because of a position effect. Another explanation is that, after transformation, a first target site was hit only after the first cell division, resulting in a chimeric callus.

On the whole, RFLP analysis indicated that all three *mCherry* targets were hit. In addition, the deletion of a fragment between two targets was also observed, which shows that in some cases, Cas9 had hit two targets before non-homologous end-joining repair had occurred at each site. This observation suggests that homology-directed repair might be feasible in suspension cells. When examining individual transformants, heterogeneity was usually observed, with a mix of hits at one, two, or three targets. No case was found where the three sites were homogeneously mutated, and only one case out of nine was found where a single site was homogeneously mutated (line 6 at target 3). As hypothesized above, this suggests that after transformation with *A. tumefaciens*, cell division of the transformed cells usually occurred before a target was mutated by Cas9. In this case, mutations appear over time as the callus grows, resulting in chimeric clones. Homogeneous mutant lines can nevertheless be obtained by subcloning. When too diluted, plant suspension cells do not grow, which prevents isolated colonies from being obtained. An alternative consists of mixing transformed cells with an excess of wild-type feeder cells and selecting isolated calli on a selection medium (Kirchhoff et al., 2012).

#### CONCLUSION

Targeting three sites of a gene of interest, we showed that it is possible to use CRISPR/Cas9 to knock-out this gene. Besides single mutations at one of the targets, double breaks and deletion of the intervening sequence were also observed, opening the way to homologous recombination by homology-directed repair.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

SM performed most of the research, analyzed the data and wrote the manuscript. JT and BM obtained the SC6 transgenic line. CN supervised the research and wrote the manuscript. MB conceived and designed the project and wrote the manuscript.

#### FUNDING

This work was supported in part by grants from the Service Public de Wallonie (WBHealth and FIRST Spin-off), the Belgian National Fund for Scientific Research and the Interuniversity Poles of Attraction Program (Belgium). SM and JT are recipients of a fellowship from the Fonds pour la Formation à la Recherche dans l'Industrie et l'Agriculture (Belgium).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Mercx, Tollet, Magy, Navarre and Boutry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Transient Glyco-Engineering to Produce Recombinant IgA1 with Defined *N*- and *O*-Glycans in Plants

*Martina Dicker1, Marc Tschofen1, Daniel Maresch2, Julia König1, Paloma Juarez3, Diego Orzaez3, Friedrich Altmann2, Herta Steinkellner1 and Richard Strasser1\**

*<sup>1</sup> Department of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences, Vienna, Austria, <sup>2</sup> Department of Chemistry, University of Natural Resources and Life Sciences, Vienna, Austria, <sup>3</sup> Institute of Molecular and Cellular Plant Biology, Spanish Research Council Agency – Polytechnic University of Valencia, Valencia, Spain*

The production of therapeutic antibodies to combat pathogens and treat diseases such as cancer is of great interest for the biotechnology industry. The recent development of plant-based expression systems has demonstrated that plants are well-suited for the production of recombinant monoclonal antibodies with defined glycosylation. Compared to immunoglobulin G (IgG), less effort has been undertaken to express immunoglobulin A (IgA), which is the most prevalent antibody class at mucosal sites and a promising candidate for novel recombinant biopharmaceuticals with enhanced anti-tumor activity. Here, we transiently expressed recombinant human IgA1 against the VP8\* rotavirus antigen in glyco-engineered ΔXT/FT *Nicotiana benthamiana* plants. Mass spectrometric analysis of IgA1 glycopeptides revealed the presence of complex biantennary *N*-glycans with terminal *N*-acetylglucosamine on the *N*-glycosylation site of the CH2 domain in the IgA1 alpha chain. Analysis of the peptide carrying nine potential *O*-glycosylation sites in the IgA1 alpha chain hinge region showed the presence of plant-specific modifications including hydroxyproline formation and the attachment of pentoses. By co-expression of enzymes required for initiation and elongation of human *O*-glycosylation it was possible to generate disialylated mucin-type core 1 *O*-glycans on plant-produced IgA1. Our data demonstrate that ΔXT/FT *N. benthamiana* plants can be engineered toward the production of recombinant IgA1 with defined human-type *N*- and *O*-linked glycans.

Keywords: monomeric IgA, antibody, protein glycosylation, *N*-glycosylation, *O*-glycosylation, glyco-engineering, recombinant glycoprotein, plant-made pharmaceuticals

*Edited by: Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

*Reviewed by: Martine Gonneau, Institut National de la Recherche Agronomique, France; Anna Maria Salzano, National Research Council, Italy*

*\*Correspondence: Richard Strasser richard.strasser@boku.ac.at*

*Specialty section: This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 09 October 2015 Accepted: 08 January 2016 Published: 29 January 2016*

*Citation: Dicker M, Tschofen M, Maresch D, König J, Juarez P, Orzaez D, Altmann F, Steinkellner H and Strasser R (2016) Transient Glyco-Engineering to Produce Recombinant IgA1 with Defined N- and O-Glycans in Plants. Front. Plant Sci. 7:18. doi: 10.3389/fpls.2016.00018*

# INTRODUCTION

Therapeutic antibodies are an increasingly important class of recombinant biopharmaceuticals for treatment of cancer or infectious diseases. Currently the majority of antibody-based therapeutics on the market or in clinical trials are monoclonal and of the IgG subtype. Immunoglobulin A (IgA), the most prevalent antibody class at mucosal sites in the human body, is gaining attention as a therapeutic agent for treatment of infections and cancer (Bakema and van Egmond, 2011; Boross et al., 2013; Reinhart and Kunert, 2015). However, the potential of recombinant IgAs as therapeutic antibodies is still not fully explored, owing to the fact that robust recombinant production is challenging and that IgAs are extensively glycosylated proteins. IgGs contain a single *N*-glycosylation site in the heavy chain Fc region, which is heterogeneously glycosylated. Importantly, distinct IgG *N*-glycan structures can affect the antibody activity in therapeutic settings through the modulation of Fc receptor binding on different immune cells (Jefferis, 2009; Ferrara et al., 2011; Lin et al., 2015). Recent advances in glyco-engineering in diverse expression hosts allow the recombinant generation of IgG glycoforms for structure-function studies and comparison of efficacy (Strasser et al., 2014). The first glyco-engineered IgG-based monoclonal antibodies have already been approved for therapeutic applications (Beck and Reichert, 2012; Ratner, 2014; Goede et al., 2015).

**Abbreviations:** GalNAc-T2, polypeptide-*N*-acetylgalactosaminyltransferase 2.

Despite the recognized importance of *N*-glycosylation for IgG function, comparatively little is known about the role of glycosylation in the biophysical and immunological properties of IgA or about the *in vivo* role of different IgA glycoforms. The two IgA isotypes (IgA1 and IgA2) carry two to five *N*-glycosylation sites on the alpha chain, and the hinge region of IgA1 is modified with several mucin-type *O*-glycans (Royle et al., 2003; Deshpande et al., 2010; Huang et al., 2015) (**Figure 1A**). Aberrantly *O*-glycosylated IgA1 is involved in the pathogenesis of IgA nephropathy in humans (Novak et al., 2012). In this autoimmune disease the galactose-deficient *O*-glycans in the IgA1 hinge region are recognized by circulating autoantibodies, resulting in the formation of immune complexes, followed by aggregation or deposition, which is a major cause of renal failure. Moreover, the joining (J) chain in the dimeric IgA variant contains a single *N*-glycan and the secretory component (SC) in the secretory IgA (sIgA) variant is heavily *N*-glycosylated (Huang et al., 2015). Hence, the generation of recombinant IgA variants bearing homogeneous and well-defined glycans is highly challenging. Nonetheless, such glyco-engineering approaches are imperative to study the contribution of *N*- and *O*-glycosylation to IgA function. Furthermore, in the case of therapeutic applications, abnormal glycosylation such as galactose-deficient IgA1 *O*-glycan variants should be avoided to reduce the risk of adverse side effects like the formation of anti-glycan antibodies (Suzuki et al., 2015).

Plants are considered attractive hosts for the production of recombinant biopharmaceuticals. For example, a phase I clinical trial of the tobacco-derived HIV-neutralizing antibody 2G12 has recently been completed (Ma et al., 2015). The tobacco-related species *Nicotiana benthamiana* has emerged as a promising host for the expression of recombinant glycoproteins with tailor-made *N*- and *O*-glycan modifications (Strasser et al., 2014). IgG variants with different types of customized *N*-glycans have been successfully generated in this expression platform (Strasser et al., 2008, 2009) and the ZMapp™ monoclonal IgG antibody cocktail for treatment of Ebola infections is produced in glyco-engineered ΔXT/FT *N. benthamiana* (Castilho et al., 2011; Qiu et al., 2014). In ΔXT/FT *N. benthamiana*, the expression of the β1,2-xylosyltransferase (XT) and core α1,3-fucosyltransferase (FT) has been downregulated by an RNAi approach (Strasser et al., 2008). In addition, mucin-type *O*-glycosylation has been generated on *N. benthamiana*-produced mucin-derived peptides and on recombinant erythropoietin fused to Fc (Castilho et al., 2012; Yang et al., 2012).

Stable expression of a recombinant IgA (CaroRx™) to prevent dental caries was initially performed in *Nicotiana tabacum* plants (Ma et al., 1995). This pioneering work demonstrated the potential of plants for the production of functional recombinant sIgA. More recently, the production of IgA variants in different plant species has been reported, but only limited data are available on the glycosylation of recombinant plant-produced IgA variants (Karnoup et al., 2005; Paul et al., 2014; Westerhof et al., 2014). Moreover, a comparison of *N*- and *O*-glycans and attempts to modulate them *in vivo* have not been described yet. In this study, we investigated the capability of wild-type and glyco-engineered -XT/FT *N. benthamiana* to produce recombinant IgA1 with specific glycans. We transiently expressed recombinant IgA1 against rotavirus (Juárez et al., 2012; Juarez et al., 2013) and analyzed the *N*-glycan composition found in the CH2 domain as well as the *O*-glycan structures in the hinge region of the alpha chain. Our data provide important insights for future strategies aiming at the production of IgA1 variants with customized glycosylation and enhanced therapeutic effectiveness.

## MATERIALS AND METHODS

#### Cloning and Expression Vectors

The human *N*-acetylglucosaminyltransferase II (GnTII) coding sequence was amplified by PCR from human cDNA (Mucha et al., 2002) with the primers Hs-GnTII1F (5′-TATATCTAGAATGAGGTTCCGCATCTACAAACG-3′) and Hs-GnTII2R (5′-tataGGATCCTCACTGCAGTCTTCTATAACTTT-3′). The PCR product was digested with XbaI/BamHI and ligated into XbaI/BamHI-digested vector pPT2M (Strasser et al., 2005) to generate pPT2M-GnTII. Binary expression vectors for mucin-type *O*-glycan formation and for CMP-sialic acid biosynthesis and Golgi transport were available from previous studies (Castilho et al., 2010, 2012). The multigene expression vector encoding the alpha chain (αC), the lambda light chain (λC), the human SC and the human joining chain (JC) was described in detail recently (Juarez et al., 2013).
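The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original protocol: it confirms that each primer carries the restriction site used for cloning (XbaI, TCTAGA; BamHI, GGATCC) and that the site is directly followed by the start codon (forward primer) or by the reverse complement of a stop codon (reverse primer). The function names are hypothetical and chosen for illustration only.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"XbaI": "TCTAGA", "BamHI": "GGATCC"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "Hs-GnTII1F": "TATATCTAGAATGAGGTTCCGCATCTACAAACG",
    "Hs-GnTII2R": "tataGGATCCTCACTGCAGTCTTCTATAACTTT",
}


def find_sites(primer):
    """Return {enzyme: 0-based position} for every site found in the primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def codon_after_site(primer, site):
    """Return the three bases that directly follow the restriction site."""
    seq = primer.upper()
    start = seq.find(site) + len(site)
    return seq[start:start + 3]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
    print(codon_after_site(PRIMERS["Hs-GnTII1F"], SITES["XbaI"]))
    print(reverse_complement(codon_after_site(PRIMERS["Hs-GnTII2R"], SITES["BamHI"])))
```

The lowercase `tata` in the reverse primer is a 5′ overhang that is not part of the restriction site, so the script upper-cases sequences before searching.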

#### Plant Material and Transient Protein Expression

*Nicotiana benthamiana* -XT/FT plants, which have strongly reduced expression of β1,2-XT and core α1,3-FT (Strasser et al., 2008), were grown in a growth chamber at 24 °C with a 16 h light/8 h dark photoperiod. Five-week-old plants were used for syringe-mediated agroinfiltration into leaves as described previously (Strasser et al., 2008). The recombinant sIgA1 was either expressed alone or co-infiltrated with the vectors encoding the proteins for *N*-glycan modification or mucin-type *O*-glycosylation (OD600 of 0.2 for all agrobacteria containing sIgA1 vectors and OD600 of 0.05 for all constructs involved in glycan modifications).

FIGURE 1 (caption fragment) | … terminator sequence; JC, joining chain; αC, alpha chain; λC, lambda light chain. (C) GnTII: human *N*-acetylglucosaminyltransferase II used for *N*-glycan engineering. (D) Enzymes for sialylated core 1 formation: GalNAc-T2, human polypeptide GalNAc-transferase 2; C1GalT1, *Drosophila melanogaster* core 1 β1,3-galactosyltransferase; ST6GalNAc, *Mus musculus* α2,6-sialyltransferase III/IV; ST3Gal-I, human α2,3-sialyltransferase I.

#### Protein Extraction and Purification

To purify sIgA1, 40–50 g of leaf material was harvested four days post infiltration, frozen in liquid nitrogen and disrupted using a mixer mill. The homogenized leaf material was suspended in 2 ml extraction buffer (0.1 M Tris, 0.5 M NaCl, 1 mM EDTA, 40 mM ascorbic acid, pH 6.8) per g of plant material and incubated for 30 min at 4 °C. The extract was centrifuged at 27000 × *g* for 30 min at 4 °C, passed through a filter with a pore size of 12–8 μm and centrifuged again. To clear the extract, it was run through filters with pore sizes of 12–8 μm, 3–2 μm, 0.45 μm, and 0.22 μm. A chromatography column was packed with 1 ml of SSL7/Agarose (InvivoGen) and washed with 5 ml of PBS. The cleared extract was applied to the column at a flow rate of ∼1 ml/min. Afterwards, the column was washed again with 5 ml PBS and the protein was eluted with 5 ml of 0.1 M glycine pH 2.5. The collected eluate fractions were immediately neutralized to pH 7.0 with 1 M Tris pH 8.0 and the protein content was analyzed using the Micro BCA Protein Assay Kit (Thermo Scientific Pierce) with bovine serum albumin (BSA) as a standard.
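As a worked illustration of the buffer quantities involved, the sketch below scales the extraction buffer to the amount of leaf material (2 ml per g) and converts the stated molar concentrations into weighed masses. The reagent forms (Tris base, EDTA disodium salt dihydrate, free ascorbic acid) and their molar masses are assumptions made for this example; the text does not specify which forms were used, and the masses do not account for pH adjustment to 6.8.

```python
# Molar masses in g/mol. The chosen reagent forms are assumptions for this sketch.
MOLAR_MASS = {
    "Tris base": 121.14,
    "NaCl": 58.44,
    "EDTA disodium dihydrate": 372.24,
    "ascorbic acid": 176.12,
}

# Final concentrations from the text, in mol/l.
FINAL_MOLAR = {
    "Tris base": 0.1,
    "NaCl": 0.5,
    "EDTA disodium dihydrate": 0.001,
    "ascorbic acid": 0.040,
}

ML_BUFFER_PER_G_LEAF = 2.0  # 2 ml extraction buffer per g of plant material


def buffer_volume_ml(leaf_mass_g):
    """Volume of extraction buffer needed for a given mass of leaf material."""
    return leaf_mass_g * ML_BUFFER_PER_G_LEAF


def reagent_masses_g(volume_ml):
    """Mass of each reagent (g) needed to prepare the given buffer volume."""
    litres = volume_ml / 1000.0
    return {name: FINAL_MOLAR[name] * MOLAR_MASS[name] * litres for name in FINAL_MOLAR}


if __name__ == "__main__":
    volume = buffer_volume_ml(50)  # upper end of the 40-50 g range
    print(f"{volume:.0f} ml buffer")
    for name, grams in reagent_masses_g(volume).items():
        print(f"{name}: {grams:.3f} g")
```

For 40–50 g of leaves this corresponds to 80–100 ml of buffer per purification.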

To isolate intercellular fluid (IF), infiltrated leaves were carefully detached and submerged in a beaker filled with buffer (0.1 M Tris pH 7.5, 10 mM MgCl2, 2 mM EDTA). The beaker was positioned in a desiccator and vacuum was applied for 2 min. The vacuum-infiltrated leaves were inserted into a 50 ml Falcon tube lined with a fine plain-weave cotton fabric (muslin bandage) to prevent damage to the leaves and centrifuged at 1000 × *g* for 20 min at 4 °C. The IF was collected from the bottom of the tube and either used directly for further analysis or concentrated using micro spin-columns.

#### Immunoblot Analysis and Endoglycosidase Treatment

SDS-PAGE was performed in 8–10% polyacrylamide gels run under reducing or non-reducing conditions. Separated proteins were either detected by Coomassie Brilliant Blue staining or by transfer onto nitrocellulose membranes (Hybond-C, GE Healthcare) and subsequent detection with different antibodies and chemiluminescence-based detection reagents. Detection of the αC was done using a polyclonal goat anti-human alpha chain specific antibody (Sigma–Aldrich), the λC was detected using a rabbit anti-human lambda light chain antibody (Sigma–Aldrich) and the SC was detected using a rabbit anti-human SC antibody (Gentaur).

Crude protein extracts, SSL7-purified sIgA1 or IF fractions were subjected to enzymatic deglycosylation. For endoglycosidase H (Endo H) digestion, 1.5 μl of 10× Glycoprotein Denaturing Buffer (NEB, 5% SDS, 0.4 M DTT) was added to 13.5 μl of sample. This mix was incubated for 10 min at 95 °C. After the sample had cooled down on ice, 2 μl G5 Buffer (NEB), 1 μl Endo H (NEB) and 2 μl ultrapure water were added and the mix was incubated for 60 min at 37 °C. For peptide:*N*-glycosidase F (PNGase F) digestion, 1.5 μl of denaturing buffer was added to 13.5 μl of sample. This mix was incubated for 10 min at 95 °C. After the sample had cooled down on ice, 2 μl G7 Buffer (NEB), 1 μl PNGase F (NEB), and 2 μl NP-40 were added and the mix was incubated for 60 min at 37 °C.

#### *N*- and *O*-Glycan Analysis

To analyze the sIgA1 *N*- and *O*-glycans, purified protein (1–5 μg) was separated by SDS-PAGE under reducing conditions, and polypeptides were detected by Coomassie Brilliant Blue staining. The corresponding band was excised from the gel, followed by *S*-alkylation with iodoacetamide and digestion with sequencing grade modified trypsin (Promega) or a combination of trypsin and endoproteinase Glu-C (Roche). The peptide mixture was analyzed using a Dionex Ultimate 3000 system directly linked to a QTOF instrument (maXis 4G ETD, Bruker) equipped with the standard ESI source in the positive ion, DDA mode (i.e., switching to MS/MS mode for eluting peaks). MS scans were recorded (range: 150–2200 m/z, spectra rate: 1 Hz) and the six highest peaks were selected for fragmentation. Instrument calibration was performed using ESI calibration mixture (Agilent). For separation of the peptides, a Thermo BioBasic C18 separation column (5 μm particle size, 150 × 0.360 mm) was used. A gradient from 97% solvent A and 3% solvent B (solvent A: 65 mM ammonium formate buffer; solvent B: 100% acetonitrile) to 32% B in 45 min was applied, followed by a 15 min gradient from 32% B to 75% B, at a flow rate of 6 μL/min.

The analysis files were converted to XML files using Data Analysis 4.0 (Bruker) and used to perform MS/MS ion searches with MASCOT (embedded in ProteinScape 3.0, Bruker) using the manually annotated and reviewed Swiss-Prot database. Peptide MS/MS data were evaluated against the target sequence using X! Tandem (www.thegpm.org/tandem/) with the following settings: reversed sequences no; check parent ions for charges 1, 2, and 3 yes; models found with peptide log e lower -1 and proteins log e lower -1; residue modifications: oxidation M, W and deamidation N, Q; isotope error was considered; fragment type was set to monoisotopic; refinement was used with standard parameters; fragment mass error of 0.1 Da and ±7 ppm parent mass error; fragment types b and y ions; maximum parent ion charge of 4; missed cleavage sites allowed was set to 2; semi-cleavage yes.
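To make the ±7 ppm parent mass tolerance concrete, the following minimal sketch computes a relative mass error in parts per million and checks it against the tolerance. It only illustrates the arithmetic and is not how X! Tandem implements the check internally; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=7.0):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Illustrative values only: a 0.005 deviation at m/z 1000 is 5 ppm.
    print(round(ppm_error(1000.005, 1000.0), 3))
    print(within_tolerance(1000.005, 1000.0))
    print(within_tolerance(1000.010, 1000.0))
```

At m/z 1000, a 7 ppm window corresponds to an absolute deviation of 0.007, which is well below the 0.1 Da fragment mass error allowed in the same search.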

#### Jacalin Purification

Jacalin/Agarose (InvivoGen) was washed three times with PBS and centrifuged at 1500 × *g* for 4 min. SSL7/Agarose-purified IgA1 was diluted with PBS, added to the washed Jacalin/Agarose and incubated for 1.5 h at 4 °C with slow inversion. After incubation, the mix was centrifuged at 3220 × *g* for 10 min, the supernatant was removed and the Jacalin/Agarose was transferred to a spin column. The agarose was washed three times with 500 μl PBS, each wash followed by centrifugation at 1500 × *g* for 1 min. The bound protein was eluted by the addition of 50 μl elution buffer containing 0.1 M α-D-galactose in PBS and subjected to SDS-PAGE and immunoblotting.

## RESULTS

#### Transient Expression of Recombinant sIgA1 in *N. benthamiana* Wild-Type and Glyco-Engineered -XT/FT Plants

Recombinant sIgA1 against rotavirus was transiently expressed via agroinfiltration in *N. benthamiana* wild-type and glyco-engineered -XT/FT plants using the previously described GoldenBraid multigene expression system (Juarez et al., 2013). To obtain efficient co-expression of all proteins, the four transcriptional units encoding the IgA1 alpha chain (αC), the lambda light chain (λC), the human SC, and the human JC were expressed from a single vector (**Figure 1B**). Leaf extracts were prepared three days after infiltration and subjected to SDS-PAGE and immunoblotting. Bands corresponding to the expected size of the alpha chain (∼55 kDa), light chain (∼23 kDa), and the SC (∼68 kDa) were found (**Figure 2A**). By contrast, the JC could not be detected on immunoblots (data not shown). The anti-alpha chain antiserum used reacted not only with the alpha chain, but also with the SC. SDS-PAGE and immunoblotting under non-reducing conditions revealed the presence of presumably monomeric IgA1 variants (signals larger than 130 kDa and co-migrating with the lower bands of the standard) in the total protein extract. Additional bands at ∼70 and ∼130 kDa were also detected with the antibody against the SC and very likely represent free monomeric and dimeric SC. Higher molecular weight complexes resembling dimeric IgA1 or sIgA1 (compare with the top bands of the standard) were hardly detectable in any of the tested protein fractions (**Figure 2B**). Importantly, these observations were similar in wild-type and glyco-engineered -XT/FT plants.

To further characterize the IgA variants, we purified them from leaves by binding to SSL7-agarose and investigated the presence of different IgA chains in the IF. The SSL7-purified protein consisted mainly of the alpha chain and the light chain (**Figure 2C**); no additional band corresponding to the J chain (∼17 kDa) was detected. The IF displayed the SC as the predominant sIgA1-derived protein band (**Figure 2A**). In addition to the band corresponding to the SC, a faint unidentified band (slightly larger than 100 kDa) was also detected with the anti-SC antibody. Interestingly, the alpha chain was not found in the IF, suggesting that the monomeric IgA1 is not secreted to the apoplast. Together, these findings indicate that the expressed sIgA1 is not efficiently assembled under the conditions used and that unassembled SC and some unincorporated lambda light chain are secreted to the apoplast.

#### Characterization of *N*-Glycosylation Status of the Alpha Chain and Secretory Component

Next, we examined the glycosylation status of the alpha chain and the SC by endoglycosidase digestions and subsequent SDS-PAGE and immunoblotting. Extracts from infiltrated wild-type and -XT/FT leaves were digested with Endo H and PNGase F to distinguish between oligomannosidic (Endo H and PNGase F sensitive), core fucose-free complex (Endo H resistant, PNGase F sensitive), and core fucose-containing *N*-glycans (insensitive to both enzymes). While Endo H digestion of the alpha chain did not result in any mobility shift, a small shift was observed in the PNGase F digested samples (**Figure 3A**). The shift was comparable in wild-type and -XT/FT extracts indicating the presence of core fucose-free complex *N*-glycans on the IgA1 alpha chain. This result was confirmed by digestion of the SSL7-purified protein samples (**Figure 3B**). IF and leaf extracts were also analyzed for SC *N*-glycosylation. In wild-type, a small mobility shift was visible upon PNGase F digestion and immunoblotting in both the IF and the extract. By contrast, in -XT/FT the mobility shift was much larger indicating that the majority of *N*-glycans of the SC are core fucose-containing complex *N*-glycans (**Figure 3C**).
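The decision logic behind the endoglycosidase assay can be written down compactly. The sketch below is an illustration of the interpretation scheme stated above, not an analysis tool used in the study; the function name is hypothetical, and the caveat that a protein without any *N*-glycan would also show no shift is added here as an assumption of the example.

```python
def classify_n_glycan(endo_h_shift, pngase_f_shift):
    """Interpret SDS-PAGE mobility shifts after Endo H and PNGase F digestion.

    Each argument is True if the digestion produced a mobility shift.
    """
    if endo_h_shift and pngase_f_shift:
        return "oligomannosidic"
    if pngase_f_shift:
        # Endo H resistant, PNGase F sensitive
        return "complex, core fucose-free"
    if not endo_h_shift:
        # Insensitive to both enzymes; an unglycosylated protein looks the same.
        return "complex, core fucose-containing (or not N-glycosylated)"
    # Endo H shift without PNGase F shift is not expected in this scheme.
    return "inconsistent result"


if __name__ == "__main__":
    # Alpha chain in the text: no Endo H shift, small PNGase F shift.
    print(classify_n_glycan(endo_h_shift=False, pngase_f_shift=True))
```

Because a band on a blot reports on all glycans of a protein at once, a partial shift (as seen for the SC in wild-type) is best read as a mixture of these classes rather than a single assignment.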

To determine the *N*-glycan composition in more detail, IgA1 was purified via SSL7-agarose and subjected to SDS-PAGE and Coomassie blue staining. The band corresponding to the alpha chain was excised and trypsin digested, and the peptides were analyzed by LC-ESI-MS. The glycopeptide corresponding to the single *N*-glycosylation site in the CH2 domain was identified and found to harbor a single dominant peak (**Figure 4A**). The mass of this peak corresponds to a glycopeptide with a complex *N*-glycan furnished with a single terminal GlcNAc residue and a single pentose, presumably β1,2-linked xylose. Other peaks were reminiscent of a truncated glycan lacking terminal GlcNAc (MMX: Man3XylGlcNAc2) and different oligomannosidic (Man6 to Man9: Man6GlcNAc2 to Man9GlcNAc2) *N*-glycans. Fully processed complex *N*-glycans were only found in very low amounts (e.g., GnGnX: GlcNAc2Man3XylGlcNAc2; **Figure 4A**). Consistent with the PNGase F digestion, no fucose-containing peaks could be detected on the CH2 domain glycopeptide in *N. benthamiana* wild-type. The *N*-glycan profile from the -XT/FT-derived CH2 domain showed the incompletely processed MGn (GlcNAcMan3GlcNAc2) structure as the major peak and lower amounts of peaks corresponding to truncated (MM: Man3GlcNAc2), complex (GnGn: GlcNAc2Man3GlcNAc2) and oligomannosidic (Man6 to Man9) *N*-glycans. As expected, glycans with β1,2-xylose and core α1,3-fucose were not found in the glyco-engineered -XT/FT line.
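Glycoform assignments of this kind rest on the mass that each glycan adds to the peptide. The sketch below computes these mass increments from monosaccharide compositions using standard monoisotopic residue masses. It is an illustration only: the compositions are written out from the abbreviations in the text, the name MGnX for the dominant wild-type structure (one terminal GlcNAc plus one pentose) is inferred from that naming scheme, and the peptide mass itself is not included.

```python
# Monoisotopic masses (Da) of monosaccharide residues within a glycan.
RESIDUE_MASS = {
    "Hex": 162.052824,     # e.g. mannose
    "HexNAc": 203.079373,  # e.g. GlcNAc
    "Pent": 132.042259,    # e.g. xylose
    "dHex": 146.057909,    # e.g. fucose
}

# Compositions derived from the abbreviations used in the text.
GLYCANS = {
    "MM": {"Hex": 3, "HexNAc": 2},
    "MMX": {"Hex": 3, "HexNAc": 2, "Pent": 1},
    "MGn": {"Hex": 3, "HexNAc": 3},
    "MGnX": {"Hex": 3, "HexNAc": 3, "Pent": 1},
    "GnGn": {"Hex": 3, "HexNAc": 4},
    "GnGnX": {"Hex": 3, "HexNAc": 4, "Pent": 1},
    "GnGnXF": {"Hex": 3, "HexNAc": 4, "Pent": 1, "dHex": 1},
    "Man9": {"Hex": 9, "HexNAc": 2},
}


def glycan_increment(name):
    """Mass (Da) that the named glycan adds to the unmodified peptide."""
    return sum(RESIDUE_MASS[res] * n for res, n in GLYCANS[name].items())


if __name__ == "__main__":
    for name in GLYCANS:
        print(f"{name}: +{glycan_increment(name):.4f} Da")
```

Neighbouring glycoforms differ by one residue mass, for example GnGnX and MGnX by one HexNAc (203.0794 Da), which is how the GnTII-dependent shift toward two terminal GlcNAc residues shows up in the spectra.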

The presence of large amounts of *N*-glycan structures with a single terminal GlcNAc residue in *N. benthamiana* suggests that either the *N*-glycan in the CH2 domain is incompletely processed by the Golgi-resident GnTII or attached GlcNAc residues are cleaved off in post-Golgi compartments by β-hexosaminidases (Strasser et al., 2007; Castilho et al., 2014). To test the first possibility, human GnTII was cloned into a binary plant-expression vector (**Figure 1C**) and co-expressed with the sIgA1 multigene vector. As a result, peaks corresponding to complex *N*-glycans with two terminal GlcNAc residues were considerably increased in wild-type as well as in -XT/FT (**Figure 4B**). This result indicates that GnTII activity is a major limiting factor that leads to incompletely processed *N*-glycans on this *N*-glycosylation site of the IgA1 alpha chain. This limitation can at least in part be overcome by transient expression of the corresponding human glycosyltransferase.

In contrast to the plant-derived alpha chain *N*-glycans, the CH2 domain glycopeptide from a human serum standard displayed processing on both branches resulting in the formation of sialylated and galactosylated biantennary *N*-glycans (**Figure 4C**). The glycopeptide derived from the alpha chain tailpiece could also be identified in the human IgA standard. All identified peaks were sialylated and contained fucose (**Figure 4D**). Despite several attempts using different proteolytic digestions (trypsin or Glu-C plus trypsin), a peptide or glycopeptide containing the second *N*-glycosylation site in the alpha chain tailpiece could not be identified in our plant-derived samples. Consequently, the *N*-glycosylation status of this site remains unknown.

Since the SC could not be co-purified by SSL7-affinity purification we isolated IF from leaves of wild-type plants, extracted the corresponding band from SDS-PAGE and analyzed the trypsin digested sample for glycopeptides. In total four glycopeptides from the SC were identified, one of them harboring two glycosylation sites. All identified glycopeptides displayed a similar *N*-glycan profile (**Figure 4E** and data not shown) with a major peak corresponding to GnGnXF (GlcNAc2Man3XylFucGlcNAc2) and smaller amounts of incompletely processed (MGnXF: GlcNAcMan3XylFucGlcNAc2) and truncated *N*-glycans (MMXF: Man3XylFucGlcNAc2). All these *N*-glycans were processed in the Golgi and contained xylose and fucose residues.

#### *O*-Glycan Analysis of the IgA1 Hinge Region

Plant *O*-glycosylation differs significantly from that of mammals, as plants do not have a functional mucin-type *O*-glycosylation pathway (Castilho et al., 2012; Yang et al., 2012). Plants, on the other hand, can convert proline residues adjacent to *O*-glycosylation sites into hydroxyproline (Hyp; Taylor et al., 2012). Serine residues next to specific Hyp-sequence motifs may be modified with a single galactose, and Hyp residues can be extensively modified with arabinose chains or arabinogalactans (Seifert and Roberts, 2007; Basu et al., 2013; Saito et al., 2014). The presence of Hyp and arabinose chains in the hinge region has been described previously for maize seed-derived human IgA1 (Karnoup et al., 2005). To monitor the prolyl-hydroxylation and potential plant-specific *O*-glycosylation, we analyzed the glycopeptide corresponding to the hinge region from IgA1 expressed in *N. benthamiana* wild-type and -XT/FT plants. For this purpose, transiently expressed IgA1 was purified from leaves using SSL7-agarose and tryptic peptides were subjected to mass spectrometric analysis. The peptide derived from the IgA1 alpha chain (HYTNPSQDVTVPCPVPSTPPTPSPSTPPTPSPSCCHPR) was analyzed for the presence of post-translational modifications. In **Figure 5A**, the spectra with peaks assigned to proline/Hyp conversions are shown. The observed heterogeneity in the MS spectra indicates that proline residues in this region are partially converted into Hyp by plant prolyl-hydroxylases. A search for glycosylated variants of the peptide revealed the presence of modifications corresponding to the incorporation of pentose sugars (presumably arabinoses; **Figure 5B**). The modifications were comparable between wild-type and -XT/FT plants, in agreement with the hypothesis that modulation of the *N*-glycan processing pathway does not interfere with *O*-glycan modifications.
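The heterogeneity described above can be framed in terms of simple mass shifts on the hinge peptide. The following minimal sketch counts the proline and serine/threonine residues in the peptide sequence given in the text and computes the mass shift expected for a chosen number of Pro-to-Hyp conversions (one oxygen each) and attached pentoses. It is an illustration under these assumptions and does not reproduce the actual peak assignments of Figure 5; the function names are hypothetical.

```python
HINGE_PEPTIDE = "HYTNPSQDVTVPCPVPSTPPTPSPSTPPTPSPSCCHPR"

HYP_SHIFT = 15.994915       # Da, one oxygen atom per Pro -> Hyp conversion
PENTOSE_SHIFT = 132.042259  # Da, one pentose residue (e.g. arabinose)


def residue_counts(seq):
    """Count residues relevant for hydroxylation and O-glycosylation."""
    return {"Pro": seq.count("P"), "Ser/Thr": seq.count("S") + seq.count("T")}


def modification_shift(n_hyp, n_pentose):
    """Mass shift (Da) for n_hyp hydroxyprolines and n_pentose pentoses."""
    return n_hyp * HYP_SHIFT + n_pentose * PENTOSE_SHIFT


if __name__ == "__main__":
    print(residue_counts(HINGE_PEPTIDE))
    # Example: two hydroxyprolines plus one pentose.
    print(round(modification_shift(2, 1), 4))
```

With 13 prolines in this tryptic peptide, even partial hydroxylation yields a ladder of peaks spaced by about 16 Da, consistent with the heterogeneous spectra described.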

#### Generation of Mucin-Type *O*-Glycans on Plant-Expressed Human IgA1

Three to six mucin-type *O*-glycans are commonly attached to the nine potential *O*-glycosylation sites in the hinge region of human IgA1. The major structures are mucin-type core 1 and sialylated core 1 *O*-glycans. In IgA nephropathy, a long-term chronic kidney disease in humans, *O*-glycans are mostly galactose-deficient and recognized by anti-glycan antibodies leading to unwanted immune complex formation. For therapeutic applications it is therefore crucial to produce recombinant IgA variants with human-type sialylated core 1 *O*-glycans to avoid any adverse side effects and loss of functionality (Suzuki et al., 2015). To investigate whether the hinge region of human IgA1 can be furnished with defined mucin-type *O*-glycans when expressed in *N. benthamiana*, we performed *O*-glycan engineering. To this end, sIgA1 was co-expressed with different enzymes for initiation and elongation of mucin-type *O*-glycans (**Figure 1D**). For initiation of mucin-type *O*-glycosylation we chose to express human GalNAc-T2 which we have previously used for *in planta O*-glycan biosynthesis on the single *O*-glycosylation site of EPO-Fc (Castilho et al., 2012). As can be seen in **Figure 6A**, co-expression of sIgA1 with this single human enzyme (without any additional mammalian proteins like a UDP-GalNAc transporter) resulted in modification of the hinge region peptide with an additional HexNAc monosaccharide (Tn antigen-like structure). Co-expression of the *Drosophila melanogaster* β1,3-galactosyltransferase (C1GalT1) led to the incorporation of additional hexoses suggesting the successful formation of core 1 *O*-glycan structures (T antigen; **Figure 6B**).

FIGURE 4 | *N*-glycan analysis of sIgA1. (A) Mass spectra of the tryptic glycopeptide from the CH2 domain of sIgA1 expressed in *N. benthamiana* wild-type (wt) or -XT/FT (-XF). The amino acid sequence of the identified peptide is highlighted. The *N*-glycosylation site is underlined. (B) sIgA1 was transiently co-expressed with human GnTII and analyzed as mentioned in (A). (C) The corresponding Glu-C/trypsin digested glycopeptide from human serum IgA. (D) The spectrum of the tailpiece glycopeptide of the alpha chain from human serum IgA. (E) Spectra from two glycopeptides of the SC derived from the IF of wild-type plants. A detailed explanation of the used *N*-glycan abbreviations can be found at the ProGlycAn homepage (http://www.proglycan.com/index.php?page=pga\_nomenclature). The graphical depictions of glycan-structures follow the style of the Consortium for Functional Glycomics (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml).

The T antigen structure is recognized by the lectin jacalin, which is commonly used for purification of human IgA1. Previously it was demonstrated that recombinant jacalin does not react with plant-produced sIgA1, which is normally devoid of any galactose or GalNAc residues (Fernandez-del-Carmen et al., 2013). Here, we tested whether a commercially available jacalin recognizes glyco-engineered sIgA1 purified from plants. Jacalin-agarose was incubated with different sIgA1 glycoforms and binding was tested by immunoblotting. Jacalin binding was observed for sIgA1 modified with GalNAc-T2 and for sIgA1 modified with GalNAc-T2 and C1GalT1. By contrast, no binding was observed for unmodified sIgA1 or for sIgA1 that was co-expressed with human β1,4-galactosyltransferase (Strasser et al., 2009), which acts predominantly on *N*-glycans (**Figure 6C**).

Finally, to generate disialylated core 1 structures, the predominant *O*-glycan on serum-derived IgA1, we co-expressed sIgA1 with GalNAc-T2, C1GalT1, and the mammalian sialic acid biosynthesis pathway consisting of three enzymes (UDP-*N*-acetylglucosamine 2-epimerase/*N*-acetylmannosamine kinase; *N*-acetylneuraminic acid phosphate synthase and CMP-*N*-acetylneuraminic acid synthetase) for CMP-sialic acid formation and the CMP-sialic acid transporter for transport of the activated nucleotide sugar into the Golgi (Castilho et al., 2012). The MS spectra showed distinct peaks corresponding to the generation of structures with HexNAc, hexoses, and *N*-acetylneuraminic acid (NeuAc; **Figure 7A**), which are similar to the structures found on human serum IgA1 (**Figure 7B**). In summary, these data show that different mucin-type *O*-glycans can be successfully generated on the hinge region of *N. benthamiana*-expressed sIgA1.
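The stepwise build-up of a mucin-type *O*-glycan can be followed as a series of cumulative mass increments on a single glycosylation site. The sketch below pairs each glycosyltransferase named in the Figure 1D legend with the residue it adds and sums standard monoisotopic residue masses. The order of the two sialyltransferases is an assumption made for illustration (the resulting mass is independent of it), and the function name is hypothetical.

```python
# Monoisotopic residue masses (Da).
RESIDUE_MASS = {"HexNAc": 203.079373, "Hex": 162.052824, "NeuAc": 291.095417}

# (enzyme, residue added, resulting structure); sialylation order is assumed.
STEPS = [
    ("GalNAc-T2", "HexNAc", "Tn antigen"),
    ("C1GalT1", "Hex", "core 1 (T antigen)"),
    ("ST3Gal-I", "NeuAc", "sialylated core 1"),
    ("ST6GalNAc", "NeuAc", "disialylated core 1"),
]


def cumulative_increments(steps):
    """Return (structure, cumulative mass increment in Da) after each step."""
    total = 0.0
    result = []
    for _enzyme, residue, structure in steps:
        total += RESIDUE_MASS[residue]
        result.append((structure, total))
    return result


if __name__ == "__main__":
    for structure, mass in cumulative_increments(STEPS):
        print(f"{structure}: +{mass:.4f} Da per site")
```

Because three to six sites are commonly occupied on the human IgA1 hinge, the observed glycopeptide masses are sums of several such per-site increments, which is why compositions are reported as numbers of HexNAc, hexose, and NeuAc residues.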

## DISCUSSION

Glycosylation of the single IgG Fc-*N*-glycan has a huge impact on Fc-receptor binding, leading to alterations in effector functions such as antibody-dependent cellular cytotoxicity (ADCC). While the relevance of antibody glycosylation for effector functions was recognized some time ago (Lifely et al., 1995; Umaña et al., 1999), more recent *in vivo* and *in vitro* glyco-engineering approaches have resulted in a much deeper understanding of antibody glycan-structure-function relationships (Forthal et al., 2010; Ferrara et al., 2011; Ahmed et al., 2014; Lin et al., 2015; Subedi and Barb, 2015). As a consequence, the Fc glycans are now categorized as critical quality attributes by industry (Reusch and Tejada, 2015). Despite this documented importance for IgG, the role of glycosylation for other immunoglobulins is less well understood. Suitable tools to manipulate and control the glycan composition on different immunoglobulins, including IgMs and IgAs, are crucial for further developments and novel applications. Glyco-engineering has been very successfully applied to plants in the past (Strasser et al., 2014) and the great potential of *N. benthamiana* for production of therapeutic IgMs has recently been demonstrated (Loos et al., 2014). Here, we characterized *N*- and *O*-glycans from sIgA1 produced in wild-type and glyco-engineered -XT/FT plants and provided strategies toward the formation of defined *N*- as well as *O*-glycans that can be used in the future for extensive functional studies.

We initially aimed to produce a sIgA1 variant and co-expressed all four involved protein chains from a single expression construct (Juarez et al., 2013). Unexpectedly, we obtained mainly monomeric IgA1 variants, indicating that the assembly to full sIgA1 variants was not efficient. One possible factor that influences the formation of dimeric and sIgA variants could be the limited expression of the J chain. In our immunoblot experiments, we were not able to monitor the expression of the J chain. Even the additional co-expression of the human J chain did not result in detectable J chain incorporation (data not shown). Alternatively, the fate of the alpha chain tailpiece could also affect the assembly of the dimeric or secretory form. We obtained good coverage of the human IgA1 alpha chain in the proteolytically digested peptide pools (data not shown), but were not able to detect the glycosylated or unglycosylated peptide corresponding to the C-terminal tailpiece. While we might have missed the (glyco)peptide during analysis, it is also possible that the C-terminal end of the alpha chain is cleaved off in plants. Another recent study detected a difference in IgA alpha chain mobility by immunoblotting, which was proposed to result from partial *N*-glycosylation of the tailpiece when transiently expressed in *N. benthamiana* (Westerhof et al., 2014). As previous studies in mammals have indicated an important role of the *N*-glycan in the tailpiece for J chain incorporation (Atkin et al., 1996; Sørensen et al., 2000), an effect of the altered C-terminal end on sIgA1 formation is plausible. Future studies will aim to address the nature of *N*-glycosylation in the tailpiece and its contribution to dimeric and sIgA1 formation in plants.

Analysis of the IF revealed that only the SC and light chain are present in considerable amounts, while most of the alpha chain and assembled IgA1 remains in the cells. A similar scenario of inefficient secretion of assembled IgA variants has been described for three IgA1 variants when transiently expressed in *N. benthamiana* wild-type (Westerhof et al., 2014). While the final subcellular location of the described IgA1 antibodies was not determined, another study reported the accumulation of sIgA mainly in the vacuoles of *N. benthamiana* leaves (Paul et al., 2014). However, in contrast to our findings, the sIgA in the latter study showed predominantly oligomannosidic *N*-glycans, indicating a different subcellular trafficking route that bypasses the Golgi apparatus. We detected high amounts of truncated or incompletely processed complex *N*-glycans lacking core fucose on the *N*-glycosylation site in the CH2 domain of the alpha chain. In wild-type *N. benthamiana*, recombinant glycoproteins such as IgGs (Strasser et al., 2009), EPO-Fc (Castilho et al., 2012), α1-antitrypsin (Castilho et al., 2014), or IgM (Loos et al., 2014) that travel through the Golgi are very efficiently processed and frequently modified with both β1,2-xylose and core α1,3-fucose residues. Although not directly shown by site-specific glycopeptide analysis, the lack of core α1,3-fucosylation on IgA1 has also been observed based on PNGase F digestions and immunoblots by Westerhof et al. (2014). This uncommon lack of core α1,3-fucosylation is very likely caused by protein-intrinsic features and depends less on the expression host, as the same glycopeptide from human serum IgA (**Figure 4C**) also displays reduced levels of core fucose. Moreover, recombinant IgA1 produced in murine myeloma or Chinese hamster ovary cells also harbors considerable amounts of complex *N*-glycans devoid of core fucose (Yoo et al., 2010). Interestingly, a supportive role of core α1,3-fucosylation on IgG Fc-glycan processing was recently discovered by analysis of the plant-produced cetuximab IgG1 antibody (Castilho et al., 2015).

Another difference from recombinant IgG expressed in *N. benthamiana* is the presence of incompletely processed or truncated complex *N*-glycans. These structures may derive either from inefficient GnTII activity during *N*-glycan processing in the Golgi or from post-Golgi action of plant β-hexosaminidases (Strasser et al., 2007). Co-expression of human GnTII converted a significant portion of the oligosaccharide into fully processed complex *N*-glycans, indicating a limitation in endogenous GnTII activity. Whether β-hexosaminidases play an additional role in generating *N*-glycan microheterogeneity, as has been shown for α1-antitrypsin (Castilho et al., 2014), remains to be determined.

#### *O*-Glycan Engineering: Challenges and Future Goals

The analysis of the IgA1 hinge region from wild-type or -XT/FT plants revealed the presence of plant-type Hyp formation and minor amounts of additional sugar residues. These posttranslational modifications have also been previously described for recombinant IgA1 expressed in maize seeds (Karnoup et al., 2005), on EPO-Fc (Castilho et al., 2012), on mucin-type glycopeptides expressed in *N. benthamiana* (Pinkhasov et al., 2011; Yang et al., 2012) and on moss-produced EPO (Parsons et al., 2013). All of these proteins have exposed proline residues next to *O*-glycosylation sites. The presence of these non-human modifications on therapeutic IgA1s may significantly affect the product quality and cause unwanted immune reactions. Therefore strategies are needed to avoid Hyp formation. The most promising approach is the elimination of the responsible prolyl-4-hydroxylase activity. Targeted disruption of a specific prolyl-4-hydroxylase in moss resulted in the removal of prolyl-hydroxylation on moss-produced recombinant EPO (Parsons et al., 2013). Given the success of the xylosyl- and fucosyltransferase knockdown in -XT/FT (Strasser et al., 2008) a similar strategy should also be feasible for elimination of unwanted plant-specific modifications related to *O*-glycans.

Compared to *N*-glycans, engineering of mucin-type *O*-glycosylation is highly challenging as a mammalian-like mucin-type *O*-glycan biosynthesis pathway is absent from plants (Strasser, 2013). The *de novo* synthesis of defined *O*-glycan structures in plants requires the coordinated expression of several different mammalian proteins in the secretory pathway of plants. However, the knowledge of factors that control mucin-type *O*-glycan biosynthesis in mammals is incomplete (Bennett et al., 2012). The initiation of mucin-type *O*-glycosylation, for example, is carried out by the large family of mammalian polypeptide GalNAc-transferases with 20 members in humans. Efficient transfer of GalNAc to multiple sites often requires the activity of different polypeptide GalNAc-transferase isoforms. Despite some recent progress, the acceptor substrate specificity of individual members from this family is still largely unclear (Steentoft et al., 2013; Kong et al., 2015). Here, we used the human GalNAc-T2, which is the key enzyme for IgA1 *O*-glycosylation initiation (Iwasaki et al., 2003). Our findings indicate that GalNAc can be efficiently transferred to Ser/Thr residues present in the IgA1 alpha chain hinge region when co-expressed with human GalNAc-T2 in plants. While further elongation with galactose and sialic acid led to human IgA1-like *O*-glycan structures, the incorporation of sialic acid was not very efficient. In the future, the biosynthesis of disialylated core 1 *O*-glycans can be further optimized by the use of multigene vectors for expression of all glycosyltransferases, by more precise subcellular targeting of the mammalian enzymes (Strasser et al., 2014) as well as by the use of transgenic *N. benthamiana* lines that stably express parts of the pathway (e.g., all the genes for CMP-sialic acid synthesis; Castilho et al., 2008). Together these advances will pave the way for detailed functional analysis of individual IgA glycoforms with the ultimate aim to generate recombinant glycoprotein therapeutics in *N. benthamiana* with reduced adverse side effects and maximized efficacy.

#### REFERENCES

#### AUTHOR CONTRIBUTIONS

MD, MT, DM, and JK performed the research, MD, MT, DM, PJ, DO, FA, HS, and RS provided analytical reagents/tools and analyzed data, MD, MT, and RS designed the research, and RS wrote the paper.

#### FUNDING

This work was supported by a grant from the Austrian Federal Ministry of Transport, Innovation and Technology (bmvit) and Austrian Science Fund (FWF): TRP 242-B20 and by the Austrian Research Promotion Agency (Laura Bassi Center of Expertise "Plant produced Bio-Pharmaceuticals" Grant Nr. 822757).

#### ACKNOWLEDGMENT

We thank Alexandra Castilho for cloning of pPT2M-HsGnTII and for helpful discussions.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Dicker, Tschofen, Maresch, König, Juarez, Orzaez, Altmann, Steinkellner and Strasser. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Seed-Specific Expression of Spider Silk Protein Multimers Causes Long-Term Stability

#### *Nicola Weichert, Valeska Hauptmann, Christine Helmold and Udo Conrad\**

*Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany*

Seeds enable plants to germinate and to grow in situations of limited availability of nutrients. The stable storage of different seed proteins is an important prerequisite for successful germination and growth. These strategies have been adapted and used in several molecular farming projects. In this study, we explore the benefits of seed-based expression to produce the high molecular weight spider silk protein FLAG using intein-based *trans*-splicing. Multimers larger than 460 kDa in size are routinely produced, which is above the native size of the FLAG protein. The storage of seeds for 8 weeks and 1 year at an ambient temperature of 15°C does not influence the accumulation level. Even the extended storage time does not influence the typical pattern of multimerized bands. These results show that seeds are the method of choice for stable accumulation of products of complex transgenes and have the capability for long-term storage at moderate conditions, an important feature for the development of suitable downstream processes.

Keywords: seed expression, spider silk proteins, intein, protein *trans*-splicing, tobacco

## INTRODUCTION

Seeds have evolved because of their unique properties, providing important features to plants to survive and to propagate even under harsh environmental conditions. Seeds enable plants to germinate and to grow in situations of limited availability of nutrients. The presence of storage products in seeds is necessary for their functionality. Among a plethora of compounds of different classes, storage proteins play an important role. Throughout long periods of dormancy, storage proteins remain intact (Golovina et al., 1997). Seeds and their compartments contribute to the competitiveness of different plant species. The stable storage of different seed proteins is an important prerequisite for this (Boothe et al., 2010). These strategies have been adapted and used in several molecular farming projects [for review, see Stöger et al. (2005)]. Recombinant antibodies, i.e., single chain Fv antibodies, have been produced in tobacco seeds under the control of a seed-specific faba bean legumin promoter (Fiedler and Conrad, 1995). Storage protein promoters, as well as other seed-specific promoters, such as the USP promoter in dicots, are well suited for seed-based production combined with ER retention (Fiedler et al., 1997). These seeds can be stored for at least 1 year without loss in the amount and activity of the transgenic protein (Fiedler and Conrad, 1995). Stable storage at ambient conditions is an important feature, because it allows the development of harvesting/downstream processing strategies that do not need a long-term cooling chain or a direct production/extraction/purification process.

#### *Edited by:*

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Alessandro Vitale, CNR – National Research Council of Italy, Italy Lorenzo Frigerio, University of Warwick, UK F. Javier Arias, University of Valladolid, Spain*

> *\*Correspondence: Udo Conrad conradu@ipk-gatersleben.de*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 13 October 2015 Accepted: 06 January 2016 Published: 28 January 2016*

#### *Citation:*

*Weichert N, Hauptmann V, Helmold C and Conrad U (2016) Seed-Specific Expression of Spider Silk Protein Multimers Causes Long-Term Stability. Front. Plant Sci. 7:6. doi: 10.3389/fpls.2016.00006*

Spider-silk proteins have been a target of molecular farming since 2001 (Scheller et al., 2001). The enormous interest in this biopolymer is caused by its extraordinary properties, such as high levels of toughness, tensile strength, and elasticity (Vollrath and Knight, 2001; Craig, 2004). The capture spiral silk, also called flagelliform silk, can be stretched extremely far before rupture. This high level of elasticity is required to ensure prey capture (Vollrath and Edmonds, 1989; Köhler and Vollrath, 1995). The perfect dissipation of the kinetic impact energy is a requirement for capture fibers to withstand the relatively high velocity of flying prey (Römer and Scheibel, 2008). The flagelliform silk consists of only one protein, called FLAG. The elastic properties are thought to be based on the presence of a GPGGX consensus motif in a high number of repeats. Helical GGX repeats are also typical consensus repeats of this protein (Hayashi et al., 1999; Hayashi and Lewis, 2000). These two consensus elements together are responsible for the elasticity and flexibility of the flagelliform silk (Ohgo et al., 2006). A common feature of all known silk proteins from the Arthropoda is a high molecular weight of more than 250 kDa. This evolutionary convergence between unrelated species is striking. Therefore, the enormous size of all these proteins is thought to be a necessary prerequisite for the extraordinary mechanical properties (Lewis, 2006). In spider silk proteins, motifs conducive for inter- and intrachain interactions are common, and chain end defects are rare events (Ayoub et al., 2007). The recombinant production of spider silk proteins in pro- and eukaryotic expression systems is limited in the maximal protein size that can be achieved. Here, the genetic instability of these highly repetitive proteins and the limited availability of frequently used amino acids, as well as the corresponding tRNAs, are possible reasons. 
Synthetic spider silk proteins of native size can only be produced by a metabolically modified *Escherichia coli* strain (Xia et al., 2010). More slowly growing organisms, such as plants, are used to overcome these protein size limitations associated with tRNA and amino acid availability. The maximal size achieved was approximately 100 kDa (Scheller et al., 2001, 2004). Post-translational multimerization methods have been chosen to further increase the size of recombinant spider silk protein derivatives. Purified spider silk-ELP fusion proteins from tobacco leaves were enzymatically multimerized by transglutamination. Layers formed by highly cross-linked spider silk-ELP fusion proteins were associated with a high elastic indentation modulus and, therefore, higher toughness and stiffness of layers formed by multimerized plant-based spider silk protein derivatives were expected (Weichert et al., 2014). We developed a general system for the production of highly repetitive proteins in plants. Protein *trans*-splicing by inteins was used to assemble protein subunits *in planta* (Yang et al., 2003; Kempe et al., 2009; for review, see Evans et al., 2005). Inteins are autocatalytically excised from precursor proteins and fuse the flanking exteins together (Perler, 1998). A few inteins from bacteria have been described (Perler et al., 1994). An intein from cyanobacteria (Pietrokovski, 1996) has been demonstrated to function in plants (Evans et al., 2000; Yang et al., 2003; Kempe et al., 2009). A comprehensive description of *in vivo* applications of intein-mediated protein splicing is given by Topilina and Mills (2014). Hauptmann et al. (2013) demonstrated that multimers of at least the native size of the spider silk protein FLAG could be produced by intein-based *trans*-splicing and purified from tobacco leaves. Purified and desalted FLAG multimers formed microfibers after drying, thus demonstrating their potential as future biomaterials.

Several applications of spider-silk-derived biopolymers in the field of engineering and technology are discussed in the literature (Kluge et al., 2008; Hardy and Scheibel, 2009). A possible medical application is the use of spider silk particles for the controlled delivery of protein drugs (Hofer et al., 2012). The Ancient Greeks used cobwebs for wound healing when covering bleeding lesions (Gerritsen, 2002). Spider silks can enhance axonal regeneration (Radtke et al., 2011), serve as a scaffold for human cell growth (Widhe et al., 2010), and support the proliferation of fibroblasts and keratinocytes (Wendt et al., 2011). Cytocompatibility is an important prerequisite for any medical use of biomaterials. A plant-produced synthetic spidroin fused with a hundred repeats of elastin-like-peptides (ELP) has been shown to be non-toxic and to enhance the proliferation of human chondrocytes and prevent dedifferentiation (Scheller et al., 2004). Cytocompatibility assays with plant-produced spidroin-ELP biopolymers gave no indication of spidroin-derived cytotoxicity, and no hemolytic effects have been detected (Hauptmann et al., 2015).

In the present paper, we asked whether the benefits of seed-based production could be extended to the production of high molecular weight spider silk proteins. High molecular weight spider silk proteins could have superior mechanical properties combined with non-cytotoxic and non-hemolytic behavior. We also examined whether intein-based *trans*-splicing functions in seeds, and we demonstrated that spider silk proteins of native size could be produced in seeds.

## MATERIALS AND METHODS

#### Construct Design

The 1149 bp *unknown seed protein* (*usp*) promoter (Zakharov et al., 2004) was PCR-amplified using 5′-CGAGTCGACATTTTTACATGATATAATG-3′ and 5′-CGTCCATGGACTGGCTATGAAGAAATTATAATC-3′ primers. The resulting PCR product was introduced into the *Hin*cII and *Nco*I restriction sites of a pRTRA 15-based plasmid described by Hauptmann et al. (2013). This plasmid contained the complete *IntC::Flag::IntN* gene construct, including the *LeB4* legumin signal peptide, ER retention signal KDEL, the c-myc-tag and the *CaMV35S* terminator. The synthetic *InteinC::Flag::InteinN (IntC::Flag::IntN)* gene construct was based on *Flag* gene motifs from publicly available *Nephila clavipes* cDNAs (GenBank accession nos. AF027972 and AF027973) and the Intein-encoding sequence from *Synechocystis* sp. gene *DnaB* (UniProtKB/Swiss-Prot accession no. Q55418; Hauptmann et al., 2013). The complete *Flag* expression cassette (USP-FIC) was inserted into the binary vector pCB301-Kan (Xiang et al., 1999; Scheller et al., 2001) via the *Hin*dIII site, resulting in the expression plasmid USP-FIC/pCB301-Kan.
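As a quick sanity check on the cloning strategy, the two restriction sites named above can be located in the primer sequences. The following is a minimal illustrative sketch, not part of the original protocol. The recognition sequences (*Hin*cII: GTYRAC; *Nco*I: CCATGG) are the standard ones, and the helper name `find_site` is hypothetical.

```python
import re

# IUPAC codes needed for the sites used here (Y = C/T, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}

# Recognition sequences of the two enzymes named in the text.
SITES = {"HincII": "GTYRAC", "NcoI": "CCATGG"}

# Primer sequences as given in the Construct Design paragraph (5' to 3').
FORWARD = "CGAGTCGACATTTTTACATGATATAATG"
REVERSE = "CGTCCATGGACTGGCTATGAAGAAATTATAATC"


def find_site(primer, recognition):
    """Return the 0-based start positions of a recognition sequence in a primer."""
    pattern = "".join(IUPAC[base] for base in recognition)
    # A lookahead is used so that overlapping matches would also be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", primer)]


if __name__ == "__main__":
    print("HincII in forward primer:", find_site(FORWARD, SITES["HincII"]))
    print("NcoI in reverse primer:", find_site(REVERSE, SITES["NcoI"]))
```

Each primer carries its site a few bases from the 5′ end, which is consistent with directional insertion of the amplified promoter between the two sites.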

#### Production of FLAG Overexpressing Plants

The binary plasmid USP-FIC/pCB301-Kan was transformed into the *Agrobacterium tumefaciens* strain C58C1 (pGV2260; Deblaere et al., 1985) by electroporation. For stable transgene expression in two different tobacco varieties, *Nicotiana tabacum* cv. Samsun NN (SNN) and *N. tabacum* cv. Petit Havana, plants were transformed by agroinfection using the leaf disk method (Horsch et al., 1985) as elaborated by Floss and Conrad (2010). Tobacco leaf disks were submerged for 1 h in an overnight-grown liquid *Agrobacterium* culture and plated on Murashige-Skoog (MS) agar for 2 days at 24°C in the dark. Infected explants were transferred to NBKC agar (MS medium containing 0.2 mg/L α-naphthalene acetic acid, 1 mg/L 6-benzylaminopurine, 50 mg/L kanamycin, and 500 mg/L cefotaxime). Every 10–14 days, the plantlets were transferred to fresh NBKC agar until differentiation. The developing transgenic plants were cultured on MS agar containing 50 mg/L kanamycin and were selected by immunoblotting using an anti-c-myc antibody (Evan et al., 1985). Recombinant protein expressing plants were grown in greenhouses to maturity for further propagation. Seeds were analyzed by anti-c-myc immunoblotting for overexpression of the target proteins.

#### Seed Material

Mature tobacco seeds, as well as developing seeds at defined developmental stages [18 days after flowering (DAF), 21 DAF], were harvested. Immature seed material was immediately frozen in liquid nitrogen and stored at −80°C. Mature seed material was stored at 15°C with 49% humidity.

#### SDS-PAGE and Immunoblotting Analysis

For analysis of transgenic plants, seed material was ground in seed extraction buffer (50 mM Tris pH 8.0, 200 mM NaCl, 5 mM EDTA, 0.1% Tween). SDS sample buffer (Gahrtz and Conrad, 2009) was added in a 1:1 ratio. The homogenate was incubated at 95°C for 10 min and was cleared by centrifugation (30 min, 4°C, 12,000 rpm). The total protein content of the supernatant was determined using the Bradford assay (Bio-Rad, Germany). Seed extracts were separated by reducing SDS-PAGE (3 or 4–10% polyacrylamide gradient) and electrotransferred to a nitrocellulose membrane, and immunodetection was performed as described by Conrad et al. (1998) using anti-c-myc antibodies (Evan et al., 1985). Tobacco seed proteins were separated by SDS-PAGE and stained with Coomassie Brilliant Blue R-250 (SERVA GmbH, Germany). The accumulation was analyzed semiquantitatively using different concentrations of an anti-TNF-VHH-100xELP standard (Conrad et al., 2011). In the standard, one c-myc tag corresponds to 72 kDa of protein, whereas in all FLAG multimers one c-myc tag always corresponds to 37.6 kDa of protein (Hauptmann et al., 2013). A FLAG multimer band matching a standard band in intensity therefore corresponds to about half of the protein amount of that standard band. We roughly estimated the FLAG content by matching the bands to the different standard amounts and summing the results for every lane. Extracts from 200 seeds were separated in each lane. From the number of seeds per lane and the estimated fresh weight per seed (70 μg per seed for cv. Samsun NN and 65 μg per seed for cv. Petit Havana), we calculated the transgenic protein per fresh weight.
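The semiquantitative estimate described above can be written out as a short calculation. The sketch below is only an illustration of the arithmetic under the assumptions stated in the text (72 kDa of protein per c-myc tag in the standard, 37.6 kDa per tag in FLAG multimers). The function names and the matched band amounts are hypothetical and are not data from this study.

```python
# Protein mass per c-myc tag (kDa), as stated in the text.
STANDARD_KDA_PER_TAG = 72.0  # anti-TNF-VHH-100xELP standard
FLAG_KDA_PER_TAG = 37.6      # one FLAG monomer unit per tag in every multimer


def flag_ng_from_standard(standard_ng):
    """Convert a matched standard amount (ng) into the FLAG amount (ng).

    Equal band intensity means an equal number of tags, so the protein
    mass scales with the mass per tag (37.6 / 72, i.e. about one half).
    """
    return standard_ng * FLAG_KDA_PER_TAG / STANDARD_KDA_PER_TAG


def ug_per_g_fresh_weight(flag_ng, seeds_per_lane, seed_weight_ug):
    """Express the FLAG amount in a lane as micrograms per gram of seed."""
    fresh_weight_g = seeds_per_lane * seed_weight_ug * 1e-6
    flag_ug = flag_ng * 1e-3
    return flag_ug / fresh_weight_g


if __name__ == "__main__":
    # Hypothetical standard amounts (ng) matched to the bands of one lane.
    matched_standard_ng = [2000.0, 2000.0, 1000.0]
    flag_ng = sum(flag_ng_from_standard(x) for x in matched_standard_ng)
    # 200 seeds per lane and 70 ug per seed (cv. Samsun NN), as in the text.
    print(round(ug_per_g_fresh_weight(flag_ng, 200, 70.0), 1))
```

With 200 Samsun NN seeds per lane, the lane represents about 14 mg of fresh weight, so roughly 2.7 μg of FLAG multimers in a lane would correspond to the order of magnitude reported in the Results.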

## RESULTS

#### FLAG Multimers are Stably Accumulated in Tobacco Seeds

A synthetic FLAG gene coding for a monomer of 37.6 kDa (Hauptmann et al., 2013) was cloned into a seed-specific expression vector that provides ER retention by means of a signal peptide and the KDEL motif (**Figure 1A**). The seed-specific expression was driven by the USP promoter, which has been proven for overexpression of transgenic proteins in seeds (Fiedler et al., 1997). The synthetic FLAG protein sequence is based on *N. clavipes* FLAG sequence motifs (Hayashi and Lewis, 2000; Hauptmann et al., 2015). The expression cassette was cloned into a suitable shuttle vector (pCB301-Kan, see Materials and Methods), agrobacteria were transformed and stably transformed tobacco plants were produced by an appropriate protocol (see Materials and Methods). Two different tobacco variants, *N. tabacum* cv. Petit Havana and *N. tabacum* cv. Samsun NN, were transformed. *N. tabacum* cv. Petit Havana plants flower earlier, and their seeds therefore also ripen earlier than *N. tabacum* cv. Samsun NN seeds (8 days; **Figure 2B**). We wanted to see whether this benefit of a shorter seed propagation time influences the accumulation levels and/or multimerization. Among 45 Samsun NN T0 transformants, 26 showed transgene accumulation, and among 55 Petit Havana T0 transformants, 18 showed transgene accumulation. Examples of the different accumulation levels in T1 seeds are shown in **Figure 1B**. In general, more lines with T1 seeds accumulating transgenic proteins comparable to line 28 have been identified in Samsun NN (nine lines) than in Petit Havana (one line; data not shown). Distinct multimeric bands, starting with potential FLAG dimers and ending with multimers beyond the separation power of a 4–10% polyacrylamide gradient SDS-PAGE (above 500 kDa), are visible, which shows that intein-based splicing functions well in ripe seeds and that at least native-sized spider silk proteins could be produced. Two lines, USP-FIC 28 (*N. tabacum* cv. Samsun NN) and USP-FIC 49 (*N. tabacum* cv. 
Petit Havana), were selected as the highest producers from each construct and were further propagated by self-pollination. Equal amounts of seed extract from each of the five sublines were analyzed for the expression of FLAG multimers (**Figure 2**). Multimeric proteins from the monomer molecular weight up to much more than 500 kDa were detected in each lane. The transgene inheritance and the accumulation level were stable in both lines. According to the accumulation level, the best line was a *N. tabacum* cv. Samsun NN line. We analyzed the accumulation level in seeds of the lines USP-FIC 28 (T3 seeds) and USP-FIC 49 (T2 seeds) in a semiquantitative manner (see Materials and Methods; **Figure 4**). We applied extract amounts related to seed numbers and calculated the accumulation of transgenic multimers in relation to the fresh weight. We roughly estimated 190 μg FLAG multimers per g seed (fresh weight) for USP-FIC 28 and 20 μg FLAG multimers per g seed (fresh weight) for USP-FIC 49. To learn more about the protein splicing process in developing seeds, we harvested seeds from transgenic plants with different genetic backgrounds and different seed propagation times (see above) during the ripening process to analyze the recombinant protein accumulation. The USP promoter causes the expression of transgenic proteins in tobacco seeds from 10 DAF, with a first maximum at day 17 (Fiedler et al., 1997). Therefore, we selected 18 DAF, 21 DAF and ripe seeds (**Figure 3**), extracted them and analyzed the extracts by 4–10% polyacrylamide gradient SDS-PAGE and c-myc immunodetection. Extracts from 31.2 seeds were loaded per lane, independent of the plant and the age of the seeds, to normalize the results with respect to the rapidly increasing seed protein amount during ripening. In both genetic backgrounds, there was a smear at approximately 100 kDa at 18 DAF and several distinct bands at 21 DAF, but at this time point, they only partly reflect the expected pattern of different multimerization stages in Petit Havana (monomer, dimer and trimer, labeled in **Figure 3**), whereas in Samsun NN ripening seeds a prominent band at the size of the smear shown in lane 1 is visible.

protein per lane, Western blotting and immunodetection based on the c-myc tag; kDa, kilodalton. (B) Transgenic USP-FIC line 49 with a genomic background of *N. tabacum* cv. Petit Havana as well as the corresponding wild type plants showed a faster vegetative growth and flowered 8 days earlier than Samsun NN-genome-based FLAG overexpressing plants of USP-FIC line 28/10 and its corresponding wild type plants.
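To relate the band ladder described in this section to the sizes quoted in the text, the expected multimer masses can be tabulated from the 37.6 kDa monomer. This is a rough sketch that assumes each spliced unit contributes exactly 37.6 kDa and ignores any contribution of terminal intein fragments. The function names are hypothetical.

```python
import math

MONOMER_KDA = 37.6  # size of the synthetic FLAG monomer given in the text


def multimer_kda(n):
    """Approximate mass (kDa) of an n-mer, assuming n identical monomer units."""
    return n * MONOMER_KDA


def smallest_multimer_above(threshold_kda):
    """Smallest number of monomer units whose combined mass exceeds the threshold."""
    return math.floor(threshold_kda / MONOMER_KDA) + 1


if __name__ == "__main__":
    # Expected ladder from monomer to 14-mer.
    for n in range(1, 15):
        print(f"{n:2d}-mer: {multimer_kda(n):6.1f} kDa")
    # Thresholds mentioned in the text: 250 kDa (native silk proteins),
    # 460 kDa and 500 kDa (largest multimers observed on the gels).
    for threshold in (250, 460, 500):
        n = smallest_multimer_above(threshold)
        print(f"> {threshold} kDa: at least a {n}-mer ({multimer_kda(n):.1f} kDa)")
```

Under this assumption, bands above 460 kDa would contain 13 or more monomer units, which illustrates how many consecutive splicing events are needed to reach the sizes reported here.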

#### Transgenic Tobacco Seeds Containing FLAG Multimers Could be Stably Stored Without Loss of the Transgenic Proteins

One major benefit of seed-based production of functional proteins such as antibody fragments is the stability of these proteins in shape and function over a long time at room temperature during seed storage (Fiedler and Conrad, 1995). We wanted to test whether the spider silk multimers are also stable at ambient temperature. T3 seeds of USP-FIC 28/10/11 (*N. tabacum* cv. Samsun NN) and T2 seeds of USP-FIC 49 (*N. tabacum* cv. Petit Havana) were stored at 15°C and 49% humidity (standard conditions for tobacco seed storage at the Genebank Gatersleben) for 8 weeks, extracted and analyzed (**Figure 4**). For both types of seeds, we showed that the storage of seeds for 8 weeks at ambient temperature did not influence the pattern of multimeric bands. We also stored T1 USP-FIC 28 (*N. tabacum* cv. Samsun NN) seeds for 1 year at the ambient conditions mentioned above and analyzed spider silk accumulation in comparison to freshly harvested T2 seeds of the same line. Even after this extended storage time, a typical pattern of multimerized bands and a high accumulation level were observed (**Figure 4C**). These results indicate that seeds are the method of choice for stable accumulation of products of complex transgenes, including the capability of long-term storage at moderate conditions.

#### Accumulation of Spider Silk Multimers in the ER does not Influence Seed Ripening and Major Seed Protein Content

We did not observe an obvious influence of the spider silk transgene on the development of either Samsun NN or Petit Havana lines (**Figure 2B**). High accumulation of anti-hapten scFv in the ER of tobacco seed cells (up to 2.6% TSP) did not influence the tobacco seed proteins in ripe tobacco seeds (Phillips et al., 1997). Therefore, we also analyzed the major proteins in ripe T3 seeds of the line USP-FIC 28 and ripe T2 seeds of the line USP-FIC 49 compared to the seed proteins of their corresponding wild type cultivars. The major seed protein analysis by polyacrylamide gel electrophoresis and Coomassie staining gave no indication of any influence of the spider silk accumulation on seed development (**Figure 5**).

## DISCUSSION


Seeds can provide stable expression of therapeutic proteins, as shown for several antibodies, antibody derivatives, and vaccines (Stöger et al., 2005). In this paper, we show that large-sized spider silk multimers can be efficiently produced in seeds. We demonstrate stable inheritance and seed-specific expression over three or two generations, respectively, in transgenic lines in two different genetic backgrounds, *N. tabacum* cv. Samsun NN and *N. tabacum* cv. Petit Havana. Whereas *N. tabacum* cv. Samsun NN needs more time to start flowering, the recombinant protein accumulation level in seeds is better than in *N. tabacum* cv. Petit Havana according to the best-expressing plant or according to the general pattern of expressing lines. The seed ripening process itself is not influenced. The patterns of major seed proteins do not differ between ripe wild type seeds and ripe transgenic seeds in Samsun NN as well as in Petit Havana (**Figure 5**). The USP promoter has been described as a seed-specific promoter (Bäumlein et al., 1991), but the expression analysis of transgenic tobacco plants by sensitive enzyme activity assays showed minor expression in several other organs and cells (Saalbach et al., 1994). Nevertheless, high, mainly seed-specific expression has been shown for recombinant antibodies (Phillips et al., 1997; Floss et al., 2009). The USP promoter continuously drives the accumulation of transgenic proteins from 10 to 28 DAF (Fiedler et al., 1993, 1997), but this should not negatively influence the final content of transgenic proteins in seeds. As shown in **Figures 1–3**, multimers larger than 460 kDa in size are routinely produced. This is substantially above the known native size of the FLAG protein (Ayoub et al., 2007). The positive influence of the molecular weight on the mechanical properties of plant-produced spider silk proteins has already been demonstrated (Hauptmann et al., 2013; Weichert et al., 2014). 
Such large-sized spider silk proteins can produce fibers and networks with better mechanical properties by electrospinning as well as materials with superior properties for medical applications (Hauptmann et al., 2013, 2015). Two counteracting facts influence the documentation of this multimerization process. On the one hand, the c-myc-tag is multimerized together with the FLAG protein, thus causing stronger signals in larger multimers. On the other hand, the efficiency of the electrotransfer process decreases with increasing molecular weight, especially above 200 kDa. Our rough estimate of a maximum of 190 μg FLAG multimers per g fresh weight fits well with accumulation levels reported for seed expression, such as 160 μg/g fresh weight for recombinant antibodies in barley seeds (Hensel et al., 2015), 46 μg/g fresh weight for recombinant antibodies in rice grains (Vamvaka et al., 2015) and 6.9 μg/g fresh weight for recombinant antibodies in tobacco seeds (Floss et al., 2009). Leaf expression of a recombinant antibody in tobacco driven by a ubiquitous promoter at optimized growth conditions was about 45 μg/g fresh weight (Sack et al., 2015). Larger scale production of seeds, at best in protein-rich seeds such as legumes, combined with the development of a suitable downstream process can provide enough material for the directed enrichment of fractions above 200 kDa, reflecting the native size. The process of *trans*-splicing requires reassociation of the intein fragments before splicing occurs (Topilina and Mills, 2014). Whereas the intein-based self-excision and ligation is expected to occur immediately after translation and folding (Aranko et al., 2014), the reassociation may need more time, and we expect this process to be concentration-dependent. In addition, exteins can chemically or structurally influence the active site of inteins (Eryilmaz et al., 2014). During the formation of the multimers, several reassociation and splicing events occur on the same protein chain but not necessarily at the same time. 
This may explain the smear of proteins with slightly differing molecular weights at 18 DAF (**Figure 3**). At 21 DAF, distinct bands occur, but the expected pattern of multimerization is only visible in ripe seeds. In the faster developing Petit Havana seeds at 21 DAF, bands corresponding to monomers, dimers and trimers were identified (**Figure 3**). Generally, the accumulation level per seed is much lower at 18 and 21 DAF than in ripe seeds. In pre-experiments, the construct was transiently expressed in *N. benthamiana* by the co-expression of a seed-specific transcription factor (FUSCA 3) binding to elements in the USP promoter (Mönke et al., 2004). Even 6 days after treatment with agrobacteria, strong expression, distinct bands and the expected pattern are visible (Supplementary Figure S1). These two observations suggest that a certain transgene accumulation level is necessary for the intein-based protein splicing *in planta*. This level is provided by continuous promoter activity combined with stable accumulation in the ER provided by ER retention (Fiedler et al., 1997). ER retention has been proven for the accumulation of different spider silk proteins (Scheller et al., 2001; Hauptmann et al., 2013; Weichert et al., 2014). Yang et al. (2005) analyzed the accumulation of a synthetic spider silk dragline protein of 64 kDa in the apoplast, the vacuole and the ER lumen in *Arabidopsis* seeds and leaves. The highest accumulation levels have been reported for the ER lumen. The authors recommend seed-specific expression and ER targeting for plant-based spider silk protein expression as a result of their *Arabidopsis* experiments. We show here that this also holds true for spider silk multimers of native size in tobacco seeds. One of the goals of the experiments presented here was to test whether spider silk multimers in seeds are stable at room temperature without a decline in protein accumulation and without a change in the multimerization pattern. 
The data presented here show stability in the amount and multimerization pattern for 8 weeks of storage at 15°C and 49% humidity. In addition, long-term storage of T1 seeds for 1 year at these conditions resulted in a completely identical size distribution of the multimers and clear bands; thus, no indications of proteolysis were found. Further work should include the development of transgenic lines in protein-rich seeds. Here, the suitability of the USP promoter and ER retention has already been proven (Zimmermann et al., 2009). The high stability in seeds is a major advantage for the development of a suitable downstream process.

#### AUTHOR CONTRIBUTIONS

Conceived and designed the experiments: NW, VH, and UC. Plasmid construction and transient tests: VH. Performed all other experiments: NW and CH. Analyzed the data: NW, CH, and UC. Wrote the paper: NW, VH, and UC.

#### FUNDING

This research was funded by grant FKZ 22037511 of the Fachagentur Nachwachsende Rohstoffe e. V., supported by the Federal Ministry of Food and Agriculture, Germany.

#### ACKNOWLEDGMENTS

We gratefully acknowledge the excellent technical assistance of Ingrid Pfort and Elisabeth Nagel, and we thank Heike Ernst for help with the photographic images. Further, we acknowledge all members of the International Society for Plant Molecular Farming (ISPMF) for inspiring us and the exciting discussions about the production of spider silks.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00006




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Weichert, Hauptmann, Helmold and Conrad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Transient Expression of Secretory IgA *In Planta* is Optimal Using a Multi-Gene Vector and may be Further Enhanced by Improving Joining Chain Incorporation

*Lotte B. Westerhof\*, Ruud H. P. Wilbers, Debbie R. van Raaij, Christina Z. van Wijk, Aska Goverse, Jaap Bakker and Arjen Schots*

*Laboratory of Nematology, Plant Science Group, Wageningen University, Wageningen, Netherlands*

#### *Edited by:*

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Stefan Schillberg, Fraunhofer IME, Germany Marcello Donini, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

*\*Correspondence:*

*Lotte B. Westerhof lotte.westerhof@wur.nl*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 13 October 2015 Accepted: 14 December 2015 Published: 11 January 2016*

#### *Citation:*

*Westerhof LB, Wilbers RHP, van Raaij DR, van Wijk CZ, Goverse A, Bakker J and Schots A (2016) Transient Expression of Secretory IgA In Planta is Optimal Using a Multi-Gene Vector and may be Further Enhanced by Improving Joining Chain Incorporation. Front. Plant Sci. 6:1200. doi: 10.3389/fpls.2015.01200*

Secretory IgA (sIgA) is a crucial antibody in host defense at mucosal surfaces. It is a promising antibody isotype in a variety of therapeutic settings such as passive vaccination and treatment of inflammatory disorders. However, heterologous production of this heteromultimeric protein complex is still suboptimal. The challenge is the coordinated expression of the four required polypeptides (the alpha heavy chain, the light chain, the joining chain, and the part of the polymeric Ig receptor called the secretory component) in a 4:4:1:1 ratio. We evaluated the transient expression of three sIgAκ variants, harboring the heavy chain isotype α1, α2m1, or α2m2, of the clinical antibody Ustekinumab *in planta*. Ustekinumab is directed against the p40 subunit that is shared by the pro-inflammatory cytokines interleukin (IL)-12 and IL-23. A sIgA variant of this antibody may enable localized treatment of inflammatory bowel disease. Of the three sIgA variants, sIgA1κ gave the highest yield, reaching up to 373 μg sIgA/mg total soluble protein. The use of a multi-cassette vector containing all four expression cassettes was most efficient. However, the incorporation of the joining chain, rather than the expression strategy, turned out to be the limiting step for sIgA production. Our data demonstrate that transient expression *in planta* is suitable for the economic production of heteromultimeric protein complexes such as sIgA.

Keywords: secretory IgA, plant-based expression, Ustekinumab, *N*-glycosylation, heteromultimeric, protein complex assembly, co-infiltration, multi-gene vector

# INTRODUCTION

Secretory IgA (sIgA) is the predominant antibody type in mucosal secretions of the human body and plays a key role in the first line of defense against mucosal pathogens. While human serum IgA is primarily monomeric, B cells in mucosa-associated lymphoid tissues secrete IgA in a dimeric form through incorporation of the joining chain. Dimeric (d)IgA can bind the polymeric immunoglobulin receptor on the basolateral surface of epithelial cells, after which the protein complex is transcytosed to the luminal side of the cell. Here the receptor is cleaved, and a part of the receptor called the secretory component stays associated with the protein complex, which is now referred to as secretory (s)IgA. Both dIgA and sIgA fulfill immunological roles without inducing inflammation. Mucosal antigens and/or pathogens can be bound by d/sIgA just before, during (e.g., intracellular pathogens in the epithelial cells) or after transcytosis and are thereby excluded from the mucosal tissue. On the luminal side of epithelial cells, glycans on the secretory component facilitate binding to the mucus, thereby enabling clearance of sIgA-antigen/pathogen complexes. This concept of antigen/pathogen binding and clearance that does not lead to inflammation is referred to as immune exclusion and is believed to play a role not only in combating mucosal pathogens, but also in controlling commensal bacteria. Inflammation does not develop because there is limited presence of FcαRI-positive immune cells in the mucosal area and sIgA has reduced affinity for FcαRI. In order to fulfill these roles the human body secretes 40–60 mg sIgA per kg body weight each day (Conley and Delacroix, 1987).

Two isotypes of IgA exist, IgA1 and IgA2, of which the latter occurs in two allotypes, IgA2m1 and IgA2m2. All three alpha heavy chains consist of one variable domain, three constant domains and an extended tailpiece allowing IgA to dimerise by incorporation of the joining chain. There are three major differences between these IgA variants. First, IgA1 has an extended hinge region. This extension is heavily *O*-glycosylated, which is suspected to play a conformational role (Narimatsu et al., 2010), allowing the binding of more distantly spaced antigens. Second, alpha heavy chains of isotypes 1 and 2m2 are covalently linked to their light chains via a disulphide bridge. No such linkage exists in IgA2m1, which allows the formation of an intermolecular disulphide bridge between the two light chains. Third, all variants differ from each other in the number of *N*-glycosylation sites they carry (2, 4, and 5 for IgA1, IgA2m1 and IgA2m2, respectively). The ratio in which sIgA isotypes are present depends on the mucosal area, which in turn is most likely the result of the presence of specific bacteria, as the IgA1 hinge region is sensitive to bacterial proteases (Vaerman et al., 1968; Stoop et al., 1969; Delacroix et al., 1982; Kerr, 1990).

The use of recombinant sIgA in passive mucosal immunotherapy in humans and livestock has been suggested as a good alternative to antibiotics. Several *in vivo* studies demonstrated (s)IgA's potential to locally control mucosal pathogens, such as *Mycobacterium tuberculosis* in the lungs (Williams et al., 2004), *Streptococcus mutans* in the mouth (Ma et al., 1998), and *Salmonella typhimurium*, *Vibrio cholerae*, and *Cryptosporidium parvum* in the gastrointestinal tract (Enriquez and Riggs, 1998). If sIgA were directed against pro-inflammatory cytokines and administered to the gut, it might relieve symptoms and induce remission in patients suffering from inflammatory bowel diseases (IBDs). Current treatment of IBD often includes systemic application of anti-TNF-α or anti-IL-12/23 antibodies. However, application of anti-TNF-α antibodies has been associated with the onset of tuberculosis (Keane et al., 2001). Non-systemic cytokine neutralization may reduce such infection risks and other side effects. Because sIgA is stable in the gut, sIgA-based therapy could be localized. Luminal administration of an anti-TNF-α antibody was effective in mouse models of IBD (Bhol et al., 2013) and local drug administration has been suggested to be important for efficacy of IBD therapies (Neurath, 2014).

Plants are a promising production platform for pharmaceutical proteins. As eukaryotes they are able to correctly fold complex proteins and assemble protein complexes such as virus-like particles (Chen and Lai, 2013) and antibodies (De Muynck et al., 2010). Compared to mammalian cells, plants are a more economic production platform, as they do not require expensive cell culture conditions. Furthermore, the *N*-glycosylation pathway has been engineered to facilitate expression of antibodies with humanized *N*-glycans to allow effector functions (Bosch et al., 2013). Also, the engineering of mammalian mucin-type *O*-glycans has been achieved in plants (Castilho et al., 2012; Yang et al., 2012).

Plant-based expression of sIgA was achieved by stable transformation of the four individual genes required for sIgA assembly followed by crossing the highest producers (Ma et al., 1995; Paul et al., 2014). A drawback of this strategy is that it is a lengthy and laborious process. Transient expression is much faster and almost always results in higher yields, as there is no constraint on expression imposed by the site of insertion in the plant genome. To achieve transient expression of more than one gene simultaneously, *Agrobacterium* cultures harboring vectors for the expression of the individual genes need to be co-infiltrated or a multi-cassette vector facilitating expression of all genes should be used. The risk with co-infiltration is that cells may not be transformed with all genes, but use of a multi-cassette vector may be impractical if expression of the individual proteins needs to be adjusted to reflect the stoichiometry of the protein complex. Transient expression of chicken sIgA was achieved by co-infiltration (Wieland et al., 2006) and human sIgA was transiently expressed with a multi-cassette vector system (Juarez et al., 2013; Paul et al., 2014). While the above-mentioned studies achieved sIgA expression, they also showed the presence of a large proportion of monomeric IgA as well as other assembly intermediates. The presence of these assembly intermediates complicates downstream processing.

The objective of this study was to evaluate the plant-based expression and assembly of three sIgA variants of the clinical antibody Ustekinumab (CNTO1275) to unravel limitations in sIgA assembly. This antibody has specificity for the p40 subunit shared by interleukin-12 and interleukin-23 and may be used in IBD therapy. Changing the backbone from IgG to sIgA may enable local administration. First, we evaluated the transient expression of all individual genes required for sIgA assembly, whereby the three alpha heavy chain types 1, 2m1, and 2m2 were included. Next, we compared co-expression with the use of multi-cassette vectors. The use of a multi-cassette vector including all genes increased sIgA expression threefold and decreased the presence of the intermediate dIgA compared to co-infiltration. However, sIgA expression may be further optimized, because we conclude that inefficient incorporation of the joining chain limits sIgA assembly. This may be a consequence of inefficient *N*-glycosylation of the IgA tailpiece and/or joining chain. Improved *N*-glycosylation may be the key to enhance sIgA assembly and boost yield even further.

# EXPERIMENTAL PROCEDURES

#### Construct Design

GeneArt (Thermo Fisher, Bleiswijk, The Netherlands) synthesized all below-mentioned gene fragments except the constant region of human immunoglobulin alpha-2m1 (AAB59396.1), which was amplified from the human transcriptome library MegaMan (Agilent Technologies, Middelburg, The Netherlands). Before gene synthesis, undesired restriction sites were removed from the sequences of the constant domains of human immunoglobulin alpha-1 (AAC82528.1) and kappa (AAA59000.1) chains, joining chain (AK312014.1) and secretory component (codons 1–764 of the polymeric immunoglobulin receptor; AAB23176.1). To obtain the sequence for the constant domains of human immunoglobulin alpha-2m2, the sequence of human immunoglobulin alpha-2m1 was adapted (P93S, P102R, F279Y, D296E, L319M, V326I, and V335A) based on the amino acid sequence P01877 (Uniprot). The variable regions of the clinical antibody Ustekinumab (CNTO-1275) and the signal peptide of the *Arabidopsis thaliana* chitinase gene (AAM10081.1) were recoded from the amino acid sequence using codons preferred by *Nicotiana benthamiana*. For subsequent cloning and assembly of the alpha heavy chain genes, the gene fragments were flanked by the following restriction sites at the 5′- and 3′-end: NcoI-EagI, EagI-NheI, NheI-KpnI, for the signal peptide, the heavy chain variable and alpha heavy chain constant regions, respectively. For subsequent cloning and assembly of the kappa chain, the gene fragments were flanked by the following restriction sites at the 5′- and 3′-end: NcoI-EagI, EagI-BsiWI, BsiWI-KpnI for the signal peptide, the kappa chain variable and constant region, respectively. For subsequent cloning of the joining chain and secretory component the sequences were flanked by NcoI and KpnI at the 5′- and 3′-end, respectively. None of the restriction sites used introduced extra amino acids except NcoI, which in some cases introduced an extra alanine after the start methionine.
Genes were ligated into the shuttle vector pRAPa, a pRAP (or pUCAP35S) derivative (van Engelen et al., 1994) modified to include an AsiSI restriction site by introduction of the self-annealed oligo 5′-AGCTGGCGATCGCC-3′ into a HindIII-linearized pRAP. In pRAPa all open reading frames are placed under the control of the 35S promoter of the *Cauliflower mosaic virus* with duplicated enhancer (d35S) and the *Agrobacterium tumefaciens* nopaline synthase transcription terminator (Tnos). A 5′ leader sequence of the Alfalfa mosaic virus RNA 4 (AlMV) is also included between the promoter and gene to boost translation. From pRAPa the expression cassettes were digested with AscI and PacI, confirmed by sequencing, and ligated into the expression vector pHYG (Westerhof et al., 2012). Use of the restriction sites AscI and AsiSI allowed subsequent introduction of expression cassettes as AsiSI creates the same overhang as PacI (**Figure 1B**). Three multi-cassette vectors were generated: one that combined the alpha chain-1 and the light chain, one that combined the alpha chain-1, the light chain and the joining chain, and one that combined the alpha chain-1, the light chain, the joining chain and the secretory component. Expression vectors
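The design of the self-annealed oligo can be checked with a few lines of code. The sketch below is an illustration only and not part of the original methods: it assumes the standard recognition sequence of AsiSI (GCGATCGC) and the 5′-AGCT overhang left by HindIII, and the function names are hypothetical.

```python
# Recognition sequence of AsiSI and the 5' overhang left on a HindIII-cut
# vector. Both are standard values, not taken from the article itself.
ASISI_SITE = "GCGATCGC"
HINDIII_OVERHANG = "AGCT"

OLIGO = "AGCTGGCGATCGCC"  # oligo sequence as given in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def describe_self_annealed_oligo(oligo: str, overhang_len: int = 4) -> dict:
    """Split an oligo into a 5' overhang and a duplex core, then check both."""
    overhang, core = oligo[:overhang_len], oligo[overhang_len:]
    return {
        "overhang": overhang,
        "core": core,
        # A core equal to its own reverse complement can pair with a second
        # copy of the same oligo, which is what self-annealing requires.
        "core_is_palindromic": core == reverse_complement(core),
        "contains_asisi": ASISI_SITE in oligo,
        "hindiii_compatible": overhang == HINDIII_OVERHANG,
    }


if __name__ == "__main__":
    for key, value in describe_self_annealed_oligo(OLIGO).items():
        print(f"{key}: {value}")
```

Under these assumptions the ten 3′-terminal nucleotides form a palindromic duplex carrying the AsiSI site, while the four 5′-terminal nucleotides remain single-stranded and match the HindIII overhang of the linearized vector.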

#### Transient Plant Transformation

*Agrobacterium* clones were cultured overnight (o/n) at 28 °C in LB medium (10 g/l peptone 140, 5 g/l yeast extract, 10 g/l NaCl, pH 7.0) containing 50 μg/ml kanamycin and 20 μg/ml rifampicin. The optical density (OD) of the o/n cultures was measured at 600 nm and used to inoculate 50 ml of LB medium containing 200 μM acetosyringone and 50 μg/ml kanamycin with *x* μl of culture using the following formula: *x* = 80000/(1028 × OD). OD was measured again after 16 h and the bacterial cultures were centrifuged for 15 min at 2800 × *g*. The bacteria were resuspended in MMA infiltration medium (20 g/l sucrose, 5 g/l MS-salts, 1.95 g/l MES, pH 5.6) containing 200 μM acetosyringone. For co-expression of genes, *Agrobacterium* cultures harboring expression vectors for individual gene expression were mixed prior to infiltration or a multi-cassette vector was used. The final OD of each *Agrobacterium* culture in an infiltration mix was 0.5 unless indicated otherwise. The total OD of the infiltration mix was kept the same within an experiment by using an *Agrobacterium* culture harboring an empty vector if needed. The Tomato bushy stunt virus (TBSV) silencing inhibitor p19 was always co-expressed (Voinnet et al., 2003). After 1–2 h incubation of the infiltration mix at room temperature, the two youngest fully expanded leaves of 5–6-week-old *Nicotiana benthamiana* plants were infiltrated completely. Infiltration was performed by injecting the *Agrobacterium* suspension into a *Nicotiana benthamiana* leaf at the abaxial side using a needleless 1 ml syringe. Infiltrated plants were maintained in a controlled greenhouse compartment (UNIFARM, Wageningen) and infiltrated leaves were harvested at selected time points.
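The inoculation formula above is simple arithmetic, and a small helper makes the dependence on the overnight OD explicit. The sketch below only restates *x* = 80000/(1028 × OD); the function name and the example OD value are hypothetical.

```python
def inoculum_volume_ul(overnight_od600: float) -> float:
    """Volume (in μl) of overnight culture used to inoculate 50 ml of LB.

    Implements x = 80000 / (1028 * OD) as stated in the text.
    """
    if overnight_od600 <= 0:
        raise ValueError("OD600 must be a positive number")
    return 80000 / (1028 * overnight_od600)


if __name__ == "__main__":
    example_od = 2.0  # hypothetical overnight reading, for illustration only
    volume = inoculum_volume_ul(example_od)
    print(f"OD600 {example_od}: inoculate with {volume:.1f} μl")
```

Because the volume is inversely proportional to the measured OD, a denser overnight culture contributes a smaller inoculum, so every 50 ml culture starts from roughly the same number of bacteria.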

#### Total Soluble Protein Extraction

Leaf disks were taken from fully infiltrated leaves and immediately snap-frozen. Plant material was ground first in liquid nitrogen and then in 2 ml ice-cold extraction buffer [50 mM phosphate-buffered saline (PBS) pH 7.4, 100 mM NaCl, 10 mM ethylenediaminetetraacetic acid (EDTA), 0.1% v/v Tween-20, 2% w/v immobilized polyvinylpolypyrrolidone (PVPP)] per g fresh weight using a TissueLyser II (Qiagen, Venlo, The Netherlands). Crude extract was clarified by centrifugation at 16,000 × *g* for 5 min at 4 °C.

#### IgA and sIgA Quantification

IgA and sIgA concentrations in crude extracts were determined by sandwich ELISA. ELISA plates (Greiner Bio One; Alphen aan den Rijn, The Netherlands) were coated o/n at 4 °C in a moist environment with goat polyclonal anti-human kappa antibody (Sigma–Aldrich; Zwijndrecht, The Netherlands) in coating buffer (eBioscience, Vienna, Austria). After this and each following step the plate was washed five times with 30 s intervals in PBST (1x PBS, 0.05% Tween-20) using an automatic plate washer model 1575 (BioRad; Veenendaal, The Netherlands). The plate was blocked with assay diluent (eBioscience) for 1 h at room temperature. Samples and a standard curve were loaded in serial dilutions and incubated for 1 h at room temperature. For IgA determination, recombinant human IgA1κ (InvivoGen; Toulouse, France) was used as a standard in a twofold dilution series from 100 to 0.31 ng/ml in assay diluent. For sIgA determination, colostrum-purified sIgA (Sigma–Aldrich) was used as a standard in a twofold dilution series from 1000 to 3.1 ng/ml in assay diluent. Thereafter, an HRP-conjugated goat polyclonal antibody directed against the constant domains of human IgA (Sigma–Aldrich) or a biotinylated goat polyclonal antibody directed against the human secretory component (Sigma–Aldrich) was used for detection of IgA and sIgA, respectively. Avidin-HRP conjugate (eBioscience) was used to bind the biotin of the anti-secretory component antibody. 3,3′,5,5′-Tetramethylbenzidine (TMB) substrate (eBioscience) was added and the coloring reaction was stopped using 0.18 M sulphuric acid after 1–30 min. OD readouts were performed using the model 680 microplate reader (BioRad) at 450 nm with a correction filter of 690 nm. For sample comparison the total soluble protein (TSP) concentration was determined using the BCA Protein Assay Kit (Pierce) according to the supplier's protocol using bovine serum albumin (BSA) as a standard.
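Yields in this study are reported as μg antibody per mg TSP, which combines the ELISA readout with the BCA measurement. The sketch below shows that unit conversion under stated assumptions: the ELISA concentration is read from the standard curve in ng/ml for a diluted sample, and the TSP concentration refers to the undiluted extract. The function name and the example numbers are hypothetical and do not come from the article.

```python
def yield_ug_per_mg_tsp(
    elisa_ng_per_ml: float,
    dilution_factor: float,
    tsp_mg_per_ml: float,
) -> float:
    """Convert an ELISA reading of a diluted sample into μg antibody per mg TSP."""
    if tsp_mg_per_ml <= 0:
        raise ValueError("TSP concentration must be positive")
    # Back-calculate the concentration in the undiluted extract, ng/ml -> μg/ml.
    antibody_ug_per_ml = elisa_ng_per_ml * dilution_factor / 1000.0
    # Normalize to the total soluble protein in the same extract.
    return antibody_ug_per_ml / tsp_mg_per_ml


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the arithmetic.
    result = yield_ug_per_mg_tsp(
        elisa_ng_per_ml=200.0, dilution_factor=1000, tsp_mg_per_ml=2.0
    )
    print(f"{result:.1f} μg/mg TSP")
```

Normalizing to TSP makes samples comparable even when extraction efficiency differs between leaves.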

#### Protein Analysis by Western Blot

For western blot analysis, clarified protein extracts were desalted using a Sephadex G25 (VWR International; Amsterdam, The Netherlands) column prior to BCA analysis. One microgram (unless otherwise indicated) of TSP was separated under reducing or non-reducing conditions by SDS-PAGE on in-house-made 6 or 12% Bis-Tris gels. Recombinant IgA1κ (InvivoGen) and/or colostrum-purified sIgA (Sigma–Aldrich) or recombinant joining chain (Sino Biological; Cologne, Germany) were used as controls. Proteins were transferred to an Invitrolon™ PVDF membrane (Invitrogen) by a wet blotting procedure (Life Technologies; Bleiswijk, The Netherlands). Thereafter the membrane was blocked in PBST-BL (PBS containing 0.1% v/v Tween-20 and 5% w/v non-fat dry milk powder) for 1 h at room temperature, followed by overnight incubation with a goat anti-human kappa (Sigma–Aldrich), goat anti-human immunoglobulin alpha (Sigma–Aldrich), goat anti-joining chain (Nordic Immunological Laboratories; Tilburg, The Netherlands) or goat anti-secretory component (Sigma–Aldrich) antibody. The membrane was washed five times in PBST (PBS containing 0.1% v/v Tween-20). Thereafter, an HRP-conjugated anti-goat IgG antibody (Jackson ImmunoResearch; Suffolk, UK) was incubated with the membrane, after which the washing steps were repeated. The SuperSignal West Dura substrate (Thermo Fisher Scientific; Etten-Leur, The Netherlands) was used for visualization. Pictures were taken using a G:BOX Chemi System device (SynGene; Cambridge, UK).

# RESULTS

#### Expression of Individual Components Required for sIgA Assembly

In order to achieve *in vivo* assembly of a heteromultimeric protein complex, all required genes should be expressed simultaneously in the same cell and preferably at the right stoichiometry. The sIgA protein complex comprises four proteins. An overview of the individual proteins and the protein complexes IgA, dimeric (d)IgA and sIgA based on alpha heavy chain isotype 1 is given in **Figure 1A**. Expression cassettes and the combinations thereof to achieve expression of these protein complexes are given in **Figure 1B**. To evaluate the level and course of expression of the individual genes we first expressed them individually and monitored expression over time [3, 6, and 9 days post infiltration (dpi)] using western blot analysis (**Figure 2**). For the alpha heavy chain the isotypes 1, 2m1, and 2m2 were evaluated. Upon expression of the alpha heavy chains and the kappa light chain, bands were detected at the expected sizes that are assumed to represent the intact proteins. Furthermore, for all heavy chains several bands *>*100 kDa and a few faint bands *<*50 kDa were observed, which most likely represent multimers and products of proteolytic degradation, respectively. Upon expression of the secretory component a band was detected that migrates ∼10 kDa lower than the secretory component of the sIgA control. This may be explained by a difference in the number and/or type of *N*-glycans received by the secretory component when expressed in plants. The secretory component has seven confirmed glycosylation sites, but these may not (all) be glycosylated in plants. Furthermore, the most common *N*-glycans of plant-secreted proteins are 0.7–1.1 kDa smaller than most typical *N*-glycans found on human secretory component (Royle et al., 2003), which could already account for ∼10 kDa difference in protein size. Upon expression of the joining chain many bands were detected, but all migrate higher than expected for a single joining chain (15.6–17.1 kDa, depending on glycosylation) and most of them likely represent dimers/multimers. The *E. coli*-produced recombinant joining chain (expected size ∼17 kDa) also displays aberrant migration behavior and migrates around 20 kDa.

FIGURE 1 | Overview of sIgA assembly and expression. (A) Individual proteins, alpha heavy chain 1 (α1), 2m1 (α2m1), or 2m2 (α2m2), kappa chain (κ), joining chain (JC), and secretory component (SC), and protein complexes IgA1κ, dimeric (d)IgA1κ and secretory (s)IgA1κ. (B) Expression cassettes with 35S promoter of the *Cauliflower mosaic virus* with duplicated enhancer (d35S), *Agrobacterium tumefaciens* nopaline synthase transcription terminator (Tnos), translational leader sequence of the Alfalfa mosaic virus RNA 4 (AlVL) and AscI, AsiSI and PacI restriction sites are indicated. Co-expression is required for IgA, dIgA, and sIgA assembly as indicated by the curly braces.

The course of expression of the individual proteins is similar and peaks at 6 dpi, except for alpha heavy chain 2m2, which peaks at 9 dpi. However, yields of the individual proteins vary. By comparing the band intensity of each individual protein to the recombinant controls we estimate yields between 1 and 5 μg/mg TSP for the alpha heavy chains, between 5 and 20 μg/mg TSP for the joining chain and between 50 and 200 μg/mg TSP for the kappa chain and the secretory component at 6 dpi. Yield estimation of the heavy chains and joining chain is solely based on the intact monomeric proteins and is most likely an underestimate due to the presence of multimers. Nonetheless, as the stoichiometric ratio between the heavy chain, the kappa chain, the joining chain and the secretory component is 4:4:1:1, we assume heavy chain expression to be the limiting factor for sIgA assembly if individual proteins are not stabilized upon co-expression.
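The reasoning that the heavy chain is limiting can be made explicit by converting the estimated mass yields into the number of complete complexes each component could support. The sketch below is an illustration under stated assumptions: polypeptide masses are rounded from sizes mentioned in the text for the alpha chain (∼50 kDa), the kappa chain (∼25 kDa) and the joining chain (∼16 kDa), whereas the 70 kDa value for the secretory component is an assumption made only for this example. All names are hypothetical.

```python
# Copies of each polypeptide per sIgA complex (4:4:1:1, as stated in the text).
COPIES = {"alpha": 4, "kappa": 4, "joining": 1, "secretory": 1}

# Approximate polypeptide masses in kDa. The secretory component value is an
# assumption for illustration; the other values are rounded from the text.
MASS_KDA = {"alpha": 50.0, "kappa": 25.0, "joining": 16.0, "secretory": 70.0}

# Estimated yield ranges at 6 dpi in μg/mg TSP (low, high), from the text.
YIELD_UG_PER_MG = {
    "alpha": (1.0, 5.0),
    "kappa": (50.0, 200.0),
    "joining": (5.0, 20.0),
    "secretory": (50.0, 200.0),
}


def complexes_supported(yield_ug_per_mg: float, mass_kda: float, copies: int) -> float:
    """nmol of complete sIgA per mg TSP that one component alone could supply.

    Dividing μg by kDa gives nmol; dividing by the copy number gives complexes.
    """
    return yield_ug_per_mg / mass_kda / copies


def limiting_component(yields: dict) -> str:
    """Name of the component that supports the fewest complete complexes."""
    return min(
        yields,
        key=lambda name: complexes_supported(yields[name], MASS_KDA[name], COPIES[name]),
    )


if __name__ == "__main__":
    low = {name: bounds[0] for name, bounds in YIELD_UG_PER_MG.items()}
    high = {name: bounds[1] for name, bounds in YIELD_UG_PER_MG.items()}
    print("limiting at low estimates:", limiting_component(low))
    print("limiting at high estimates:", limiting_component(high))
```

Under these assumptions the alpha heavy chain supports the fewest complexes at both ends of the estimated ranges, which matches the expectation stated above for the case without stabilization upon co-expression.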

#### A Multi-Cassette Vector is Most Efficient for Transient Expression of sIgA

Co-expression in transient transformation can be achieved in two ways, either by co-infiltration of *Agrobacterium* cultures harboring a vector for each individual gene or by using a multi-cassette vector facilitating expression of all genes. Which strategy would lead to the best co-ordinated expression is unclear. With co-infiltration it is possible that not all cells are transformed with all expression cassettes. The use of a multi-cassette vector would ensure that a transformed cell receives all genes; however, transformation may be less efficient due to the larger size of the T-DNA. Therefore we used both strategies and combinations thereof to express all genes needed for sIgA complex formation. We successfully constructed several multi-cassette vectors for expression of IgA1κ (alpha heavy chain 1 and kappa chain), dIgA1κ (alpha heavy chain 1, kappa chain and joining chain) and sIgA1κ (all four genes), whereby all genes are under control of the same promoter and terminator (**Figure 1B**). Subsequently, sIgA was expressed by co-infiltration of all genes individually (4-vector system), co-infiltration of the secretory component and the joining chain with the multi-cassette vector for IgA1κ expression (3-vector system), co-infiltration of the secretory component and the multi-cassette vector for dIgA1κ expression (2-vector system) and the infiltration of the multi-cassette vector for sIgA1κ expression (1-vector system; **Figure 3B**). **Figure 3A** shows the average sIgA yield of three biological replicates as determined by ELISA. To correct for the lower OD of the final *Agrobacterium* infiltration mix of the 3-, 2-, and 1-vector systems compared to the 4-vector system, an *Agrobacterium* culture carrying an empty vector (EV) was used to increase the OD (gray bars) or the concentration of the *Agrobacterium* carrying the multi-cassette vector was increased (dark gray bars). In both situations the yield is similar between the 4-, 3-, and 2-vector systems. Surprisingly, however, the use of the 1-vector system increases yield twofold to threefold.
In the situation where the OD of the *Agrobacterium* cultures was supplemented with EV culture (gray bars), the twofold yield increase may be explained by an increased number of cells that receive all genes. In the case where we compensated the OD of the *Agrobacterium* cultures with cultures harboring the multi-cassette vector (dark gray bars) we were able to enhance sIgA yield 1.6-fold further. Use of a higher OD of an *Agrobacterium* culture often increases yield, as more T-DNA copies are transferred to the plant cell. It is noteworthy that, although the same promoter and terminator sequences were used in our 1-vector system to facilitate expression of all genes, loss of vector parts or loss of sIgA expression upon plant transformation was never observed. We therefore assume that our multi-cassette expression vectors are stable and that recombination did not occur. These data suggest that the use of a multi-cassette vector is the most efficient strategy for transient expression of heteromultimeric protein complexes.
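The OD compensation between the vector systems can be written out as a short calculation. The sketch below is a minimal illustration and not the authors' protocol: it assumes each gene-carrying culture is used at the final OD of 0.5 given in the methods, that the always co-expressed p19 culture is left out of the count, and that in the second strategy the whole OD gap is added to the multi-cassette culture. The function and dictionary names are hypothetical.

```python
PER_CULTURE_OD = 0.5  # final OD of each culture in the mix, from the methods

# Number of gene-carrying Agrobacterium cultures per strategy
# (the p19 culture is not counted here; this is an assumption).
SYSTEMS = {"4-vector": 4, "3-vector": 3, "2-vector": 2, "1-vector": 1}


def compensation_od(
    n_cultures: int,
    reference_cultures: int = 4,
    per_culture_od: float = PER_CULTURE_OD,
) -> float:
    """OD that must be added to match the total OD of the reference system."""
    if not 1 <= n_cultures <= reference_cultures:
        raise ValueError("n_cultures must lie between 1 and reference_cultures")
    return (reference_cultures - n_cultures) * per_culture_od


if __name__ == "__main__":
    for name, n_cultures in SYSTEMS.items():
        gap = compensation_od(n_cultures)
        print(
            f"{name}: add EV culture to OD {gap:.1f}, "
            f"or raise the multi-cassette culture to OD {PER_CULTURE_OD + gap:.1f}"
        )
```

For the 4-vector system the gap is zero, so no compensation is needed; the second option only applies to the systems that contain a multi-cassette vector.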

#### Co-Expression of Alpha Heavy Chain and Kappa Chain Stabilizes Both Proteins

To determine the limiting factor in sIgA assembly we first evaluated the efficiency of IgA assembly in the absence of the joining chain and secretory component. To this end, we co-expressed the alpha heavy chains and the kappa chain using the dual-cassette expression vectors for all three IgA variants and compared this with the individual expression of the alpha heavy chains and kappa chains. Leaf extracts were analyzed by western blot under reducing and non-reducing conditions (**Figure 4**). Visualization of the alpha heavy chains and kappa chain under reducing conditions demonstrated that all proteins accumulate to a higher level upon co-expression. As the accumulation of the alpha heavy chains increases upon co-expression with the light chain, the presence of a degradation product just above 25 kDa becomes clear. Considering the size of this degradation product it is most likely the result of cleavage in the hinge region. All three alpha heavy chains seem sensitive to proteolysis of the hinge region.

While analyzing co-expression of alpha heavy chains with kappa light chain under non-reducing conditions, a band around the expected size for IgA (∼150 kDa) was detected on blots treated with either anti-alpha heavy chain or anti-kappa chain specific antibodies. We therefore assumed that this band represents intact IgA complex. Next to the 150 kDa band, several bands migrating *>*250 kDa were detected. As these bands were also seen in the recombinant IgA control we assume they represent dimers/multimers of IgA. Also, several bands *<*150 kDa were detected, which may represent assembly intermediates, degradation products and/or individual polypeptide chains. Proteolytic cleavage in the hinge region results in Fc and Fab fragments of the same size as an intact alpha heavy chain (∼50 kDa). A band of 50 kDa is clearly detected on both the alpha heavy chain and the kappa chain specific blots and therefore most likely represents Fab fragments. Two bands only detected on the kappa chain specific blot just below 25 and 50 kDa most likely represent un-associated monomeric and dimeric kappa chains. Assuming no un-associated alpha heavy chain is present, we conclude that accumulation of the alpha heavy chain is the limiting factor for IgA yield, despite the fact that the alpha heavy chains stabilize upon co-expression with the kappa chain.

#### IgA Dimerization is the Limiting Step in sIgA Assembly

Next we evaluated the efficiency of sIgA assembly. We used the multi-cassette vectors to express IgA, dIgA, and sIgA (**Figure 3B**) with all three alpha heavy chain variants. Leaf extracts were analyzed by western blot under non-reducing conditions (**Figure 5**). Upon dIgA1, dIgA2m1, and dIgA2m2 expression two bands around 300 and *>*420 kDa were detected on the joining chain specific blot (third panel from the top). These bands can also be seen on the alpha heavy chain and kappa chain specific blots (first and second panel, respectively). We assume that these bands represent dIgA and multimerized (d)IgA. The 150 kDa band representing monomeric IgA in the alpha heavy chain and kappa chain specific blots is still present upon co-expression of the joining chain. This implies that dimerization of IgA is not 100% efficient or that the expression of the joining chain is limiting. Free joining chain was not detected.

Upon sIgA1 expression monomeric IgA (∼150 kDa) is still detected on the alpha heavy chain and kappa chain specific blots. Next to that, bands of *<*100, ∼220, ∼400 and *>*420 kDa were detected on the secretory component specific blot (bottom panel). Because the band *<*100 kDa was only detected on the secretory component specific blot, this band most likely represents free secretory component. The 220 kDa band was also detected on the alpha heavy chain and kappa chain specific blots, but not on the joining chain specific blot, and therefore most likely represents secretory component associated with monomeric IgA. Both the ∼400 and *>*420 kDa bands were also detected on the alpha heavy chain, kappa chain and joining chain blots and therefore must represent sIgA and multimers of (d/s)IgA. Upon sIgA2m1 and sIgA2m2 expression only the band *<*100 kDa representing free secretory component and the band *>*420 kDa representing multimeric sIgA were clearly distinguished. It may be that alpha heavy chains 2m1 and 2m2 are more inclined to multimerization. No dIgA was detected upon expression of all four genes using any of the alpha heavy chains. Apparently sIgA assembly is equally efficient for all three alpha heavy chains and the expression of the secretory component is not limiting. The latter can also be concluded from the ample presence of free secretory component in the sIgA samples.
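The band assignments above follow a simple rule: a band is interpreted from the set of chain-specific blots on which it appears. The sketch below encodes that rule as a lookup table. It is an illustration of the reasoning only, the labels and function name are hypothetical, and band size is deliberately ignored even though it was also used in the interpretation above.

```python
# Interpretation of a band from the set of chain-specific blots on which it
# is detected ("alpha", "kappa", "JC" = joining chain, "SC" = secretory component).
BLOT_PATTERNS = {
    frozenset({"SC"}): "free secretory component",
    frozenset({"kappa"}): "un-associated kappa chain",
    frozenset({"alpha", "kappa"}): "IgA without joining chain or secretory component",
    frozenset({"alpha", "kappa", "SC"}): "secretory component associated with monomeric IgA",
    frozenset({"alpha", "kappa", "JC"}): "dIgA or multimerized (d)IgA",
    frozenset({"alpha", "kappa", "JC", "SC"}): "sIgA or multimers of (d/s)IgA",
}


def interpret_band(detected_on) -> str:
    """Return the most likely identity of a band given the blots that detect it."""
    return BLOT_PATTERNS.get(frozenset(detected_on), "unassigned")


if __name__ == "__main__":
    # Bands described for sIgA1 expression in the text.
    observed = {
        "~150 kDa": {"alpha", "kappa"},
        "<100 kDa": {"SC"},
        "~220 kDa": {"alpha", "kappa", "SC"},
        "~400 kDa": {"alpha", "kappa", "JC", "SC"},
    }
    for size, blots in observed.items():
        print(f"{size}: {interpret_band(blots)}")
```

Patterns that are not in the table are reported as unassigned rather than guessed, which mirrors the cautious wording used for ambiguous bands.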

Both IgA and sIgA yield were determined with a sandwich ELISA using an anti-light chain capture antibody and an anti-alpha chain or anti-secretory component detection antibody, respectively. Although we speak of IgA and sIgA yield, it should be noted that IgA and sIgA intermediates may also be detected. While yields of the IgA variants ranged between 10 and 40 μg/mg TSP, sIgA yield was at least ninefold higher for all variants (**Figure 6**). The higher sIgA yield, compared to IgA yield, may be explained by the fact that sIgA assembly prevents proteolytic degradation of IgA. However, sIgA yield may be somewhat overestimated due to the presence of monomeric IgA associated with the secretory component.

These western blotting and ELISA results show that dimerization of IgA is the limiting step for sIgA assembly. Improving dIgA assembly could further increase sIgA yield and reduce the presence of assembly intermediates, thereby simplifying downstream processing.

#### Joining Chain Incorporation is the Limiting Factor for sIgA Yield

Next, we investigated whether joining chain expression is the limiting factor for sIgA assembly. To this end, we attempted to increase the expression of the joining chain by increasing the OD of the *Agrobacterium* culture harboring the vector for joining chain expression using the 3-vector system. sIgA yield was determined by sandwich ELISA and joining chain expression was evaluated using western blot analysis (**Figures 7A,B**). While the expression of the joining chain increased by using a higher OD of the *Agrobacterium* culture harboring the vector for joining chain expression, sIgA yield did not increase. We therefore assume that it is not the expression of the joining chain, but its incorporation in the dIgA complex, that is the limiting factor for sIgA yield.

On a side note, the band that most likely represents monomeric joining chain (∼25 kDa) appears to be a doublet (two bands migrating very close to each other). A doublet may represent the same protein with a different number of *N*-glycans. Because the joining chain harbors only one *N*-glycosylation site, this doublet should represent non-glycosylated and glycosylated joining chain.

To evaluate if lowering IgA expression would influence the IgA:sIgA ratio, we also attempted to decrease the amount of IgA by reducing the OD of the *Agrobacterium* culture harboring the vector for IgA expression using the 3-vector system. Again, sIgA yield was determined by sandwich ELISA and sIgA assembly was evaluated using western blot analysis (**Figures 7C,D**). Upon reduction of the OD of the *Agrobacterium* culture harboring the vector for IgA expression, the amount of monomeric IgA decreases, but does not disappear. Also, sIgA yield is reduced when the OD of the *Agrobacterium* culture that harbors the vector for IgA expression becomes lower than 0.2. Thus, the IgA:sIgA ratio does not change by lowering the expression of IgA. In other words, sIgA assembly cannot be improved by adjusting the expression of its individual components. It is therefore likely that not the capacity of the plant cell to assemble the protein complex, but intrinsic properties of the individual proteins determine dIgA assembly efficiency.

Moreover, we also observed a significant proportion of dIgA when using ODs of 0.4 and 0.5 for IgA expression. In the previous results section we concluded that secretory component expression and association with dIgA was not limiting, as we hardly observed dIgA for any of the sIgA variants upon secretory component co-expression (**Figure 5**). However, the experiment described in the previous section was performed with the 1-vector system and the experiment described in this section was performed with the 3-vector system. This suggests that a significant proportion of cells does not receive the secretory component expression cassette upon co-infiltration or that at least the expression of the secretory component may vary from cell to cell.

# DISCUSSION

We have studied the plant-based expression and assembly of three sIgA variants of the clinical antibody Ustekinumab (CNTO1275). We focussed on transient expression in *N. benthamiana*, as transient expression often yields more protein compared to stable transformation. Because sIgA is a heteromultimeric protein complex, transient expression can be achieved in several ways. *Agrobacterium* cultures, each harboring an expression vector for one of the individual components, can be co-infiltrated, or a multi-cassette expression vector facilitating expression of all four components can be used. Also, a combination of these two strategies may be adopted. The risk with co-infiltration is that a proportion of the plant cells will not be transformed with all genes. This may result in the presence of un-associated components or assembly intermediates that may complicate downstream processing. Use of a multi-cassette vector would ensure that each transformed cell expresses each gene; however, the much larger T-DNA may be less efficiently transferred into the plant cells.

We co-infiltrated all genes individually (4-vector system), co-infiltrated the secretory component and joining chain with the multi-cassette vector for IgA1κ expression (3-vector system), co-infiltrated the secretory component and the multi-cassette vector for dIgA1κ expression (2-vector system) and finally also used a single multi-cassette vector for sIgA1κ expression (1-vector system). While sIgA yield was similar between the 4-, 3-, and 2-vector systems, surprisingly the use of the 1-vector system increased yield twofold. This yield increase may be explained by an increased number of cells that receive and express all genes. This hypothesis is supported by the fact that the presence of dimeric IgA was more dominant using the 3-vector system compared to the 1-vector system. This suggests that the secretory component is not expressed in all cells when using a separate vector for its expression. However, if transformation efficiency were a yield-limiting factor for sIgA, a yield increase would be expected every time the number of expression vectors is reduced. As mentioned, the 4-, 3-, and 2-vector systems yielded a similar amount of sIgA. Perhaps the T-DNA containing the gene for the secretory component is not efficiently transferred to plant cells, while all others reach most cells even when separate vectors are used. The T-DNA containing the secretory component gene is the largest of the four (joining chain 4.6 kbp, kappa chain 4.8 kbp, alpha heavy chain 5.6 kbp and the secretory component 5.9 kbp) and transformation efficiency has been demonstrated to go down with increased insert size (Frary and Hamilton, 2001). If T-DNA size decreases transformation efficiency, it would also reduce the efficiency with which the much larger T-DNAs of the multi-cassette vectors are transferred into plant cells (IgA 7.7 kbp, dIgA 9.6 kbp and sIgA 12.8 kbp). This means that the yield increase expected each time the number of expression vectors is reduced, because more cells receive all genes, is offset by a reduced number of transformed cells. Only in the case of the 1-vector system is the yield increase higher than the yield loss due to lower transformation efficiency, because each transformed cell expresses all genes.

Next to ensuring that all genes are transferred to each transformed plant cell, the 1-vector system has another benefit. The 1-vector system also enables an increased copy number of each expression cassette. With the 4-vector system each *Agrobacterium* culture was used with an OD of 0.5, giving the final infiltration mix an OD of 2.5 (also including an *Agrobacterium* culture for expression of the viral silencing inhibitor p19 with an OD of 0.5), which we consider a maximum OD to allow efficient infiltration. With the 1-vector system an OD of 2.0 can be used before this maximum is reached. This means that the presence of each expression cassette is increased fourfold. While this may not result in a fourfold increase of each expression cassette in the plant cells due to a lower transformation efficiency of larger T-DNAs, it did increase sIgA yield 1.6-fold.
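The OD budget described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and the even split of the remaining OD over the expression cultures is an assumption that matches the stated 4-vector and 1-vector cases.

```python
# Illustrative arithmetic only; names are hypothetical, values are taken from the text.
MAX_TOTAL_OD = 2.5  # maximum OD of the infiltration mix considered efficient
P19_OD = 0.5        # OD of the culture expressing the silencing inhibitor p19


def od_per_expression_culture(n_vectors: int) -> float:
    """OD available to each expression culture, assuming the OD left after
    reserving the p19 culture is split evenly over n_vectors cultures."""
    return (MAX_TOTAL_OD - P19_OD) / n_vectors


four_vector_od = od_per_expression_culture(4)  # each cassette sits in its own culture
one_vector_od = od_per_expression_culture(1)   # all cassettes share one culture
fold_increase = one_vector_od / four_vector_od

print(four_vector_od, one_vector_od, fold_increase)
```

Under these assumptions each cassette is present at an OD of 0.5 in the 4-vector mix and at 2.0 in the 1-vector mix, which is the fourfold difference referred to in the text.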

While the 1-vector system facilitated the highest yield of our protein complex, co-infiltration may still yield more protein complex if the expression of the individual genes does not reflect the stoichiometric ratio of the protein complex. If so, the expression of individual genes may be adjusted by controlling the concentration of the *Agrobacterium* cultures to best reflect the stoichiometric ratio of the protein complex. However, if free polypeptides are found, it cannot be concluded whether these arise from unbalanced expression or are a consequence of partial transformation. Alternatively, promoters with different strengths could be used that enable the accumulation of the individual proteins in the stoichiometric ratio of sIgA and allow the use of a 1-vector system. While testing of different promoters may be very laborious, systems are arising that allow easy high-throughput cloning, such as the Golden Gate system (Engler et al., 2009).

Even when using a single multi-cassette expression vector, accumulation of assembly intermediates still occurred, with monomeric IgA as the most predominant intermediate. This is in line with four other studies on expression of murine, chicken and human sIgA in plants that all report the accumulation of a significant proportion of monomeric IgA next to sIgA (Ma et al., 1995; Wieland et al., 2006; Juarez et al., 2013; Paul et al., 2014). Follow-up studies on the expression of murine (s)IgA demonstrated that this antibody was targeted to the vacuole due to a cryptic targeting signal in the tailpiece of murine IgA (Frigerio et al., 2000; Hadlington et al., 2003). In our previous publication on expression of monomeric human IgA we demonstrated that human IgA is also poorly secreted from plant cells (Westerhof et al., 2014). The tailpieces of both human and chicken IgA contain sequences similar to the suggested cryptic targeting signal of murine IgA. Thus, it may well be that chicken and human IgA are also targeted to the vacuole. If IgA is targeted to the vacuole, it is possible that a proportion of IgA is transported to the vacuole before the joining chain can be incorporated. Unfortunately, the tailpiece of IgA cannot be removed, as the penultimate cysteine residue forms a disulphide bond with a free cysteine of the joining chain (Atkin et al., 1996). Investigating whether mutations in the cryptic vacuolar targeting signal can abolish vacuolar targeting without influencing complex assembly may provide a solution.

Reports on the expression of sIgA in Chinese hamster ovary (CHO) cells also identify dIgA assembly as the yield-limiting step in sIgA expression (Berdoz et al., 1999; Li et al., 2014). Because mammalian cells are devoid of vacuoles, vacuolar targeting cannot explain the lack of sIgA assembly. Unfortunately, neither study determined joining chain expression. Thus, it is unclear if inefficient sIgA assembly in CHO cells is caused by limited joining chain expression or inefficient incorporation of the joining chain. To evaluate if either joining chain expression or incorporation was the limiting step for the yield of our sIgA, we increased the OD of the *Agrobacterium* culture harboring the vector for joining chain expression. Increasing the *Agrobacterium* concentration facilitated increased joining chain expression. Even though joining chain expression was increased, sIgA yield was not, nor had the proportion of monomeric IgA diminished. We therefore assume that joining chain incorporation was the limiting step for sIgA assembly.

We also observed that the band assumed to represent monomeric joining chain migrated as a doublet. Because the joining chain harbors only one *N*-glycosylation site, it is possible that this doublet represents a non-glycosylated and a glycosylated version of the joining chain. It was demonstrated that incorporation of the joining chain is reduced if asparagine 48 of the joining chain or asparagine 549 of the alpha heavy chain is not glycosylated (Atkin et al., 1996; Krugmann et al., 1997). In a previous study, we already confirmed that the *N*-glycosylation of asparagine 549 of the alpha heavy chain is partial (Westerhof et al., 2014). Partial *N*-glycosylation as a reason for inefficient incorporation of the joining chain is consistent with the fact that reduced expression of IgA did not alter the IgA:sIgA ratio. We therefore hypothesize that the capacity of the plant cell to assemble dIgA is not limiting. Partial *N*-glycosylation of IgA and/or joining chain could explain inefficient sIgA assembly for both plant- and CHO cell-produced sIgA. Because *N*-glycosylation is co-translational, limited access to the *N*-glycosylation site cannot explain inefficient *N*-glycosylation. However, in a large-scale analysis of glycoproteins it was suggested that the sequence surrounding an *N*-glycosylation signal may influence *N*-glycosylation efficiency (Petrescu et al., 2004). Perhaps adaptation of the sequences surrounding the *N*-glycosylation sites in the joining chain and tailpiece of the alpha heavy chain can increase *N*-glycosylation efficiency. If this would increase sIgA assembly and reduce the presence of sIgA intermediates, it would not only increase yield, but also simplify purification procedures.

Taken together, our data suggest that plants most certainly allow the economic production of heteromultimeric protein complexes such as sIgA. The maximum yield of our sIgA1κ-Ustekinumab variant was 37% of TSP, which is well above the 1% commercial viability threshold. Here, transient expression with use of a multi-cassette expression vector is the best strategy, because it ensures expression of all genes in all transformed cells. This prevents the occurrence of assembly intermediates due to partial transformation.

## AUTHOR CONTRIBUTIONS

LW – Has had the lead in this research project (concept and design), acquisition of data, analysis and interpretation of data, drafting and revising the article, and final approval of the version to be published.

RW – Substantial contributions to concept and design, acquisition of data, analysis and interpretation of data, and writing and revising the article critically for important intellectual content and final approval of the version to be published.

DvR – Acquisition of data.

CvW – Acquisition of data.

AG – Writing and revising the article critically for important intellectual content.

JB – Writing and revising the article critically for important intellectual content.

AS – Writing and revising the article critically for important intellectual content and final approval of the version to be published.

# FUNDING

This research was financially supported in part by Synthon (Nijmegen, The Netherlands) and a grant from the Dutch Ministry of Economic Affairs (PID07124).

# ACKNOWLEDGMENTS

We would like to thank Tim Warbroek, Aleksandra Syta, and Bob Engelen for their input in the experimental work and Gerard Rouwendal for his help in designing the multi-cassette vector system.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer Marcello Donini and handling Editor Eugenio Benvenuto declared their shared affiliation, and the handling Editor states that, nevertheless, the process met the standards of a fair and objective review.

*Copyright © 2016 Westerhof, Wilbers, van Raaij, van Wijk, Goverse, Bakker and Schots. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A Plant-Produced Bacteriophage Tailspike Protein for the Control of *Salmonella*

*Sean Miletic<sup>1,2</sup>, David J. Simpson<sup>3</sup>, Christine M. Szymanski<sup>3</sup>, Michael K. Deyholos<sup>4</sup> and Rima Menassa<sup>1,2</sup>\**

*<sup>1</sup> Southern Crop Protection and Food Research Centre, Agriculture and Agri-Food Canada, London, ON, Canada, <sup>2</sup> Department of Biology, University of Western Ontario, London, ON, Canada, <sup>3</sup> Alberta Glycomics Centre and Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada, <sup>4</sup> Department of Biology, University of British Columbia, Kelowna, BC, Canada*

The receptor binding domain of the tailspike protein Gp9 from the P22 bacteriophage was recently shown to reduce *Salmonella* colonization in the chicken gut. In this study, we transiently expressed the receptor binding domain of the Gp9 tailspike protein in *Nicotiana benthamiana*, and targeted it to the endoplasmic reticulum (ER) or to the chloroplasts. Gp9 was also fused to either an elastin-like polypeptide (ELP) or hydrophobin I tag, which were previously described to improve accumulation levels of recombinant proteins. The highest levels of recombinant protein accumulation occurred when unfused Gp9 was targeted to the ER. Lower levels of chloroplast-targeted Gp9 were also detected. ELP-fused Gp9 was purified and demonstrated to bind to *Salmonella enterica* serovar Typhimurium *in vitro*. Upon oral administration of lyophilized leaves expressing Gp9-ELP to newly hatched chickens, we found that this tailspike protein has the potential to be used as a therapeutic to control *Salmonella* contamination in chickens.

Keywords: bacteriophage tailspike protein, chickens, *Nicotiana benthamiana,* plant biotechnology, *Salmonella,* transient transformation

#### *Edited by:*

*Edward Rybicki, University of Cape Town, South Africa*

#### *Reviewed by:*

*Biswapriya Biswavas Misra, University of Florida, USA; Inga Isabel Hitzeroth, University of Cape Town, South Africa; Anatoli Giritch, Nomad Bioscience GmbH, Germany*

#### *\*Correspondence: Rima Menassa rima.menassa@agr.gc.ca*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 15 October 2015; Accepted: 18 December 2015; Published: 08 January 2016*

#### *Citation:*

*Miletic S, Simpson DJ, Szymanski CM, Deyholos MK and Menassa R (2016) A Plant-Produced Bacteriophage Tailspike Protein for the Control of Salmonella. Front. Plant Sci. 6:1221. doi: 10.3389/fpls.2015.01221*

**Abbreviations:** BSA, bovine serum albumin; CFU, colony-forming units; ELP, synthetic elastin-like polypeptide; ER, endoplasmic reticulum; FLW, fresh leaf weight; GI tract, gastro-intestinal tract; HFBI, hydrophobin I from *Trichoderma reesei*; *S*. Typhimurium, *Salmonella enterica* serovar Typhimurium; *S*. Paratyphi, *Salmonella enterica* serovar Paratyphi; *S*. Enteritidis, *Salmonella enterica* serovar Enteritidis; *S.* Heidelberg, *Salmonella* Heidelberg; TSP, total soluble protein(s).

#### INTRODUCTION

The human intestine is a complex microbial ecosystem where hundreds of species of bacteria have adapted to live and grow. A mutualistic relationship has evolved benefiting the health of the host while providing an optimal habitat for microflora to thrive (Guarner and Malagelada, 2003). However, some of these bacteria are pathogenic, causing a wide array of intestinal pathologies. *Salmonella enterica* is a Gram-negative enteropathogenic bacterium that is widely prevalent and is one of the primary causes of foodborne illness in humans. There are roughly 1.4 million nontyphoidal salmonellosis cases each year in North America, causing approximately 25% of all hospitalizations due to foodborne illness (Mead et al., 1999). *S. enterica* serotype Typhimurium, also referred to as *S.* Typhimurium, causes gastroenteritis characterized by diarrhea, vomiting, and abdominal pain and is showing the emergence of multidrug-resistant strains (Su et al., 2004; Chen et al., 2013). Poultry and eggs are a major source of infection, but other sources such as vegetables, fruits, nuts, sprouts, leafy greens, roots, and beans have been reported (Rodrigue et al., 1990; Hammack, 2012). In chickens, *Salmonella* is found throughout the intestinal tract (Fanelli et al., 1971) and the rupturing of intestinal contents during evisceration can readily contaminate poultry meat. For instance, *Salmonella* has been isolated from 33% of raw chicken breasts sampled from retail grocery stores in Ontario, Canada (Cook et al., 2012).

Antibiotic use has led to the emergence of antibiotic-resistant *Salmonella* strains. In 2013, 17% of typhoidal *Salmonella* isolates from Canadians were resistant to ciprofloxacin and 41% of *S.* Heidelberg infections were resistant to at least one antibiotic<sup>1</sup>. This growing concern has provoked research into alternative methods for controlling bacterial outbreaks. Considerable research into using bacteriophage therapy to treat or prevent bacterial infections progressed in Eastern Europe and the former Soviet Union during the latter part of the 20th century and could potentially be reconsidered as a viable alternative to antibiotics (Sulakvelidze et al., 2001). Lytic bacteriophages are host-specific, self-replicating, and virtually non-toxic, making them attractive alternatives to control bacteria such as *Salmonella*, and bacteriophages have been shown to reduce *Salmonella* colonization in chickens (Goode et al., 2003; Atterbury et al., 2007). Despite these successes, this therapy is not without drawbacks. Bacteriophages are host-specific, requiring diagnosis of the pathogen before the phage is administered (Waseh et al., 2010). Phages can also carry harmful genes and can potentially transfer these genes to the bacteria, increasing virulence (Skurnik and Strauch, 2006). As a result, there has been interest in the use of phage proteins such as endolysins (Roach and Donovan, 2015) as tools for the specific targeting of bacteria and the exploitation of phage receptor binding proteins for use in diagnostics and engineered phage-derived killing machines (Singh et al., 2012; Simpson et al., 2015). Unexpectedly, Waseh et al. (2010) have demonstrated that the P22 phage tailspike protein alone is effective in controlling *Salmonella* colonization and spread in chickens, presumably through its binding capability. 
These tailspike proteins are highly stable homotrimers that form the short tail of the bacteriophage and bind to the O-antigenic repeating units on the outer membrane lipopolysaccharide (Baxa et al., 1996). The tailspike protein Gp9 from the P22 bacteriophage can recognize several serovars of *Salmonella* including *S.* Typhimurium, *S*. Paratyphi A, and *S*. Enteritidis. A shortened version of Gp9 has been shown to agglutinate *S.* Typhimurium, inhibit bacterial motility and reduce colonization in the chicken gut (Waseh et al., 2010). Therefore, this protein has the potential to act as an effective pre-slaughter feed additive to reduce *Salmonella* contamination in chickens.

Plant bioreactors have been growing in acceptance as feasible production platforms for therapeutic proteins, as they are highly scalable and can be established with little upfront cost (Fischer et al., 2012). Protein drugs expressed in plant tissue are thought to be protected from digestive enzymes by the plant cell wall (Kwon and Daniell, 2015), and are especially useful for veterinary applications where regulations allow administration of unpurified or partially purified extracts (MacDonald et al., 2015). For example, leaf tissue can be harvested, lyophilized, and orally administered in capsules or suspended in a slurry removing costs associated with protein purification, administration, and cold-storage (Kolotilin et al., 2014). As higher eukaryotic organisms, plants can introduce posttranslational modifications required for complex recombinant proteins. Despite these benefits, recombinant protein yield remains a major factor limiting the widespread adoption of plant bioreactors for commercial protein production. Consequently, several approaches are currently being used to increase protein accumulation in plants. Proteins can be targeted to different subcellular compartments such as the ER, the chloroplasts, and the apoplast using signal and transit peptides (Conley et al., 2009b). This is because each subcellular compartment has a unique biochemical environment, protease content, and physical size which influence protein accumulation levels (Streatfield, 2007; Pillay et al., 2014). Additionally, peptide tags can be fused to recombinant protein to increase accumulation. For example, fusion tags such as ELPs and HFBI can increase recombinant protein accumulation levels, and have also been used to purify proteins from plant extracts (Conley et al., 2011).

The goal of this project was to transiently produce the truncated version of Gp9 in *Nicotiana benthamiana* by targeting the protein to the chloroplasts, to the ER, or to the ER fused with an ELP or HFBI tag. The activity of plant-produced Gp9 was then tested by examining its ability to bind to *S.* Typhimurium. Lastly, plant tissue containing Gp9 was orally administered to chickens inoculated with *S.* Typhimurium to determine if this plant produced therapeutic has the potential to limit *Salmonella* colonization.

#### MATERIALS AND METHODS

#### Gene Cloning

The truncated version (encoding amino acids 109–666) of the endorhamnosidase mutant of *gp9* (as described in Waseh et al., 2010) was codon-optimized for plant expression and synthesized by Biobasic Inc. (Markham, ON, Canada). *Gp9* was recombined into the previously constructed pCaMGate expression vectors using the LR reaction of Gateway® technology (Thermo Fisher Scientific, Waltham, MA, USA), courtesy of Dr. Andrew Conley of Agriculture and Agri-Food Canada, London, Ontario. Recombinant pCaMGate vectors were transformed into *Escherichia coli* XL1-Blue using the Gene Pulser II system (Bio-Rad Laboratories Inc., Hercules, CA, USA) and PCR screening using gene-specific primers (Forward primer: CGTTAGGTGTAGGTTTTGGTATGGATGGT, Reverse primer: CCGGCAACAGGATTCAATCTTAA) was conducted to screen for positive transformants containing the correct insert. Plasmid DNA was isolated from positive colonies and transformed into electro-competent *Agrobacterium tumefaciens* EHA105 cells. Electroporated *A. tumefaciens* cells were spread on yeast extract broth (YEB) plates containing 50 µg/ml kanamycin and 10 µg/ml rifampicin and incubated for 2 days at 28°C.

<sup>1</sup>http://healthycanadians.gc.ca/alt/pdf/publications/drugs-products-medicamentsproduits/antibiotic-resistance-antibiotique/antimicrobial-surveillance-antimicro bioresistance-eng.pdf

#### Transient Expression in *N. benthamiana* Plants

Suspensions of *A. tumefaciens* carrying *Gp9* or *A. tumefaciens* carrying the post-transcriptional gene silencing suppressor *p19* from Cymbidium ringspot virus (Silhavy et al., 2002) were incubated overnight at 28°C with shaking at 250 rpm until an optical density at 600 nm (OD600) of 0.5–1.0 was reached. Cultures were then centrifuged at 6000×*g* for 30 min and resuspended to an OD600 of 1.0 in Gamborg's solution containing 3.2 g/l Gamborg's B5 with vitamins, 20 g/l sucrose, 10 mM MES (pH 5.6), and 200 µM acetosyringone. Cultures were then incubated at room temperature with gentle agitation for 1 h. An equal volume of *A. tumefaciens* culture containing *Gp9* was combined with *A. tumefaciens* culture carrying *p19* and Gamborg's solution to give a total *A. tumefaciens* OD600 of approximately 0.67. These suspensions were used to infiltrate 7–8 week-old *N. benthamiana* plants grown in a growth room under 16 h light/8 h dark conditions at 21–22°C with 55% humidity, and receiving roughly 100 µmol photons m<sup>−2</sup> s<sup>−1</sup> of light. A 3 ml syringe was used to infiltrate the *A. tumefaciens* suspensions through the stomata of the abaxial leaf epidermis of *N. benthamiana*. After infiltration, plants were returned to the growth chamber for up to 6 days.
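The final OD600 of about 0.67 follows from mixing equal volumes of the two cultures (each at OD600 1.0) with Gamborg's solution. The sketch below illustrates that calculation; it assumes a 1:1:1 volume ratio and that OD600 scales linearly with dilution, and the function name `mixed_od` is hypothetical.

```python
def mixed_od(components):
    """Volume-weighted OD600 of a mixture.

    components: list of (od600, volume) pairs; volumes may be in any
    consistent unit. Assumes OD600 scales linearly with dilution.
    """
    total_volume = sum(volume for _, volume in components)
    return sum(od * volume for od, volume in components) / total_volume


# Assumed 1:1:1 mix: Gp9 culture, p19 culture, and Gamborg's solution (no cells).
infiltration_mix = [(1.0, 1.0), (1.0, 1.0), (0.0, 1.0)]
final_od = mixed_od(infiltration_mix)
print(round(final_od, 2))  # 0.67
```

The same helper can be reused for other mixing ratios by changing the volumes in the list.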

#### Tissue Collection and Protein Extraction

Four biological replicates were used in all experiments and consisted of four plants sampled as follows: two leaf disks/leaf (7 mm diameter) were collected from three infiltrated leaves of each plant and pooled. Tissue was flash frozen in liquid nitrogen and stored at −80°C until use. For protein extraction, tissue was homogenized twice in 30 s pulses using a TissueLyser (Qiagen, Venlo, Netherlands) and TSP were extracted in 200 µl of plant extraction buffer (PEB) containing 1X phosphate-buffered saline (PBS), 0.1% (v/v) Tween-20, 2% (w/v) polyvinylpolypyrrolidone (PVPP), 100 mM ascorbic acid, 1 mM ethylenediaminetetraacetic acid (EDTA), 1 mM phenylmethanesulfonylfluoride (PMSF) and 1 µg/ml leupeptin. TSP concentration for each sample was determined using the Bradford assay (Bradford, 1976). For electrophoresis under non-reducing conditions, protein samples were stored in a 5% (w/v) SDS, 250 mg/ml glycerol, 0.1 mg/ml bromophenol blue, 0.16 M Tris/HCl sample buffer (Seckler et al., 1989) to better visualize protein trimerization. Samples were frozen at −80°C until use.

#### Western Blotting and Gel Staining

Pooled sample extracts and individual replicates were immunodetected against a standard curve of known amounts of purified Gp9-ELP to accurately quantify protein accumulation levels. Samples were either boiled for 10 min or not boiled and loaded onto Bio-Rad Mini-PROTEAN® TGX™ Precast 4–20% (w/v) polyacrylamide gradient gels. Separated proteins were transferred to polyvinylidene difluoride (PVDF) membranes and blocked overnight in a 5% (w/v) skim milk powder in TBS-T (Tris-buffered saline-Tween 20) blocking solution. Membranes were incubated with a 1:5000 dilution of mouse anti-c-Myc antibody (Genscript, A00864, Piscataway, NJ, USA) or a 1:8000 dilution of rabbit polyclonal anti-Gp9 antibody (Kropinski et al., 2011). Membranes were washed and incubated with a 1:5000 dilution of goat anti-mouse or anti-rabbit secondary antibody conjugated with horseradish peroxidase (HRP), and visualized with the GE Healthcare Life Sciences (Little Chalfont, UK) ECL Prime Western Blotting Detection Reagent. Recombinant protein was quantified by image densitometry using Totallab TL100 software (Non-linear Dynamics, Durham, NC, USA). For staining of separated proteins, gels were washed for 5 min in water and stained for 1 h with GelCode™ Blue stain reagent (Thermo Fisher Scientific, Waltham, MA, USA) at room temperature. Gels were then destained using three 5-min washes with water and imaged.
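Quantification against a standard curve can be sketched as follows. This is not the Totallab TL100 procedure, whose fitting model is not described here; it is a minimal illustration that assumes a linear relationship between band intensity and the amount of Gp9-ELP loaded. All numbers and names in the example are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_intensity(intensity, slope, intercept):
    """Invert the standard curve to estimate the amount of protein in a band."""
    return (intensity - intercept) / slope


# Hypothetical standards: ng of purified Gp9-ELP loaded and band intensities.
standard_ng = [10.0, 20.0, 40.0, 80.0]
standard_intensity = [1000.0, 2000.0, 4000.0, 8000.0]
slope, intercept = fit_line(standard_ng, standard_intensity)

# Hypothetical sample lane: band intensity and total soluble protein loaded.
sample_ng = amount_from_intensity(3000.0, slope, intercept)
tsp_loaded_ug = 10.0
percent_of_tsp = sample_ng / (tsp_loaded_ug * 1000.0) * 100.0
```

Estimates are only meaningful for sample bands whose intensity falls within the range covered by the standards.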

#### Protein Purification

Gp9-ELP and Gp9-HFBI proteins were purified using a c-Myc tag purification kit from MBL International Corporation (3305, Woburn, MA, USA) according to the manufacturer's instructions. Purified protein was stored at −80°C until use. The *E. coli* produced His6-Gp9 was purified as described by Waseh et al. (2010).

#### Gp9 and Gp9-ELP Adherence to *S.* Typhimurium

*Salmonella enterica* serovar Typhimurium (ATCC19585) was purchased from the American Type Culture Collection (Manassas, VA, USA) and grown under aerobic conditions at 37°C on Lysogeny Broth (LB) agar plates. The strain used in the adherence assay was transformed with the pWM1007 plasmid (Miller et al., 2000), which expresses the green fluorescent protein (GFP), and grown on LB supplemented with 25 µg/ml kanamycin.

Five hundred nanograms of the *E. coli* produced His6-Gp9, BSA, or plant-produced Gp9-ELP were spotted onto hole punch sized pieces of nitrocellulose membranes and then blocked for 1 h in 5% (w/v) skim milk in PBS with 0.05% (v/v) Tween (PBS-T). The membranes were then probed with 10<sup>8</sup> cfu/ml of GFP-expressing *S.* Typhimurium in 5% skim milk PBS-T. The disks were washed three times for 5 min in PBS-T and were then placed onto LB agar plates with 25 µg/ml kanamycin and allowed to grow at room temperature overnight, followed by growth at 37°C for 8 h. The disks were imaged with a FujiFilm (Tokyo, Japan) FLA-5000 system using the 473 nm laser at 400 V for excitation and LPB (Y510) filter for emission. Fluorescence intensity was measured using the MultiGauge version 3.0 software.

#### Animal Studies

Animal studies were carried out in accordance with the protocol approved by the Animal Care and Use Committee at the University of Alberta following the procedure described by Waseh et al. (2010). Each group contained 5–8 SPF leghorn chickens (Poultry Research Facility, University of Alberta) that were provided with feed and water *ad libitum* and were randomly tested for the presence of *Salmonella* on the day of hatching by plating cloacal swabs onto selective Oxoid Brilliance *Salmonella* agar (Oxoid, ON, Canada). In all cases no *Salmonella* colonies were observed after 24 h of incubation at 37°C. Chickens were orally gavaged with 300 µl PBS containing 10<sup>7</sup> colony-forming units (CFUs) of *S*. Typhimurium the next day and then gavaged with 35 mg lyophilized and powdered leaves resuspended in 300 µl of PBS at 1, 18, and 42 h post-infection. The chickens were culled at 47 h post-infection and the collected cecal contents were serially diluted and plated onto Oxoid Brilliance *Salmonella* agar. *Salmonella* CFU were counted after the agar plates were incubated for 24 h at 37°C.
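Converting colony counts from a serially diluted sample back to a bacterial load is a standard calculation, sketched below. The dilution factor, plated volume, sample mass, and resuspension volume are not given in the text, so every value and name in this example is hypothetical.

```python
import math


def cfu_per_gram(colonies, dilution_factor, plated_volume_ml,
                 sample_mass_g, resuspension_volume_ml):
    """Estimate CFU per gram of sample from a single countable plate.

    dilution_factor is the total fold dilution of the plated suspension,
    for example 1e4 for a 10^-4 dilution.
    """
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * resuspension_volume_ml / sample_mass_g


# Hypothetical plate: 50 colonies from 0.1 ml of a 10^-4 dilution of
# 0.1 g cecal contents resuspended in 1 ml.
load = cfu_per_gram(50, 1e4, 0.1, 0.1, 1.0)
log10_load = math.log10(load)
```

Loads are often compared between groups on a log10 scale, since colony counts span several orders of magnitude.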

#### Statistics

Minitab® 17 statistical software (Minitab Ltd., Coventry, UK) was used to perform statistical analysis on the Gp9 accumulation data. A one-way analysis of variance (ANOVA) was performed with a Tukey test on the mean Gp9 accumulation levels for each day of the time course. *P <* 0.05 was considered significant. A two-tailed Student's *t*-test was performed on the data from the chicken experiment. *P <* 0.05 was once again considered significant.

#### RESULTS

# Gp9 Transient Expression in *N. benthamiana*

The *Gp9* tailspike gene was cloned into pCaMGate expression vectors (Pereira et al., 2014) targeting the ER, the ER fused with an ELP tag (ER-ELP), the ER fused with an HFBI tag (ER-HFBI), or the chloroplasts (**Figure 1**). Constructs were then agroinfiltrated along with the gene silencing suppressor, p19 (Silhavy et al., 2002), into the leaves of *N. benthamiana* plants. Plants were monitored over the course of 6 days post-infiltration (dpi). Young, upper leaves infiltrated with the ER-targeted constructs turned a slight yellow–green color but otherwise the phenotypes remained unchanged. Leaf tissue samples were collected from 3 to 6 dpi and analyzed for Gp9 accumulation via Western blot using a c-Myc antibody specific for the C-terminal Myc peptide found in all constructs (**Figure 2A**). Gp9 targeted to the ER was faintly visible, running higher than predicted (**Table 1**), and no bands were visible for the chloroplast-targeted Gp9. Conversely, Gp9-ELP and Gp9-HFBI were readily detected, and faint bands were observed for both the ELP and HFBI fused constructs migrating above 150 kDa, which could represent potential trimers. Smaller bands ranging in size between 20 and 50 kDa were also observed in the Gp9-ELP and Gp9-HFBI samples, which most likely correspond to Gp9 degradation products since they are absent from the p19 negative control lane (**Figure 2A**). Interestingly, when immunoblots were probed with an anti-Gp9 antibody, a strong band slightly under 75 kDa was observed for ER-targeted Gp9, and a somewhat fainter band was seen in the chloroplast-targeted Gp9 (**Figure 2B**). Bands potentially representing dimers and trimers were also observed for all four proteins, as well as smaller bands that may represent Gp9 degradation products. It is interesting that the smaller bands observed in the Gp9-ELP and Gp9-HFBI are of a different size when detected with the two antibodies. 
The protein band migrating between 37.5 and 50 kDa may be a plant protein as it appears faintly in the p19 lane, while the band running between 50 and 75 kDa may represent a degradation product of Gp9.

This band is present in all samples, and it may represent the N-terminal portion of the protein detected with the Gp9-specific antibody, while the smaller band running between 20 and 37 kDa on the blot detected with the c-Myc antibody might represent the C-terminal portion of the protein. The low abundance of Gp9 on blots probed with the c-Myc antibody also implies that the c-Myc tag is cleaved off Gp9, or is inaccessible to the antibody.

# Gp9 Accumulation in *N. benthamiana*


*Values are in kiloDaltons (kDa).*

Since more protein was detected using the anti-Gp9 antibody, this antibody was used to accurately quantify Gp9 accumulation via densitometry analysis. For the purpose of protein quantitation, only bands representing the full-length Gp9 monomer were quantified. Immunoblots using individual replicates revealed that Gp9 accumulation increases from the 3rd dpi, peaks on day 4 or 5, and subsequently decreases (**Figure 3**). ER-targeted Gp9 accumulated in significantly higher amounts than the other proteins, reaching an average of 1.64 ± 0.09% of TSP on day 5. When recombinant protein accumulation is calculated in terms of micrograms of Gp9 per gram of FLW, Gp9 accumulates to an average of 235.01 ± 21.12 µg/g of FLW on day 4 (**Figure 3B**). The presence of either the ELP or HFBI tag appears to significantly decrease Gp9 accumulation on 3–5 dpi (*P <* 0.05). Gp9-ELP accumulates on average to 0.98 ± 0.05% of TSP on day 5 or 135.86 ± 25.56 µg/g of FLW on day 4, that is, roughly 0.6 times the unfused level in terms of TSP or 1.7 times less in terms of FLW. The presence of the HFBI tag caused Gp9 to accumulate in even lower amounts, on average 0.79 ± 0.09% of TSP (0.48 times the unfused level) or 116.49 ± 16.29 µg/g of FLW (two times less) on day 4. Gp9 accumulated significantly less when targeted to the chloroplasts compared to ER-targeted proteins on days 4 and 5 (*P <* 0.05). Accumulation reached an average of 0.21 ± 0.01% of TSP or 28.45 ± 1.65 µg/g of FLW on the 5th dpi.
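The relative accumulation figures quoted for the fusion constructs can be reproduced from the reported means. The sketch below uses only the mean values given in the text (standard errors are ignored), and it shows that the TSP comparison is expressed as a fraction of the unfused level whereas the FLW comparison is expressed as a fold decrease. The function and variable names are ours.

```python
# Mean accumulation values as quoted in the text (errors omitted).
tsp_percent = {"ER": 1.64, "ER-ELP": 0.98, "ER-HFBI": 0.79}
flw_ug_per_g = {"ER": 235.01, "ER-ELP": 135.86, "ER-HFBI": 116.49}

def fraction_of_reference(values, reference="ER"):
    """Each construct's level as a fraction of the unfused ER-targeted Gp9."""
    ref = values[reference]
    return {name: level / ref for name, level in values.items()}

def fold_decrease(values, reference="ER"):
    """How many times lower each construct is than the unfused ER-targeted Gp9."""
    ref = values[reference]
    return {name: ref / level for name, level in values.items()}

tsp_fraction = fraction_of_reference(tsp_percent)  # ELP about 0.60, HFBI about 0.48
flw_fold = fold_decrease(flw_ug_per_g)             # ELP about 1.7, HFBI about 2.0
```

Reporting both comparisons in the same form (for example, fold decrease throughout) would make the two metrics easier to compare directly.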

# Gp9-ELP Purification and Characterization

Gp9 polypeptides are found in monomeric, dimeric, and trimeric intermediates before forming the stable, native trimer (Benton et al., 2002). The trimeric intermediate species or protrimer consists of associated subunits which have not completely folded, forming a transient, less-stable precursor to the trimer (Goldenberg and King, 1982). We successfully purified Gp9-ELP using a c-Myc-tag purification kit (**Figure 4A**) and decided to investigate if purified Gp9-ELP is present in either of these states by avoiding complete denaturation (unheated sample) and by avoiding reducing agents such as dithiothreitol (DTT) in the sample buffer to keep the disulfide bonds oxidized (unheated, no DTT). While most of the Gp9-ELP was found in the monomeric form when it was reduced and denatured, there was very little monomer present when heat denaturation was omitted. Instead, an intense band was observed below 250 kDa, which corresponds to the expected size for the 215 kDa trimer (**Figures 4A,B**). Higher banding was observed when samples were electrophoresed under non-reducing conditions (**Figure 4B**). Gp9 contains eight cysteine residues and has disulfide bonds while existing as a protrimer, despite the fact that all are reduced during conversion to the native trimeric state (Robinson and King, 1997). Consequently, bands running above 250 kDa could represent the protrimer intermediate running slower than the trimer. Generally, an oxidizing environment is needed for trimer folding due to the presence of the disulfide bonds in the protrimer, yet increasing concentrations of DTT increase the conversion of protrimer to trimer (Robinson and King, 1997). Even larger bands were visualized and could represent higher-order multimers or aggregates (Speed et al., 1997).

FIGURE 3 | Quantification of Gp9 accumulation in *N. benthamiana* over 3, 4, 5, and 6 dpi. Quantification was performed on TSP extracts from infiltrated *N. benthamiana* tissue using a standard curve of known amounts of purified Gp9-ELP. Gp9 was targeted to the ER, the chloroplasts, or the ER fused to an ELP tag (ER-ELP) or fused to a hydrophobin tag (ER-HFBI). Immunoblots were probed with an anti-Gp9 antibody. Accumulation levels of Gp9 are shown in µg per g of FLW. Error bars represent the standard error of the mean value of four biological replicates. Treatments which do not share a letter are significantly different (*P <* 0.05) as determined by a one-way ANOVA and the Tukey test.

Our results suggest that Gp9 is present in the stable trimeric form when purified and is expected to remain functional to some degree.

#### Gp9-ELP Binding to *S.* Typhimurium

To determine if the plant-produced Gp9 can bind to *S.* Typhimurium, 500 ng of purified Gp9-ELP was spotted onto a nitrocellulose membrane, blocked and then probed for 1 h with 10<sup>8</sup> cfu/ml of GFP-expressing *S.* Typhimurium. BSA and *E. coli*-produced Gp9 were used as negative and positive controls, respectively. Gp9-ELP was able to bind *S.* Typhimurium at levels similar to the *E. coli*-produced Gp9 as measured by fluorescence (**Figure 5**), suggesting that the plant-produced protein is functional.

#### The Effect of Oral Administration of Gp9-ELP on *Salmonella* Colonization

Due to the demonstrated binding activity of Gp9-ELP, we investigated the effects of orally administering plant tissue containing Gp9-ELP on *Salmonella* colonization in chicks. The chicks were orally gavaged with 35 mg of resuspended plant tissue containing 7.7 µg Gp9-ELP at 1, 18 and 42 h after gavaging with *S*. Typhimurium. Chicks were culled 47 h after bacterial infection and the *Salmonella* CFU in the cecal contents were enumerated (**Figure 6**). Treatment with Gp9-ELP resulted in an approximately 1-log reduction in *S*. Typhimurium colonization compared to the untreated control birds (*P* = 0.058). These results are promising considering that plant leaves were fed directly to the birds without further purification, and each dose contained much less Gp9-ELP compared with the Waseh et al. (2010) study, where a dose consisted of 30 µg of purified Gp9.
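To put the dose and the log reduction in perspective, the short sketch below works through the arithmetic using only figures stated in the text (7.7 µg Gp9-ELP in 35 mg of lyophilized tissue, three doses, and the 30 µg dose used by Waseh et al.). The helper name is ours, and the conversion of a log reduction to a percentage is the standard definition rather than an analysis performed in the study.

```python
gp9_per_dose_ug = 7.7      # Gp9-ELP per dose (from the text)
tissue_per_dose_mg = 35.0  # lyophilized leaf tissue per dose (from the text)
doses = 3                  # given at 1, 18 and 42 h post-infection
waseh_dose_ug = 30.0       # purified Gp9 per dose in Waseh et al. (2010)

# Gp9-ELP content of the lyophilized tissue, in ug per g dry weight.
gp9_ug_per_g_tissue = gp9_per_dose_ug / tissue_per_dose_mg * 1000.0
# Total Gp9-ELP delivered per bird over the experiment.
total_gp9_ug = gp9_per_dose_ug * doses
# How many times larger the purified-protein dose was.
dose_ratio = waseh_dose_ug / gp9_per_dose_ug

def percent_reduction(log_reduction: float) -> float:
    """Convert a log10 reduction in CFU into a percentage reduction."""
    return (1.0 - 10.0 ** (-log_reduction)) * 100.0
```

Under these numbers each dose carried roughly a quarter of the purified-protein dose, and a 1-log reduction corresponds to about 90% fewer CFU, compared with about 99% for a 2-log reduction.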

#### DISCUSSION

This study demonstrated that the receptor binding domain of the P22 bacteriophage tailspike protein, Gp9, could be transiently expressed in *N. benthamiana* at reasonable levels and is able to bind to *S.* Typhimurium. The results presented here suggest that the correct choice of antibody for protein detection is essential to accurately quantify recombinant protein accumulation. When probing with an antibody specific to the C-terminal Myc tag, Gp9 appeared to accumulate to the highest levels when fused to an ELP or HFBI tag, supporting the hypothesis of this study that these fusion tags would increase accumulation levels as previously reported (Patel et al., 2007; Conley et al., 2009a; Joensuu et al., 2010). These results were expected as ELP and HFBI tags promote the formation and distribution of protein bodies (PBs) which are thought to protect recombinant protein from hydrolysis and protease cleavage (Conley et al., 2011; Saberianfar et al., 2015).

However, probing with a Gp9-specific antibody gave contrasting results. When the Gp9-specific antibody was used for immunodetection, unfused ER-targeted Gp9 displayed about 50% more accumulation than the ELP and HFBI fusions. This result was unexpected, and suggests that Gp9 is stable in *N. benthamiana* and the additional ELP or HFBI tags do not increase protein stability further. Indeed, ELP tags have negligible effects on already high accumulating proteins such as GFP (Conley et al., 2009a). It appears that the addition of an ELP or HFBI tag actually decreases Gp9 accumulation when targeted for retrieval to the ER of *N. benthamiana*, and this should be investigated in future studies. These results also imply that the c-Myc tag is partially cleaved or inaccessible, causing us to underestimate Gp9 accumulation levels when detecting with an anti-c-Myc tag antibody. It is possible that the ELP and HFBI tags protect the c-Myc tag from cleavage, possibly through protein body formation, and therefore the fusion constructs could be detected with the c-Myc antibody. However, when probing with the Gp9 antibody, the ER and chloroplast constructs produce a slightly higher banding pattern than expected if the c-Myc is indeed cleaved off (**Figure 2**; **Table 1**). It can be postulated then that these proteins may fold into a conformation that makes the c-Myc tag inaccessible. Regardless, as many recombinant proteins are designed to have tags for easy detection, these results are a cautionary note on the limits of short tags and suggest that both N- and C-terminal tags might be used to improve the chances of accurate quantitation, while protein-specific antibodies are most probably the best option if available. Our constructs also contain an N-terminal Xpress tag (**Figure 1**); however, commercially available antibodies have not detected any recombinant protein in crude extracts (unpublished).

Gp9 also accumulated in the chloroplasts, but to a lower level than when targeted to the secretory pathway. Generally, chloroplast targeting is a promising strategy for increasing recombinant protein accumulation (Hyunjong et al., 2006). Nevertheless, some proteins do not accumulate well when targeted to these organelles. For example, one study described lower accumulation levels of zeolin, a chimeric storage protein, when targeted to the chloroplasts compared to the ER. A visualized degradation pattern of zeolin provided evidence for protein degradation in the chloroplasts (Bellucci et al., 2007). On the other hand, Oey et al. (2009) were able to express a phage lysin protein in tobacco chloroplasts with accumulation levels reaching 70% of TSP. This protein was produced by transforming the plastid genome and the gene was codon-optimized for plastid expression. The phage lytic protein was shown to be extremely stable in the plastids, which is understandable as phages have evolved substantial resistance to bacterial proteases. By analogy, the Gp9 should be stable in the plastids, and therefore it is possible that the protein is unable to efficiently translocate into the chloroplasts. Therefore, future work could focus on transforming the plastid genome of *N. tabacum* for high level, stable expression of Gp9 without having to rely on chloroplast protein import.

Like the full-length Gp9, shortened versions lacking the N-terminal domain maintain the stability and enzymatic activity of the full-length parent proteins (Miller et al., 1998). To the best of our knowledge, this is the first study to transiently express this truncated protein in plants. Previously, truncated tailspikes have been expressed in *E. coli*, isolated and shown to reduce *Salmonella* colonization in the chicken GI tract. It is believed that the tailspikes bind to *Salmonella* and retard its motility and binding capability (Waseh et al., 2010). Consequently, plant-produced Gp9 has the potential to serve as a therapeutic to control *Salmonella.* These proteins can potentially be added to chicken or other livestock feed, reducing contamination and reducing infections in humans. While the plant-produced Gp9 did not yield statistically significant results in the chicken experiment, the results are promising. We purified Gp9-ELP using a Myc column to show that Gp9-ELP exists as a trimer and remains active by binding to *S.* Typhimurium. However, it is still possible that the ELP tag could alter Gp9 conformation *in vivo*, influencing activity, and future work could focus on unfused Gp9. *In vitro*, ELPs have been reported to reduce protein activity compared to non-fused proteins (Shamji et al., 2007; Kaldis et al., 2013). It is possible that the physiological temperature of chickens could influence the solubility of Gp9-ELP. This is because ELPs undergo a reversible phase transition from soluble to insoluble aggregates when heated past a certain transition temperature. However, this transition temperature can be manipulated by altering the number and/or the residue composition of the peptide repeats in order to adjust the solubility of Gp9-ELP, thus allowing for better therapeutic efficacy at physiological temperatures (Hassouneh et al., 2012). 
Furthermore, it is likely that the 1 log reduction in *Salmonella* colonization observed in this study compared to the 2 log reduction in the Waseh et al. (2010) study is due to the lower Gp9 dose administered to the birds (30 µg of purified His6-Gp9 versus 7.7 µg Gp9-ELP in 35 mg of lyophilized plant tissue per dose) as well as the fact that the Gp9-ELP protein is contained within plant cells and thus the chicken's gut proteases need to digest through the plant cell wall in order to release the protein. In the previous study, it was shown that if the first dose was delivered at 18 h instead of 1 h post-infection, the benefits of the treatment were reduced. In our study, the protein would be released from the plant tissue sometime after the treatment is administered, so it is possible that this delay is partially responsible for the lower reduction in infection. Future studies will compare administration of higher levels of Gp9 protein and providing the feed throughout the experiment in order to more closely resemble the natural conditions in which this protein would be used.

One of our goals in this study was to demonstrate that Gp9 could be directly administered while in minimally processed plant tissue, without having to rely on taxing purification and formulation techniques. It can be postulated that purified Gp9 may better reduce *Salmonella* colonization in chickens as previously reported (Waseh et al., 2010). While a direct comparison between purified and non-purified Gp9-ELP was not done in this study, we have shown that wild-type plant tissue alone does not significantly influence *S*. Typhimurium colonization in chicks. Therefore, while future work should mainly focus on increasing Gp9 accumulation in *N. benthamiana,* this study serves as another successful example of engineering plant factories for the production of a functional therapeutic.

#### AUTHOR CONTRIBUTIONS

SM and DS designed the research and performed the experiments; SM wrote the manuscript. RM, CS, and MD conceived the study, participated in its design and edited the manuscript.

#### ACKNOWLEDGMENTS

We would like to thank Angelo Kaldis, Hong Zhu, Bernadette Beadle, and Cory Wenzel for their technical support and Alex Molnar for help with the figures. This work was funded by Agriculture and Agri-Food Canada's A-base grant #1107, and the Natural Sciences and Engineering Research Council of Canada Strategic Grant #397260. CMS is an Alberta Innovates Technology Futures iCORE Strategic Chair in Bacterial Glycomics.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Miletic, Simpson, Szymanski, Deyholos and Menassa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Expression of the Multimeric and Highly Immunogenic *Brucella* spp. Lumazine Synthase Fused to Bovine Rotavirus VP8d as a Scaffold for Antigen Production in Tobacco Chloroplasts

*E. Federico Alfano<sup>1</sup>, Ezequiel M. Lentz<sup>1</sup>, Demian Bellido<sup>2</sup>, María J. Dus Santos<sup>2</sup>, Fernando A. Goldbaum<sup>3</sup>, Andrés Wigdorovitz<sup>2</sup> and Fernando F. Bravo-Almonacid<sup>1,4</sup>\**

*<sup>1</sup> Laboratorio de Virología y Biotecnología Vegetal, INGEBI-CONICET, Ciudad Autónoma de Buenos Aires, Argentina, <sup>2</sup> Instituto de Virología, CICV y A, INTA Castelar, Buenos Aires, Argentina, <sup>3</sup> Fundación Instituto Leloir e Instituto de Investigaciones Bioquímicas de Buenos Aires (IIBBA-CONICET), Ciudad Autónoma de Buenos Aires, Argentina, <sup>4</sup> Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal, Buenos Aires, Argentina*

#### *Edited by:*

*Edward Rybicki, University of Cape Town, South Africa*

#### *Reviewed by:*

*Kashmir Singh, Panjab University, India; Shri Ram Yadav, University of Helsinki, Finland*

*\*Correspondence: Fernando F. Bravo-Almonacid fbravo@dna.uba.ar*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 15 October 2015; Accepted: 07 December 2015; Published: 23 December 2015*

#### *Citation:*

*Alfano EF, Lentz EM, Bellido D, Dus Santos MJ, Goldbaum FA, Wigdorovitz A and Bravo-Almonacid FF (2015) Expression of the Multimeric and Highly Immunogenic Brucella spp. Lumazine Synthase Fused to Bovine Rotavirus VP8d as a Scaffold for Antigen Production in Tobacco Chloroplasts. Front. Plant Sci. 6:1170. doi: 10.3389/fpls.2015.01170*

Lumazine synthase from *Brucella* spp. (BLS) is a highly immunogenic decameric protein which can accommodate foreign polypeptides or protein domains fused to its N-termini, markedly increasing their immunogenicity. The inner core domain (VP8d) of VP8 spike protein from bovine rotavirus is responsible for viral adhesion to sialic acid residues and infection. It also displays neutralizing epitopes, making it a good candidate for vaccination. In this work, the BLS scaffold was assessed for the first time in plants for recombinant vaccine development by N-terminally fusing BLS to VP8d and expressing the resulting fusion (BLSVP8d) in tobacco chloroplasts. Transplastomic plants were obtained and characterized by Southern, northern and western blot. BLSVP8d was highly expressed, representing 40% of total soluble protein (4.85 mg/g fresh tissue). BLSVP8d remained soluble and stable during all stages of plant development and even in lyophilized leaves stored at room temperature. Soluble protein extracts from fresh and lyophilized leaves were able to induce specific neutralizing IgY antibodies in a laying hen model. This work presents BLS as an interesting platform for highly immunogenic injectable, or even oral, subunit vaccines. Lyophilization of transplastomic leaves expressing stable antigenic fusions to BLS would further reduce costs and simplify downstream processing, purification and storage, allowing for more practical vaccines.

Keywords: bovine rotavirus, transplastomic plants, tobacco, BLS, VP8, vaccine, lyophilized, IgY

#### INTRODUCTION

Rotavirus is the main cause of severe gastroenteritis in newborn mammals and causes severe economic losses across the world (Saif and Fernandez, 1996). According to a ten-year study spanning from 1994 to 2003, group A bovine rotavirus (BRV) is the main cause of neonatal diarrhea in calves in Argentina, being responsible for substantial economic losses in beef and dairy herds (Garaicoechea et al., 2006). Calves are particularly susceptible to rotavirus infection during the first weeks of life, and for that reason it is essential to have an effective system of passive immunization based on the transmission of colostrum and milk antibodies originating from active vaccination of the mothers (Fernandez et al., 1998).

Bovine rotavirus is composed of 11 segments of double-stranded RNA enclosed in an inner core, surrounded by two protein layers that constitute the viral capsid. The outer layer is in turn formed by two different proteins, VP7 and VP4 (Estes et al., 2001). The latter possesses hemagglutinating activity and participates in viral adsorption (Kalica et al., 1983; Crawford et al., 1994). VP4 is further cleaved into VP8 and VP5 by proteolytic enzymes in the gastrointestinal tract, with the non-glycosylated VP8 domain being responsible for viral adhesion to sialic acid residues on the host cells. VP8 also displays neutralizing epitopes, which makes it a good candidate for recombinant vaccines (Ruggeri and Greenberg, 1991).

Plant molecular farming offers a low-cost and easy to scale-up alternative for production of recombinant proteins. Plastid genome transformation offers interesting advantages over nuclear transformation for the development of commercial vaccines. Most importantly, the resulting transplastomic plants often present expression levels which are considerably higher than those typically obtained from nuclear transgenic plants. This is a consequence of the polyploidy of the chloroplast genome and the lack of silencing or position effects, since the integration events proceed by homologous recombination (Bock, 2007). Furthermore, since chloroplasts are maternally inherited in most crops, transplastomic plants pose a lower risk of horizontal gene transfer (Ruf et al., 2007). In particular, plastid transformation has been extensively used for the production of antigens, many of which are of interest for veterinary medicine (Ruiz et al., 2015).

Lyophilization of transplastomic leaves provides additional benefits which further reduce costs and simplify processing, purification, storage and immunization. It can further concentrate the recombinant products, which can be stable (preserving proper folding and disulfide bond formation) even after prolonged storage at room temperature, thus eliminating the need for cold chain and facilitating transportation (Su et al., 2015). Furthermore, lyophilization also removes microbial contamination from fresh leaves, making oral delivery with powdered lyophilized material in capsules safer (Kwon et al., 2013a).

The C486 BRV VP8∗ protein was previously expressed in tobacco chloroplasts, mostly as insoluble aggregates, and it has already been demonstrated that it conferred a strong immune response in female mice. Moreover, suckling mice born to immunized dams were protected against oral challenge with virulent rotavirus (Lentz et al., 2011).

The enzyme lumazine synthase from *Brucella* spp. (BLS) has been described as a novel protein carrier for antigen delivery based on its remarkable physicochemical and immunogenic properties. BLS folds as a highly stable dimer of pentamers, each subunit exposing an N-terminus which can accommodate foreign polypeptides or protein domains without affecting the folding and stability of the decamer as a whole (Zylberman et al., 2004). This characteristic was confirmed by structural analyses of the resulting chimeras. BLS overall structure and characteristics resemble those of the highly stable and immunogenic B subunit of the cholera toxin (CTB) and the *Escherichia coli* heat-labile enterotoxin subunit B (LTB), both of which have been broadly used as potent adjuvants to boost the immune response when coupled with foreign antigens. In fact, both adjuvants have already been successfully expressed as antigenic fusions in transplastomic plants (Waheed et al., 2011; Lakshmi et al., 2013). BLS also behaves as a potent immunomodulator (Velikovsky et al., 2002; Berguer et al., 2006). It is capable of inducing both humoral and cell-mediated responses (Velikovsky et al., 2003). It should be noted that BLS immunogenic properties only become evident when it is expressed as a fusion and not when it is coexpressed along with the antigen (Craig et al., 2005). Owing to its multivalence and immunogenic properties, BLS has proven to be an efficient carrier and adjuvant for both systemic and oral immunization (Rosas et al., 2006). More specifically, the N-terminal end of BLS was fused to the inner domain of the VP8 protein (VP8d) and expressed in *E. coli*. The resulting fusion (BLSVP8d) folded and assembled properly and elicited higher antibody titers than VP8d alone or a mixture of VP8d and BLS, both in female mice and in laying hen models. 
In the first case, suckling mice born to immunized dams were also protected against oral challenge with virulent rotavirus (Bellido et al., 2009), while immunoglobulin Y (IgY) antibodies against BLSVP8d produced in hens were able to fully protect mice against oral challenge with virulent BRV in a dose-dependent-manner (Bellido et al., 2012).

In this work, and for the first time, the BLS scaffold was assessed in a recombinant plant system. We expressed the BLSVP8d fusion in tobacco (*Nicotiana tabacum* L. cv. Petit Havana) chloroplasts to evaluate the possible use of this antigen for vaccine development. Its immunogenicity was assessed in a laying hen model. Our results demonstrate that the BLSVP8d fusion readily accumulated, was highly stable without significant proteolytic degradation and was expressed in soluble form at very high levels in transplastomic leaves. Moreover, the fusion protein remained stable and could be detected in its soluble form at high levels in lyophilized leaves, even after one month of storage at room temperature. Unpurified fresh and lyophilized leaf soluble protein extracts were able to induce specific neutralizing IgY antibodies in egg yolk. This work presents an interesting platform for a highly immunogenic injectable, or even oral, VP8∗ subunit vaccine. Our results also provide the basis for expression of other antigens in transplastomic plants making use of the highly immunogenic BLS antigen scaffold. Furthermore, lyophilization of transplastomic leaves expressing stable antigenic fusions to BLS would facilitate processing, purification and storage, reducing costs and allowing for more practical vaccines.

# MATERIALS AND METHODS

# Chloroplast Transformation Vector Construction

The DNA fragment coding for BLSVP8d, in which VP8d is fused to the N-termini of BLS, was obtained from pET-BLS-VP8d (Bellido et al., 2009) by PCR with *Pfu* DNA polymerase (Invitrogen, Carlsbad, CA, USA) using primers BLSVP8dNdeI (5′-CCCAT**ATG**CATGAACCAGTGCTTG-3′) and BLSVP8dXbaI (5′-CCTCTAGATCAGACAAGCGCGATGC-3′). Primers included NdeI and XbaI restriction sites to allow further cloning. The amplification product was subcloned into pZErO-2 (Invitrogen Life Technologies, Carlsbad, CA, USA) and sequenced. BLSVP8d was released by enzymatic digestion with NdeI and XbaI, gel purified and cloned into chloroplast transformation vector pBSW-utr/hEGF (Wirth et al., 2006), which was previously digested with NdeI and XbaI to excise the hEGF sequence. The resulting construction was named pBSW-utr/BLSVP8d and carries the BLSVP8d sequence under the control of the promoter and 5′-untranslated region of the tobacco *psbA* gene (5′ *psbA*) and downstream of the *aadA* sequence that confers spectinomycin resistance, under the transcriptional control of the *rrn* promoter (Prrn). Flanking regions were included to allow homologous recombination with the *N. tabacum* plastome (GenBank accession number NC_001879). The left flanking region (LFR) included the 3′ region of *rrn16*, and the right flanking region (RFR) contained the full *trnI* and the 5′ region of *trnA*, all from *N. tabacum*.
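As a quick sanity check on the primer design, the sketch below locates the NdeI (CATATG) and XbaI (TCTAGA) recognition sequences within the two primer sequences given above. The function name and the 0-based position convention are ours; both recognition sites are palindromic, so searching a single strand is sufficient.

```python
# Recognition sequences of the two enzymes used for cloning.
RECOGNITION_SITES = {"NdeI": "CATATG", "XbaI": "TCTAGA"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "BLSVP8dNdeI": "CCCATATGCATGAACCAGTGCTTG",
    "BLSVP8dXbaI": "CCTCTAGATCAGACAAGCGCGATGC",
}

def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for every site found."""
    seq = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits

site_map = {name: find_sites(seq) for name, seq in PRIMERS.items()}
```

Each primer carries exactly one of the two sites, preceded by two extra 5′ bases. The NdeI site also provides the ATG start codon shown in bold in the forward primer.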

#### Chloroplast Transformation and Molecular Characterization of Transplastomic Plants

Chloroplast transformation was carried out as previously described (Svab et al., 1990), using a PDS 1000/He biolistic device (Bio-Rad, USA). Fully expanded leaves of *in vitro* cultured *N. tabacum* cv. Petit Havana plants were bombarded with 50 mg of 0.6 μm gold particles (Bio-Rad) coated with 2 μg of plasmid DNA, using 1,100 psi rupture disks (Bio-Rad). Transformed shoots were regenerated in selective RMOP regeneration medium containing 500 mg/l spectinomycin dihydrochloride. To obtain homoplasmic plants, leaves from PCR-positive shoots were cut into pieces and taken through two additional regeneration cycles in selective medium. After rooting, plants were transferred to soil and grown under greenhouse conditions. In the greenhouse, natural light was supplemented 16 h/day by sodium lamps providing 100–300 μmol s<sup>−1</sup> m<sup>−2</sup>, and the temperature was set at 26°C during the day and 19°C at night.

## PCR Analysis to Confirm Transplastomic State

DNA obtained from leaf material of spectinomycin-resistant and wild-type (wt) plants was used as template for amplification with primers Cl Fw (5′-GTATCTGGGGAATAAGCATCGG-3′) and Cl Rev (5′-CGATGACGCCAACTACCTCTG-3′). Cl Fw hybridizes upstream of the LFR to the 16S wt gene and Cl Rev hybridizes to the *aadA* sequence. Therefore, a 1450 bp fragment is amplified only from transplastomic plants.

The PCR reaction was conducted in a total volume of 50 μl, containing 10 ng of leaf total DNA, 10X PCR buffer, 400 μM dNTP mix, 150 ng of each primer, and 1 U *Taq* polymerase (Invitrogen Corp., Carlsbad, CA, USA). The reaction conditions were as follows: initial denaturation (95°C, 5 min) was followed by 30 amplification cycles (denaturing, 95°C, 30 s; annealing, 55°C, 60 s; and extension, 72°C, 90 s) and a 10 min final extension step at 72°C.
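The cycling conditions above can be written down as a small data structure, which also makes it easy to estimate the programmed run time. This is an illustrative sketch only: the class and function names are ours, and ramping time between temperatures (which depends on the thermocycler) is ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int

# Conditions as stated in the text.
INITIAL = Step("initial denaturation", 95, 5 * 60)
CYCLE = (
    Step("denaturation", 95, 30),
    Step("annealing", 55, 60),
    Step("extension", 72, 90),
)
FINAL = Step("final extension", 72, 10 * 60)
N_CYCLES = 30

def hold_time_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times; temperature ramping is not included."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds

total_minutes = hold_time_seconds(INITIAL, CYCLE, N_CYCLES, FINAL) / 60
```

With these settings the hold times alone add up to 105 min, so the actual run will take somewhat longer once ramping is included.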

## Southern Blot

Total DNA was extracted from leaves as described by Dellaporta et al. (1984). The DNA (2.5 μg) was digested overnight with NcoI enzyme (New England Biolabs, USA), electrophoresed in 0.8% agarose gels and blotted onto Hybond-N+ Nylon membranes (Amersham Biosciences, USA). Specific DNA sequences were detected by hybridization with an α-<sup>32</sup>P-labeled *trnI/A* DNA probe. The probe was generated by random priming with a Prime-a-Gene kit (Promega, USA). Prehybridization and hybridization were carried out at 65°C in Church's hybridization solution (Church and Gilbert, 1984) for 2 and 16 h, respectively. Membranes were washed twice with gentle shaking for 30 min in 0.2X SSC, 0.1% SDS at 65°C. The blot was exposed to a storage phosphor screen, which was analyzed in a Storm 840 PhosphorImager system (Amersham).

#### Northern Blot

Total RNA was extracted from fully expanded young leaves using TRIzol Reagent (Invitrogen Corp., Carlsbad, CA, USA). An aliquot of 5 μg of formaldehyde-denatured RNA was electrophoresed in a 1.5% agarose/formaldehyde gel and blotted onto Hybond-N+ Nylon membranes (Amersham Biosciences). Specific mRNA sequences were detected by hybridization with an α-<sup>32</sup>P-labeled *bls* DNA probe generated by random priming with a Prime-a-Gene kit (Promega). The blot was prehybridized, hybridized and washed as described for the Southern blot.

#### Total Protein Analysis

Total protein from transformed and non-transformed plants was extracted by grinding 25 mg of fresh leaf material in 125 μl of Laemmli buffer. Different quantities of samples were electrophoresed in 12.5% SDS-PAGE gels for either Coomassie blue staining or western blotting. For western blot, proteins were transferred onto a nitrocellulose membrane. The membrane was probed with mouse antiserum against VP8 or BLS, followed by three washes with 50 mM Tris-HCl, pH 8, 150 mM NaCl, 0.05% Tween 20, and a second incubation step with an alkaline phosphatase-linked goat anti-mouse IgG antibody diluted 1:2000. Finally, phosphatase activity was detected by a chromogenic reaction using NBT/BCIP (nitro blue tetrazolium/5-bromo-4-chloro-3-indolyl phosphate; Sigma Chemical Co., USA) as substrates.

# BLSVP8d Solubility Analysis and Quantification

Total protein from 200 mg of fully expanded leaves from transformed and non-transformed plants was extracted in 1 ml of protein extraction buffer (50 mM Tris–HCl, pH 8, 30 mM NaCl). For lyophilized leaves, approximately 1/10 of leaf was added per ml of extraction buffer. Briefly, leaf material was ground in liquid nitrogen, mixed with extraction buffer and sonicated three times for 15 s with a microtip (Heat System Ultrasonic, Farmingdale, NY, USA) set at level 50%, and then centrifuged for 15 min at 10,000 × *g*, 4°C. The supernatant and pellet fractions contained soluble and insoluble proteins, respectively. The pellet was washed three times with 1 ml water at 0°C. Samples from each fraction (total, soluble, insoluble) were mixed with Laemmli buffer and electrophoresed in 12.5% SDS-PAGE gels for either Coomassie blue staining or western blotting. Total soluble protein (TSP) content from supernatant fractions was quantified by Bradford assay using bovine serum albumin (BSA; Sigma Aldrich) as standard. For recombinant protein level quantification, gels were scanned and protein bands were quantified using ImageJ software (NIH, http://rsbweb.nih.gov/ij). For western blot, proteins were transferred onto a nitrocellulose membrane. Membranes were treated as described above for total protein analysis.
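The densitometric quantification can be illustrated with a minimal sketch, assuming a linear relationship between band intensity and protein mass over the range of a BSA standard series run on the same gel (the Results describe densitometry against a BSA standard). The intensity values, function names, and the least-squares fit are illustrative assumptions, not the authors' actual ImageJ workflow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard masses must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mass_from_intensity(intensity, slope, intercept):
    """Invert the standard curve: band intensity -> protein mass (ug)."""
    return (intensity - intercept) / slope


def percent_of_tsp(band_ug, tsp_loaded_ug):
    """Recombinant band mass as a percentage of the TSP loaded in the lane."""
    return 100.0 * band_ug / tsp_loaded_ug


if __name__ == "__main__":
    # Hypothetical BSA standards (ug) and their integrated band intensities.
    bsa_ug = [0.5, 1.0, 2.0, 4.0]
    bsa_intensity = [60.0, 110.0, 210.0, 410.0]
    slope, intercept = fit_line(bsa_ug, bsa_intensity)
    band_ug = mass_from_intensity(260.0, slope, intercept)
    print(f"estimated band mass: {band_ug:.2f} ug")
```

Such a curve is only meaningful inside the range covered by the standards; bands outside that range would need a different loading volume.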

#### Lyophilization

Fully expanded leaves from transformed and non-transformed plants were cut into small fragments measuring roughly 1 cm<sup>2</sup>. Fragments were frozen at −80°C and then lyophilized in a Freezone Lyph-Lock 6 Freeze Dry System (Labconco) under vacuum (0.13 mbar) for at least 1 day at −40°C. Lyophilized samples were stored at −80°C or at room temperature.

#### Animal Immunization

Quantified soluble extracts from either fresh or lyophilized leaves from transformed and non-transformed plants were used for immunization of groups of light brown laying hens (*n* = 2) seronegative for BRV. Groups were inoculated with 3.83 μg of BLSVP8d, of which 2 μg corresponded to VP8d. Recombinant fresh and lyophilized extracts were normalized with the corresponding non-transformed extracts. Hens received three intramuscular doses of 0.5 ml, on days 0, 30, and 75, consisting of Freund's incomplete adjuvant:antigen in a proportion of 50:50. Laying capacity and the sites of injection were checked for side effects. Eggs were collected on day 0 and then weekly, 15 days after the third immunization, on days 87, 94, and 103 post-immunization (DPI).

Hens were obtained from the animal facility of the CICVyA, INTA. Animal immunization, management and sample collection were conducted by trained personnel under the supervision of a veterinarian, in accordance with protocols approved by INTA's ethical committee of animal welfare.

#### Antibody Measurements by ELISA

The egg yolk purification protocol was adapted from Akita and Nakai (1993). Egg yolks were diluted with five volumes of distilled water, frozen at −80°C, thawed on ice and centrifuged at 4°C for 10 min at 8,000 × *g*. The supernatant was stored at 4°C and used for IgY analysis. Titers of IgY against BLSVP8d were determined according to Bellido et al. (2012). Briefly, 1 μg of purified VP8d was directly adsorbed onto each well overnight in carbonate-bicarbonate buffer pH 9.6 and blocked with PBS-T, 5% normal horse serum for 1 h at 37°C. Blocking buffer was discarded and fourfold dilutions, starting at a 1/20 dilution, of all IgY samples in blocking buffer were incubated for 1 h at 37°C. Wells were then washed with PBS-T and incubated with horseradish peroxidase-labeled goat anti-chicken IgY (Sigma Chemical Co., USA). After thorough washing, the reaction was developed with the ABTS/H<sub>2</sub>O<sub>2</sub> system and stopped by addition of 5% SDS. Absorbance was measured at 405 nm (A405) in a Multiskan Ex, Labsystems Inc. The cut-off value of the assay was calculated as the mean specific optical density (OD) plus 3 standard deviations from IgY purified samples obtained from hens immunized with soluble wild-type plant extracts. A previously purified IgY against BLSVP8d expressed in *E. coli* was used as a positive control.

Titers were expressed as the reciprocal of the highest IgY dilution which presented an OD value above the cut-off value.
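The cut-off and endpoint titer definitions above can be sketched as follows. The OD readings are invented for illustration, the function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def assay_cutoff(control_ods, n_sd=3.0):
    """Cut-off = mean OD of control samples + n_sd standard deviations."""
    return mean(control_ods) + n_sd * stdev(control_ods)


def dilution_series(start=20, factor=4, steps=6):
    """Reciprocal dilutions, e.g. 1/20, 1/80, 1/320, ... -> [20, 80, 320, ...]."""
    return [start * factor ** i for i in range(steps)]


def endpoint_titer(ods, reciprocal_dilutions, cutoff):
    """Reciprocal of the highest dilution whose OD is above the cut-off.

    Returns None when no dilution is above the cut-off.
    """
    positive = [d for d, od in zip(reciprocal_dilutions, ods) if od > cutoff]
    return max(positive) if positive else None


if __name__ == "__main__":
    # Hypothetical A405 readings from wild-type-immunized control samples.
    cutoff = assay_cutoff([0.10, 0.12, 0.08])
    dilutions = dilution_series()
    # Hypothetical A405 readings for one IgY sample across the series.
    sample = [1.80, 1.20, 0.60, 0.25, 0.12, 0.09]
    print("cut-off:", round(cutoff, 3))
    print("titer:", endpoint_titer(sample, dilutions, cutoff))
```

Returning `None` for a sample with no positive dilution keeps a negative result distinct from a low titer.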

#### Viral Neutralization Assay

Virus-neutralizing titers were determined by a fluorescent focus reduction test as previously described (To et al., 1998). Briefly, serially diluted samples of IgY antibody fractions were mixed with an equal volume of C486 strain BRV in order to have 100 fluorescent focus-forming units (FFU)/100 μL of mixture, which was incubated for 1 h at 37°C. The mixtures were then inoculated onto MA-104 monolayers (four replicates) and further incubated for 48 h at 37°C. Plates were then fixed with 70% acetone and developed using a fluorescein isothiocyanate-labeled anti-BRV polyclonal antibody derived from the hyperimmunization of a colostrum-deprived calf. Samples that generated >80% reduction of the infection rate were considered protective. The viral neutralization titer was calculated by the Reed and Muench (1938) method considering the highest sample dilution that resulted positive.
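For readers unfamiliar with the Reed and Muench calculation, the following is a minimal sketch of its classic 50% endpoint form, assuming each replicate well is scored as protected when it shows more than 80% focus reduction. The text does not detail exactly how the authors applied the method, so the scoring scheme, the example counts, and the function name are assumptions.

```python
import math


def reed_muench_titer(reciprocal_dilutions, protected, replicates):
    """50% endpoint titer by the Reed and Muench cumulative method.

    reciprocal_dilutions: ascending list (least to most dilute).
    protected: number of protected wells at each dilution.
    replicates: wells tested per dilution.
    Returns None if the 50% point is not bracketed by the series.
    """
    n = len(reciprocal_dilutions)
    unprotected = [replicates - p for p in protected]
    # Protected wells accumulate from the most dilute sample upward,
    # unprotected wells accumulate from the least dilute sample downward.
    cum_protected = [sum(protected[i:]) for i in range(n)]
    cum_unprotected = [sum(unprotected[: i + 1]) for i in range(n)]
    percent = [100.0 * cp / (cp + cu)
               for cp, cu in zip(cum_protected, cum_unprotected)]
    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            # Proportionate distance between the two bracketing dilutions.
            distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            log_step = math.log10(reciprocal_dilutions[i + 1]
                                  / reciprocal_dilutions[i])
            return 10 ** (math.log10(reciprocal_dilutions[i])
                          + distance * log_step)
    return None


if __name__ == "__main__":
    # Hypothetical fourfold series with four replicate wells per dilution.
    titer = reed_muench_titer([4, 16, 64, 256, 1024], [4, 4, 3, 1, 0], 4)
    print(f"neutralization titer: {titer:.0f}")
```

The interpolation is done on a logarithmic scale because the dilution series is geometric.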

#### RESULTS

#### Production of Transplastomic *N. tabacum* Expressing BLSVP8d

Expression from chloroplast transformation vector pBSW-utr/BLSVP8d (**Figures 1A,B**) was first assessed in *Escherichia coli* extracts, since the prokaryotic protein synthesis machinery can recognize plastid transcriptional and translational elements. Although *E. coli* does not necessarily reflect the environment of the chloroplast stroma, it provides a fast method for testing the final genetic construct. The recombinant protein accumulated to high levels in bacteria, being detectable both by SDS-PAGE followed by Coomassie blue staining and by western blot (data not shown).

Chloroplasts from *in vitro* tobacco plants (*N. tabacum* L. cv. Petit Havana) were then transformed by particle bombardment using vector pBSW-utr/BLSVP8d. Several regenerating shoots were obtained, and three of them, representing independent plastid-transformed (transplastomic) lines, were further confirmed by PCR (**Figure 1C**). These lines were subjected to two additional rounds of regeneration on spectinomycin-supplemented media to achieve homoplasmy. Rooted plants were transferred to soil and further grown under greenhouse conditions. Surface-sterilized seeds from these plants were germinated in spectinomycin-containing media to confirm maternal inheritance of the transgenes and the absence of wild-type plastomes. Germinated plants were transferred to the greenhouse for further analysis. Phenotypic comparison showed that the transplastomic plants were indistinguishable from their non-transformed counterparts (Supplementary Figure S1).

#### Analysis of Transgene Integration and Homoplasmic State

Southern blot was performed in order to evaluate stable integration into the plastome and homoplasmy of spectinomycin-resistant lines germinated from seeds. For this purpose total leaf DNA was digested with NcoI, whose recognition sequence is absent in the integrated cassette of transplastomic plants and which cuts both upstream and downstream of the integration site in the plastome. Therefore, bands of approximately 6.4 and 8.8 kbp are expected for wild-type and transplastomic plastomes, respectively, when using a <sup>32</sup>P-labeled probe comprising a fragment from *trnI/trnA*. Southern blot analysis confirmed the integration for all the evaluated lines (**Figure 1D**).
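The logic behind the expected band pattern can be sketched as a small classifier. Because NcoI does not cut inside the cassette, the difference between the two expected bands corresponds to the size of the integrated sequence. The tolerance value and function names are hypothetical, and a single 8.8 kbp band indicates homoplasmy only within the detection limit of the blot.

```python
WT_BAND_KB = 6.4   # expected NcoI fragment from a wild-type plastome
TP_BAND_KB = 8.8   # expected NcoI fragment from a transplastomic plastome


def insert_size_kb():
    """Size of the integrated sequence implied by the two expected bands."""
    return round(TP_BAND_KB - WT_BAND_KB, 1)


def classify_lane(bands_kb, tol_kb=0.3):
    """Interpret the bands observed in one lane (sizes in kbp)."""
    has_wt = any(abs(b - WT_BAND_KB) <= tol_kb for b in bands_kb)
    has_tp = any(abs(b - TP_BAND_KB) <= tol_kb for b in bands_kb)
    if has_wt and has_tp:
        return "heteroplasmic"
    if has_tp:
        return "homoplasmic (within detection limit)"
    if has_wt:
        return "wild-type"
    return "no expected band"


if __name__ == "__main__":
    print("implied insert size:", insert_size_kb(), "kbp")
    for lane in ([6.4], [8.7], [6.5, 8.8]):
        print(lane, "->", classify_lane(lane))
```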

#### Analysis of Transcripts

Northern blot was performed to evaluate the presence of transcripts containing the BLSVP8d sequence. Total leaf RNA was probed with a sequence homologous to *bls*, and as a result three major transcripts could be detected in all transplastomic lines. The strong *psbA* promoter accounted for the smallest and most abundant monocistronic transcript, the *rrn* promoter included in the cassette controlling *aadA* and BLSVP8d expression was responsible for a bicistronic transcript, and a larger transcript arose from read-through transcription from an endogenous *rrn* promoter (**Figures 2A,B**). Expected transcripts were approximately 1.2 kb for the monocistron, 2.2 kb for the bicistron and 3.8 kb for the larger one. This interpretation is supported by comparing the electrophoretic mobility of these three transcripts with that of the 25S and 16S rRNAs (3.7 and 1.5 kb, respectively).

# BLSVP8d Expression and Stability in Fresh Leaves

Expression of BLSVP8d was assessed in total protein extracts of all transplastomic lines. An intense differential band of approximately 35 kDa, compatible with the expected molecular weight of BLSVP8d, was evident for the three independent lines and absent in the wild-type control after Coomassie blue staining of an SDS-PAGE gel. Its identity was further confirmed by western blot using an antiserum raised against a larger version of VP8d produced in *E. coli* (VP8∗) (Bellido et al., 2009). A total extract from a transplastomic plant expressing VP8∗, previously generated in our lab, was included as a positive control (Lentz et al., 2011). Remarkably, the band corresponding to BLSVP8d was even more intense than that of Rubisco large subunit (RuBisCo; **Figures 2C,D**).

Age-related leaf protein content decline can be exploited to evaluate recombinant protein stability (Birch-Machin et al., 2004; Zhou et al., 2008). In order to evaluate BLSVP8d stability, total protein extracts from leaves along a transplastomic line were examined by Coomassie blue staining (**Figures 3A,B**). BLSVP8d fusion readily accumulated in younger leaves and it remained highly stable in mature and even in senescent leaves, without showing any important signs of proteolysis. The highest expression levels were observed in old leaves. In contrast, RuBisCo large subunit and the majority of other proteins were gradually degraded due to senescence mechanisms. BLSVP8d levels were similar to, or greater than, those of RuBisCo depending on the leaf selected (**Figures 3A,B**).

#### BLSVP8d Quantification in Fresh Leaf

A mature leaf was chosen, total protein content was determined by Bradford assay and BLSVP8d was quantified by densitometry against a BSA standard by SDS-PAGE followed by Coomassie blue staining. BLSVP8d expression accounted for at least 40% of TSP or 4.85 mg/g fresh tissue, which represents roughly 10 times the previously observed levels for VP8∗ (**Figure 3C**).
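The two reported figures can be related by simple arithmetic, sketched below. The implied TSP content and the leaf mass per dose are derived values, not numbers reported by the authors; the second calculation also assumes complete recovery of BLSVP8d in the soluble extract, and the function names are hypothetical.

```python
def implied_tsp_mg_per_g(target_mg_per_g, fraction_of_tsp):
    """TSP content (mg/g fresh tissue) implied by yield and TSP fraction."""
    return target_mg_per_g / fraction_of_tsp


def fresh_leaf_mg_for_dose(dose_ug, target_mg_per_g):
    """Fresh leaf mass (mg) containing one dose.

    mg of protein per g of leaf equals ug of protein per mg of leaf,
    so dividing the dose in ug by the yield gives mg of leaf.
    """
    return dose_ug / target_mg_per_g


if __name__ == "__main__":
    tsp = implied_tsp_mg_per_g(4.85, 0.40)
    leaf = fresh_leaf_mg_for_dose(3.83, 4.85)
    print(f"implied TSP: {tsp:.2f} mg/g fresh tissue")
    print(f"leaf per 3.83 ug dose: {leaf:.2f} mg fresh tissue")
```

Because the 40% figure is reported as a lower bound, the implied TSP content of about 12 mg/g is best read as an upper estimate for this leaf.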

#### Stability and Solubility in Fresh and Lyophilized Leaf Material

In a preliminary analysis, both VP8∗ and BLSVP8d proved to be insoluble when expressed from their respective chloroplast transformation vectors in *E. coli*. Since the majority of VP8∗ also accumulated as inclusion bodies in the chloroplast stroma and several sonication pulses were necessary in order to have it extracted in a saline buffer compatible with immunization studies (Lentz et al., 2011), BLSVP8d solubility in fresh leaves was assayed. Therefore, leaf extracts from wild-type, BLSVP8d and VP8∗ plants containing total (T), soluble (S), and insoluble (I) protein were compared by SDS-PAGE followed by Coomassie blue staining. As expected, VP8∗ was mostly insoluble. Surprisingly, BLSVP8d mainly remained soluble in fresh leaf chloroplasts (**Figure 4A**). Identity of recombinant protein bands was confirmed by western blot (not shown).

Extracts from lyophilized transplastomic and wild-type leaves were analyzed in parallel by SDS-PAGE followed by Coomassie blue staining. The BLSVP8d fusion remained stable and soluble even after lyophilization (**Figure 4B**). Compared with the respective total extracts from fresh leaves, extracts from lyophilized leaves showed more degradation of endogenous proteins, which meant that intact BLSVP8d became slightly more concentrated after lyophilization. As a result, soluble BLSVP8d extracts from lyophilized leaves were also included in immunization assays.

The soluble and stable nature of BLSVP8d remained unaltered even after storage for one month at room temperature (**Figure 4C**).

#### Immune Response to Transplastomic BLSVP8d

Immunoglobulin Y antibodies specific to BLSVP8d could be detected by ELISA after immunization of hens with either fresh or lyophilized transplastomic soluble protein extracts. No specific antibodies could be detected after immunization with the respective wild-type extracts. Overall, the response was slightly faster for fresh extracts. Antibody titers reached 12800 and ranged between 3200 and 12800 for fresh and lyophilized extracts, respectively (**Figure 5A**).

A virus neutralization assay was performed to assess the ability of these specific IgY antibodies to prevent bovine rotavirus infection *in vitro*. The anti-BLSVP8d IgY antibodies were able to neutralize rotavirus infection, in accordance with the results previously obtained by ELISA. Titers were similar for IgY antibodies elicited after immunization with both fresh and lyophilized transplastomic soluble protein extracts, and once again the response was slightly faster in the case of fresh leaf extracts. In addition, no neutralizing activity could be detected for any of the IgY fractions obtained after immunization with the respective wild-type fresh or lyophilized soluble protein extracts (**Figure 5B**).

#### DISCUSSION

Plastid genome transformation constitutes an advantageous system for recombinant protein expression. Chloroplast molecular farming stands out for many reasons. Of particular interest are the reduced manufacturing costs, the ease of scaling up, the high expression levels that are typically obtained (which simplify immunization protocols and reduce their cost), the absence of human pathogens or toxins, the capacity for post-translational modification, disulfide bond formation, and facilitated biological containment, since chloroplasts are not inherited through pollen. In recent years, several studies have reported lyophilization of transplastomic leaves expressing vaccine antigens or biopharmaceuticals, further reducing costs and facilitating storage, processing and purification, which makes immunization protocols simpler and more cost-effective (Rosales-Mendoza et al., 2009; Soria-Guerra et al., 2009; Manganelli et al., 2012; Cardona et al., 2013; Kwon et al., 2013a,b).

In this work, we report for the first time the production of transplastomic plants expressing a recombinant fusion of BLS, a highly immunogenic decameric protein from *Brucella* spp, to an antigen. We have recently succeeded in producing transplastomic tobacco which expressed BRV C486 VP8∗ protein, an interesting target for vaccine development since it is involved in rotavirus infectivity and neutralization. Based on our previous work, BLS was coupled to VP8d, a shorter fragment of VP8∗, and transplastomic BLSVP8d plants were obtained. Since properly folded BLSVP8d had already been produced in bacteria and its potent immunomodulatory properties had already been demonstrated in female mice and laying hen models, it was a good candidate to further analyze its immunogenic properties in a recombinant plant system (Bellido et al., 2009, 2012).

Our results show that BLSVP8d was highly expressed in soluble form in plants, accounting for 40% of TSP or 4.85 mg per gram of fresh leaf tissue, which is approximately one order of magnitude above the level obtained for VP8∗ using the same transformation vector. In the same manner as VP8∗, BLSVP8d readily accumulated in young leaves and remained stable in mature and senescent leaves of transplastomic plants. Stable BLSVP8d became more evident with leaf age as the content of RuBisCo and other proteins declined, meaning that older leaves had greater BLSVP8d expression than that reported for an intermediate mature leaf. A similar pattern of protein degradation was also observed for lyophilized leaves, which could also contribute to additional concentration of intact BLSVP8d. Our group has previously shown the capability of transplastomic tobacco to produce large quantities of immunogenic proteins without any observable phenotypic effects. Such is the case of the VP1 peptide from the foot and mouth disease virus (FMDV) fused to the β-glucuronidase enzyme, in which expression levels (51% of TSP) were clearly higher than those of RuBisCo large subunit, the most abundant leaf protein in a wild-type plant (Lentz et al., 2009). Transplastomic BLSVP8d levels were also similar to, or greater than, those of RuBisCo depending on the age of the analyzed leaf. The drastic increase in the expression levels of BLSVP8d compared to VP8∗ could not be solely attributed to the fusion to BLS, since VP8d is almost one third smaller, having truncated amino and carboxy termini. Nevertheless, C-terminal fusion to BLS did not appear to be destabilizing at all in the chloroplast stroma and, although proper folding was not established in plants, BLS fusion protein decamers could account for reduced proteolytic degradation and protection. Despite the high expression levels, wild-type and transplastomic BLSVP8d plants were phenotypically indistinguishable from each other.

lyophilized transplastomic soluble protein extracts. (A) Detection of specific IgY antibodies in egg yolk obtained from hens immunized with total soluble protein extracts. Titers are indicated both for fresh (dark bars) and lyophilized (light bars) samples. Animals received three doses containing 3.83 μg of BLSVP8d (2 μg corresponded to VP8d). As a control, groups of hens were also immunized with total soluble protein extracts obtained from fresh and lyophilized wild-type leaves. Purified IgY response was evaluated by ELISA. Titers were expressed as the reciprocal of the highest IgY dilution which gave an OD value above the mean values of the respective control samples plus 3 standard deviations. (B) Viral neutralization assay to assess the prevention of bovine rotavirus infection *in vitro* conferred by specific IgY antibodies. Titers were expressed by the Reed and Muench (1938) method considering the highest dilution that resulted in an 80% reduction of fluorescent foci. Titers are indicated both for fresh (dark bars) and lyophilized (light bars) samples. An IgY fraction obtained by immunization of a hen with BLSVP8d previously produced in *E. coli* was included as control (white bar).

Since plastids are evolutionarily derived from bacteria, and both share compatible transcriptional and translational machinery, expression from the transformation vector was first assessed in *E. coli*. Nevertheless, accumulation and stability could not be accurately predicted in this prokaryotic model. BLSVP8d was highly expressed in bacteria but formed inclusion bodies. VP8∗ was also mainly expressed as inclusion bodies in both bacteria and chloroplasts. Interestingly, despite its higher expression levels, we demonstrated that transplastomic BLSVP8d always remained soluble, even in senescent or lyophilized leaves. Furthermore, lyophilized material stored for up to one month at room temperature did not show any sign of degradation or altered solubility of BLSVP8d. The observed difference in solubility between VP8∗ and BLSVP8d could not be entirely associated with the addition of BLS, since VP8∗ and VP8d differ in length. However, the multimeric nature of BLS was by no means detrimental to the solubility of the fusion as a whole.

Our work also demonstrated that hen immunization with unpurified transplastomic BLSVP8d soluble extracts from both fresh and lyophilized leaves induced specific neutralizing IgY antibodies in egg yolk, as measured by ELISA and *in vitro* neutralization assays. Immunization results could not be directly compared to those obtained for VP8∗ due to the difference in length, but all our previous results with BLSVP8d expressed in bacteria suggest that potentially enhanced immunogenicity should also be the case in plants. Ongoing research aims to ascertain this and to assess stability, solubility and proper folding of BLSVP8d multimers from lyophilized extracts after longer storage at room temperature. Moreover, dietary uptake of powdered egg yolk containing specific IgY antibodies could be exploited in the future as an approach to confer passive protection against rotavirus (Vega et al., 2012).

#### CONCLUSION

The BLS scaffold was assessed for the first time in plants. BLSVP8d was highly expressed in tobacco chloroplasts, remaining soluble and stable during all stages of plant development, even in senescent or lyophilized leaves. Furthermore, fresh and lyophilized leaf unpurified soluble extracts were able to induce specific neutralizing IgY antibodies in a laying hen model. This work presents BLS as an interesting platform for a plant-based highly immunogenic injectable, or even oral, VP8∗ subunit vaccine.

Our findings provide the basis for the expression of other subsets of antigens fused to the potent immunomodulator BLS. They also suggest that lyophilization of transplastomic leaves expressing antigenic fusions, without the need for additional lyoprotective components, can further reduce costs and simplify downstream processing, purification and storage, allowing for more rational injectable or oral vaccines.

# AUTHOR CONTRIBUTIONS

EFA was responsible for the acquisition, analysis, and interpretation of the data, the drafting of the manuscript, the publishing approval and accountable for all aspects of the work. EML was responsible for design and critical revision of the work, the publishing approval, and accountable for all aspects of the work. DB was responsible for the acquisition, analysis, and interpretation of the data, the publishing approval and accountable for all aspects of the work. MJDS was responsible for design and critical revision of the work, the publishing approval and accountable for all aspects of the work. FAG was responsible for design and critical revision of the work, the publishing approval and accountable for all aspects of the work. AW was responsible for conception and drafting of the work, interpretation of the data, the publishing approval and accountable for all aspects of the work. FBA was responsible for conception and drafting of the work, interpretation of the data, the publishing approval and accountable for all aspects of the work.

#### ACKNOWLEDGMENTS

This work has been supported by grant PICT 2011-1761 from Agencia Nacional de Promoción Científica y Tecnológica. EFA is a fellow of Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET, Argentina). DB is a fellow of Vetanco S.A. FBA and FAG are research scientists of CONICET. MJDS and AW are research scientists of CONICET and INTA. The authors would like to thank María Eugenia Segretin for critical revision of the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01170

#### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Alfano, Lentz, Bellido, Dus Santos, Goldbaum, Wigdorovitz and Bravo-Almonacid. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Application of a Scalable Plant Transient Gene Expression Platform for Malaria Vaccine Development

Holger Spiegel<sup>1†</sup>, Alexander Boes<sup>1</sup>\*<sup>†</sup>, Nadja Voepel<sup>1</sup>, Veronique Beiss<sup>1</sup>, Gueven Edgue<sup>1</sup>, Thomas Rademacher<sup>1</sup>, Markus Sack<sup>2</sup>, Stefan Schillberg<sup>1</sup>, Andreas Reimann<sup>1</sup> and Rainer Fischer<sup>1,2</sup>

<sup>1</sup> Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany, <sup>2</sup> Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany

#### *Edited by:*

Edward Rybicki, University of Cape Town, South Africa

#### *Reviewed by:*

George Peter Lomonossoff, John Innes Centre, UK; Udo Conrad, Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK), Germany; John D. Hamill, Deakin University, Australia

*\*Correspondence:*

Alexander Boes, alexander.boes@ime.fraunhofer.de

† Shared first author.

*Specialty section:* This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

*Received:* 02 October 2015 *Accepted:* 07 December 2015 *Published:* 23 December 2015

#### *Citation:*

Spiegel H, Boes A, Voepel N, Beiss V, Edgue G, Rademacher T, Sack M, Schillberg S, Reimann A and Fischer R (2015) Application of a Scalable Plant Transient Gene Expression Platform for Malaria Vaccine Development. Front. Plant Sci. 6:1169. doi: 10.3389/fpls.2015.01169

Despite decades of intensive research efforts there is currently no vaccine that provides sustained sterile immunity against malaria. In this context, a large number of targets from the different stages of the *Plasmodium falciparum* life cycle have been evaluated as vaccine candidates. None of these candidates has fulfilled expectations, and as long as we lack a single target that induces strain-transcending protective immune responses, combining key antigens from different life cycle stages seems to be the most promising route toward the development of efficacious malaria vaccines. After the identification of potential targets using approaches such as omics-based technology and reverse immunology, the rapid expression, purification, and characterization of these proteins, as well as the generation and analysis of fusion constructs combining different promising antigens or antigen domains before committing to expensive and time-consuming clinical development, represents one of the bottlenecks in the vaccine development pipeline. The production of recombinant proteins by transient gene expression in plants is a robust and versatile alternative to cell-based microbial and eukaryotic production platforms. The transfection of plant tissues and/or whole plants using *Agrobacterium tumefaciens* offers a low technical entry barrier, low costs, and a high degree of flexibility embedded within a rapid and scalable workflow. Recombinant proteins can easily be targeted to different subcellular compartments according to their physicochemical requirements, including post-translational modifications, to ensure optimal yields of high quality product, and to support simple and economical downstream processing.
Here, we demonstrate the use of a plant transient expression platform based on transfection with *A. tumefaciens* as an essential component of a malaria vaccine development workflow involving screens for expression, solubility, and stability using fluorescent fusion proteins. Our results have been implemented for the evidence-based iterative design and expression of vaccine candidates combining suitable *P. falciparum* antigen domains. The antigens were also produced, purified, and characterized in further studies by taking advantage of the scalability of this platform.

Keywords: expression screening, heat stability, multi domain-fusion antigens, *Nicotiana benthamiana* plants, *Plasmodium falciparum*, red fluorescent protein

#### Spiegel et al. Transient Plant Expression-Based Vaccine Development

#### INTRODUCTION

Even though the first malaria vaccine, GSK's *Plasmodium falciparum* circumsporozoite protein (CSP)-based Mosquirix (RTS,S; Wilby et al., 2012), is expected to enter the market within the next 6–12 months, there is still an urgent demand for a malaria vaccine that reliably delivers long-lasting protection against infection, clinical manifestation and transmission of the disease. RTS,S (Chattopadhyay et al., 2003; Richards et al., 2013; Penny et al., 2015; Reddy et al., 2015; RTS,S Clinical Trials Partnership, 2015) presents pre-erythrocytic stage epitopes and targets the parasite in an early phase of its life cycle within the human host. However, Mosquirix does not induce complete protection, and efficacy decreases rapidly over the first 24 months after immunization (RTS,S Clinical Trials Partnership et al., 2012; Moorthy et al., 2013). Various strategies must be considered to develop a better malaria vaccine, including the optimization of immune responses and immune memory against well-known and well-characterized targets like PfCSP, by testing different presentation strategies (e.g., viral vectored vaccines, virus-like particles), formulations (virosomes), delivery, and immunization schemes, as well as the identification of alternative and/or additional antigens to target the blood and sexual stages of the parasite life cycle to overcome allelic diversity and also to prevent immune evasion.

Significant effort has been invested into the identification of a large number of potential malaria vaccine candidate antigens, domains, and epitopes, and several have been taken through preclinical and even clinical development, but none of the candidates tested thus far provides long-lasting, strain-transcending sterile immunity. The available data indicate that the development of better malaria vaccines faces two major challenges, i.e., the induction of a strong and long-lasting (durable) response, and the identification and combination of appropriate epitopes as targets to induce a neutralizing response.

Naturally acquired immunity (NAI) against malaria (Doolan et al., 2009) is observed in individuals from holoendemic areas and requires several years and many (50–70) infections to develop. Its most important function seems to be the control of blood-stage parasitemia by IgG directed against merozoite surface proteins. NAI-mediated protection from episodes of clinical malaria rapidly ceases when exposure to repeated infections is reduced. This highlights the importance of both the induction of strong and durable IgG responses and the identification of optimal target antigens.

The different stages of the parasite life cycle are characterized by sets of specific surface proteins (Druilhe et al., 1984), many of them involved in important functions such as host cell attachment and host cell invasion (Malpede and Tolia, 2014). Another feature of P. falciparum that complicates the development of sustainable protective immune responses is its ability to switch between different pathways of erythrocyte invasion, involving different sets of proteins from two invasion ligand families, i.e., erythrocyte-binding antigens (EBA) and reticulocyte-binding antigens (Rh) (Persson et al., 2008).

Many epidemiological studies involving donors from malaria endemic areas associate protective immunity with antibody responses against various, predominantly merozoite antigens (Osier et al., 2008) including Pf MSP1, PfAMA1, Pf MSP4, Pf MSP8, Pf EBA175, and Pf RIPR. Many of these proteins elicit in vitro parasite growth-inhibitory antibody responses after animal immunization experiments (Sim et al., 2001; Stowers et al., 2002; Moss et al., 2012), but despite substantial efforts (Richards et al., 2013; Penny et al., 2015; Reddy et al., 2015; RTS,S Clinical Trials Partnership, 2015) the roles of these proteins in protective immunity remain poorly understood, making it difficult to select rational targets for the design of an ultimate malaria vaccine candidate.

We therefore believe that one promising direction in malaria vaccine development is the generation and testing of single or multi-stage-specific, multi-domain constructs that combine promising antigens or antigen domains in the context of one or several fusion proteins, which should induce more diverse and ideally strain-transcendent immune responses against all three stages of the P. falciparum life cycle, and thus achieve a greater degree of protection than existing vaccine candidates. The successful design of such completely artificial fusion proteins that combine isolated protein domains will inevitably result in a completely new context for these elements and may affect both the expression level and stability. This is an unpredictable challenge that must be addressed empirically through iterative design cycles featuring cloning and expression experiments.

In addition to the classic molecular farming approaches (Fischer et al., 1999; Daniell et al., 2001; Fischer and Schillberg, 2006) using stable transgenic plants, different transient plant expression systems (Kapila et al., 1997; Gleba et al., 2005, 2007; Starkevic et al., 2015) are gaining more and more attention as platforms for the manufacture of pharmaceutical proteins such as antibodies (Hull et al., 2005; Huang et al., 2010) and enzymes for emergency use (Rosenberg et al., 2015) and as conventional therapeutics (Aviezer et al., 2009). These transient expression systems are robust and scalable, and have been used for the time- and cost-efficient generation and testing of expression constructs for many antibodies and vaccine antigens targeting different pathogens, including Human immunodeficiency virus (Rosenberg et al., 2013), Ebola virus (Huang et al., 2010), Influenza virus (D'Aoust et al., 2010; Landry et al., 2010), Dengue virus (Kim et al., 2015), Yersinia pestis (Santi et al., 2006), and P. falciparum (Davoodi-Semiromi et al., 2010; Feller et al., 2012; Jones et al., 2013; Boes et al., 2014, 2015; Voepel et al., 2014; Beiss et al., 2015).

Here, we demonstrate the application of the classical Nicotiana benthamiana/A. tumefaciens transient expression system to accelerate the development of malaria vaccine candidates, using as case studies a pre-erythrocytic stage multi-domain candidate, a dual-stage multi-domain candidate, and a multi-stage multi-domain candidate.

#### MATERIALS AND METHODS

#### Ethics Statement

The animal experiments were approved by the Landesamt für Natur, Umwelt und Verbraucherschutz Nordrhein-Westfalen (LANUV), reference number 8.87.−51.05.30.10.077. All animals received humane care according to the requirements of the German Tierschutzgesetz, §8 Abs. 1 and the Guide for the Care and Use of Laboratory Animals published by the National Institutes of Health.

#### Bacteria, Plant Characterization, and Parasites

A. tumefaciens strain GV3101::pMP90RK [Gm<sup>R</sup>, Km<sup>R</sup>, Rif<sup>R</sup>] (Koncz and Schell, 1986) and N. benthamiana plants were used for transient expression by agroinfiltration. According to Bally et al. (2015), N. benthamiana isolates vary in the RNA-dependent RNA polymerase 1 gene (Rdr1), and a 72 bp insertion leading to a truncated version of the enzyme has been reported. To verify the genotype at the Rdr1 locus, we used PCR to amplify the respective region from genomic DNA prepared from N. benthamiana (accession number AY574374) and Nicotiana tabacum L. Petit Havana cultivar SR1 (accession number AJ011576, control). Primers (Rdr1 forward primer: 5′-GTTAACGTATCCAATCGGGTTCTGCG-3′; Rdr1 reverse primer: 5′-CTGATTTGCCGAAAATCACCCATCC-3′) were designed to cover the region potentially containing the 72 bp insertion (found in the N. benthamiana isolates LAB, 16C, and SA; Bally et al., 2015) and to be compatible with both N. benthamiana and N. tabacum, which does not carry the Rdr1 72 bp insertion. Fragment size analysis (Presentation 1 in Supplementary Material) revealed the insertion genotype for the N. benthamiana used in our study. P. falciparum strain NF54 (MRA-1000, MR4, Manassas, USA) was used for the parasite assays.
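The fragment size logic behind this genotyping PCR can be sketched in silico. The following is a minimal illustration, not the procedure used in the study: the two primer sequences are taken from the text, whereas the templates are synthetic placeholders (hypothetical flanks and a hypothetical 100 bp core) and do not represent the real Rdr1 sequences or amplicon sizes.

```python
# Primer sequences as given in the text.
FORWARD = "GTTAACGTATCCAATCGGGTTCTGCG"
REVERSE = "CTGATTTGCCGAAAATCACCCATCC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str = FORWARD, reverse: str = REVERSE):
    """Length of the product amplified from the top strand, or None if a primer site is missing."""
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)  # reverse primer as it appears on the top strand
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Synthetic templates (assumptions for illustration only).
flank = "ACGT" * 10
core = "A" * 100
insertion = "C" * 72  # stands in for the 72 bp Rdr1 insertion
without_insertion = flank + FORWARD + core + reverse_complement(REVERSE) + flank
with_insertion = (flank + FORWARD + core[:50] + insertion + core[50:]
                  + reverse_complement(REVERSE) + flank)

size_shift = amplicon_length(with_insertion) - amplicon_length(without_insertion)
```

On a gel, the two genotypes would therefore be distinguished by a product that differs in length by the size of the insertion (`size_shift`), independent of the absolute amplicon size.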

#### Construction of the Plant Expression Vector

Synthetic genes or PCR products encoding selected antigens and antigen domains were inserted into pTRAkc-ERH at the NcoI/NotI sites. Subsequent stacking of additional domains was achieved by inserting EagI/NotI fragments following plasmid linearization with NotI. Red fluorescent protein (RFP) fusion genes were inserted into a NotI-linearized plasmid carrying the RFP cDNA using the same approach (**Figure 1**). Restriction enzymes were used according to the manufacturers' instructions. The sequences of the P. falciparum antigens are summarized in **Table 1**.
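The reason an EagI/NotI fragment can be stacked into a NotI-linearized plasmid is that both enzymes leave the same 5′-GGCC overhang. The sketch below illustrates this with the standard recognition sequences (NotI, GC^GGCCGC; EagI, C^GGCCG). The sequence context of the actual constructs is not given here, so the base that follows the EagI site in the insert (`next_base`) is a hypothetical parameter.

```python
NOTI = "GCGGCCGC"  # NotI cuts GC^GGCCGC
EAGI = "CGGCCG"    # EagI cuts C^GGCCG
NOTI_CUT = 2       # top-strand cut position within the NotI site
EAGI_CUT = 1       # top-strand cut position within the EagI site


def overhang(site: str, cut: int) -> str:
    """5' overhang left by a symmetric staggered cut in a palindromic site."""
    return site[cut:len(site) - cut]


def junction(left_site: str, left_cut: int, right_site: str, right_cut: int) -> str:
    """Top-strand sequence formed when the left half of one cut site is ligated to the right half of another."""
    return left_site[:left_cut] + right_site[right_cut:]


def noti_regenerated(junction_seq: str, next_base: str = "") -> bool:
    """True if the ligated junction (plus the following base) contains a NotI site."""
    return NOTI in (junction_seq + next_base)


# Upstream junction: NotI end of the vector joined to the EagI end of the insert.
upstream = junction(NOTI, NOTI_CUT, EAGI, EAGI_CUT)
# Downstream junction: NotI end of the insert joined to the NotI end of the vector.
downstream = junction(NOTI, NOTI_CUT, NOTI, NOTI_CUT)
```

Under these assumptions the downstream junction restores a NotI site, whereas the upstream junction does so only if the insert happens to carry a C directly after its EagI site. A single NotI site then remains available for the next round of domain stacking.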

#### Transient Expression in *N. benthamiana*

The expression vectors were introduced separately into A. tumefaciens and were used for the infiltration of N. benthamiana plants as previously described by Feller et al. (2012).

#### Extraction of Total Soluble Protein from Leaves and Heat Precipitation

For initial screening, two leaf discs (diameter 1 cm) were punched from infiltrated leaves, weighed, and extracted in 3 mL phosphate buffered saline (PBS) per gram of leaf material using an electropistil. Insoluble material was removed by centrifugation (16,000 g for 5 min) and the supernatants were used directly for SDS-PAGE, western blot analysis, and the quantification of RFP fluorescence. An aliquot of the supernatant was heated (70 °C for 5 min) and analyzed in the same manner. For the purification of E5, total soluble protein was extracted from 230 g of vacuum-infiltrated leaves in a Waring blender using 3 mL of extraction buffer (PBS containing 500 mM NaCl, pH 7.4) per gram of leaf material. The pH of the crude extract was adjusted to pH 8.0 using NaOH and the extract was heated to 65 °C. Insoluble material was removed by a combination of centrifugation (14,000 g for 15 min) and filtration.

… (Bevis and Glick, 2002); GOI, gene of interest. The restriction sites used to insert the GOI into the plant expression vector are indicated; His6 tag: sequence encoding the six-histidine affinity purification tag; ER-retention signal: sequence encoding the SEKDEL ER-retention signal.

#### Protein Purification

The E5 recombinant protein was purified from heat-treated extracts first by immobilized metal affinity chromatography (IMAC) using chelating Sepharose (GE Healthcare) charged with nickel (NiSO<sub>4</sub>, 200 mM). Unbound proteins were washed away with PBS and bound proteins were eluted in a two-step gradient using PBS containing 10 and 250 mM imidazole, respectively. IMAC elution fractions containing the target protein E5 were desalted against 20 mM Tris (pH 7.5) and E5 was further purified by ion exchange chromatography (IEX) using a prepacked HiTrap Q FF column. The E5 protein was eluted using a three-step gradient (20 mM Tris pH 7.5 with 240, 500, and 1000 mM NaCl, respectively). The buffer was exchanged against PBS overnight at 4 °C using a Spectra/Por membrane (MWCO 6000–8000 g/mol; SpectrumLabs). The samples were then concentrated using Vivaspin centrifugal concentrators (Sartorius-Stedim Biotech GmbH) and passed through a 0.22-µm sterile filter. The protein concentration was determined by UV spectroscopy at 280 nm and the recombinant protein was stored at 4 °C.

TABLE 1 | Overview of *P. falciparum* antigens and antigen domains.

For each antigen, the table provides the name, a short abbreviation, the PlasmoDB identifier, the amino acid sequence, the calculated single domain size, the calculated size of the corresponding RFP-fusion protein as well as the categorized fluorescence observed in intact leaves (+ + +, high fluorescence in leaves; ++, moderate fluorescence in leaves; +, low fluorescence in leaves) and categorized concentration (based on RFP fluorescence) in crude extracts (H, accumulation > 150 µg/g FLW; M, accumulation between 25 and 150 µg/g FLW; and L, accumulation < 25 µg/g FLW).
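The accumulation categories given in the legend of Table 1 (H, M, L) amount to a simple threshold rule, which can be written as a short helper. This is only an illustrative sketch: the function name is hypothetical, and assigning the boundary values of exactly 25 and 150 µg/g FLW to category M is an assumption, because the legend does not state how they are treated.

```python
def accumulation_category(ug_per_g_flw: float) -> str:
    """Map an accumulation level (µg RFP-fusion per g fresh leaf weight) to H, M, or L."""
    if ug_per_g_flw < 0:
        raise ValueError("accumulation cannot be negative")
    if ug_per_g_flw > 150:
        return "H"  # high: more than 150 µg/g FLW
    if ug_per_g_flw >= 25:
        return "M"  # moderate: 25-150 µg/g FLW (boundaries assigned here by assumption)
    return "L"      # low: less than 25 µg/g FLW


# Hypothetical example values, not measurements from the study.
example = {level: accumulation_category(level) for level in (10.0, 80.0, 200.0)}
```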

#### SDS-PAGE and Western Blot Analysis

SDS-PAGE and western blot analysis were carried out as previously described by Boes et al. (2011). The western blots were probed with rabbit anti-His<sup>6</sup> antibody (Jackson ImmunoResearch), and alkaline phosphatase (AP)-conjugated goat anti-rabbit serum (Jackson ImmunoResearch) was used for detection. The monoclonal antibody (mAb) 5.2 (MRA-94, MR4, Manassas, USA), which recognizes the first epidermal growth factor (EGF)-like domain of PfMsp1-19, was used for dot-blot analysis and was detected using AP-conjugated goat anti-mouse serum (Jackson ImmunoResearch).

#### Quantification of RFP

The RFP concentration in tobacco supernatants was measured as previously described by Buyel and Fischer (2012) with minor modifications. The standard curve was generated with purified RFP ranging from 0 to 100 µg/mL and the concentration was determined by linear regression analysis.
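The standard-curve calculation can be sketched as follows. This is a minimal illustration under stated assumptions rather than the published protocol: the fluorescence readings are invented, the function names are hypothetical, and the conversion to µg per gram of fresh leaf weight (FLW) simply multiplies by the 3 mL of buffer used per gram of leaf material, ignoring the liquid contributed by the leaf tissue itself.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to obtain the RFP concentration in µg/mL."""
    return (signal - intercept) / slope


def ug_per_g_flw(conc_ug_per_ml, buffer_ml_per_g=3.0):
    """Convert an extract concentration to µg per g FLW (simplified: buffer volume only)."""
    return conc_ug_per_ml * buffer_ml_per_g


# Hypothetical standards covering 0-100 µg/mL with made-up fluorescence readings.
standards_ug_per_ml = [0.0, 25.0, 50.0, 75.0, 100.0]
fluorescence = [100.0, 1100.0, 2100.0, 3100.0, 4100.0]

slope, intercept = fit_line(standards_ug_per_ml, fluorescence)
sample_conc = concentration_from_signal(2100.0, slope, intercept)
sample_accumulation = ug_per_g_flw(sample_conc)
```

Samples whose signal falls outside the range spanned by the standards would normally be diluted and measured again rather than extrapolated.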

#### Binding of Soluble and Insoluble RFP-Fusion Proteins to Ni-NTA Magnetic Agarose Beads

To assess the solubility of RFP and selected RFP-fusion proteins, Ni<sup>2+</sup>-NTA magnetic agarose beads (Qiagen) were used in combination with fluorescence microscopy. Infiltrated leaves were homogenized as described above and 400 µL PBS and 50 µL of Ni-NTA magnetic agarose beads were added directly to the pulp. After incubation for 30 min with continuous shaking, the beads were collected with a magnet, washed intensively with PBS and analyzed under a fluorescence microscope (Leica DMI6000/AF6000).

#### Mouse Immunization and Titer Determination

Three BALB/c mice 6–8 weeks of age (Taconic) were immunized intraperitoneally with 50 µg E5 emulsified in Gerbu MM adjuvant (Gerbu Biotechnik GmbH) on days 1, 14, 28, and 42. On each of these days, 5-µL blood samples were collected from the tail vein, diluted 1:10 in PBS and stored at –20 °C. Fourteen days after the last immunization, the mice were anesthetized, and blood was collected by cardiocentesis. After 1 h incubation at room temperature, the blood was centrifuged (200 g for 2 min at room temperature) and the serum was separated from the blood cells and stored at –20 °C. The titers were determined in relation to the respective pre-immune samples taken directly before immunization, as previously described by Voepel et al. (2014), using the full fusion antigen as well as the single-domain RFP-fusion proteins (100 ng/well) to coat microtiter plates for analysis by direct ELISA.
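The titer calculation can be illustrated with a short sketch. The legend of Figure 8 defines the titer as the dilution that gives more than twice the value of the pre-immune serum and reports the geometric mean across animals. Reading this as the highest dilution that still meets the criterion is our assumption, and all absorbance values below are invented for illustration.

```python
import math


def endpoint_titer(dilutions, immune_signals, preimmune_signals, factor=2.0):
    """Highest dilution whose immune signal exceeds factor x the pre-immune signal (0 if none)."""
    titer = 0
    for dilution, immune, pre in zip(dilutions, immune_signals, preimmune_signals):
        if immune > factor * pre:
            titer = max(titer, dilution)
    return titer


def geometric_mean(values):
    """Geometric mean of positive values, suited to titers that span orders of magnitude."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical ELISA readings for one animal (reciprocal dilutions and absorbances).
dilutions = [1_000, 10_000, 100_000, 1_000_000]
immune = [2.0, 1.2, 0.5, 0.12]
preimmune = [0.1, 0.1, 0.1, 0.1]

titer = endpoint_titer(dilutions, immune, preimmune)
group_titer = geometric_mean([100, 10_000])  # hypothetical titers from two animals
```

The geometric mean is preferred over the arithmetic mean here because a single high-titer animal would otherwise dominate the group value.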

#### Immunofluorescence Assays

Indirect immunofluorescence assays with E5-specific immune sera were carried out using the schizonts and macrogametes of P. falciparum strain NF54 as previously described by Boes et al. (2014), Voepel et al. (2014), and Beiss et al. (2015).

#### RESULTS

#### Expression Screening of Single-Domain RFP-Fusion Proteins

To select suitable antigens and/or antigen domains for a multi-domain fusion vaccine from a panel of candidates, we generated N-terminal RFP-fusion constructs to easily investigate the accumulation, solubility, integrity, and heat stability of each protein by transient expression in N. benthamiana.

The initial selection of antigens and antigen domains was primarily based on the availability of epidemiological and/or experimental data indicating the immunological relevance or potential protective efficacy of the antigens or antigen domains. The heat treatment of clarified or crude plant extracts is a valuable tool to simplify the downstream purification of recombinant protein from plant material because this step removes the majority of host cell proteins (HCPs) including RuBisCo (Buyel et al., 2014) but this approach is only suitable for recombinant proteins that resist temperature-induced denaturation. In this approach we wanted to avoid the heat sensitivity we observed in previous experiments with larger, more complex, or unstructured P. falciparum proteins (data not shown). Therefore, we focused on compact and stable domains featuring many disulfide bridges, including EGF-like (Beeby et al., 2005) and thrombospondin type 1 repeat (TSR)-domains that are present in many P. falciparum surface proteins.

We duly selected 14 P. falciparum antigens or antigen domains (**Table 1**) from different developmental stages of the parasite (pre-erythrocytic stage PfCelTOS, PfCSP\_TSR, Pf TRAP\_TSR, and Pf SPATR; blood stage Pf Msp1-19\_EGF1, Pf Msp4\_EGF, Pf Msp8\_EGF1, Pf Msp8\_EGF2, Pf Msp10\_EGF1, Pf Msp10\_EGF2, Pf MTRAP\_TSR, and Pf Ripr\_EGF6; and sexual stage Pfs25 and Pfs230-C0). Synthetic genes encoding each antigen were codon optimized for N. benthamiana, cloned into the expression cassette of the vector pTRAkc-RFP-ERH (**Figure 1**), and introduced into A. tumefaciens, which was then used for transient expression in N. benthamiana by the syringe infiltration of single leaves from intact plants. The vectors pTRAkc-RFP-ERH and pTRA-RFP-ZenH, the latter encoding an RFP fusion with the maize γ-zein domain that is targeted to protein bodies (Hofbauer et al., 2014), were used as controls for insoluble aggregates. After incubation for 4–5 days, the leaves were harvested and RFP fluorescence was visualized using a simple red filter with a cold light source and a green excitation filter. As shown in **Figures 2A,B**, representative leaves under normal and fluorescent light display different degrees of RFP fluorescence for each of the constructs. A strong signal was seen in the leaves infiltrated with the controls, ER-retained RFP and protein body-targeted RFP (ZenH), as well as RFP-Pf Msp1-19\_EGF1 (19\_1), RFP-Pf Msp8\_EGF1 (8\_1), RFP-Pf Msp8\_EGF2 (8\_2), RFP-Pf TRAP\_TSR (TT), RFP-PfCSP\_TSR (CT), RFP-Pf Msp10\_EGF1 (10\_1), RFP-Pf Msp10\_EGF2 (10\_2), and RFP-Pf Msp4\_EGF1 (4). Moderate signals were observed for RFP-Pf Ripr\_EGF6 (R6), RFP-PfCelTOS (Ce), RFP-Pf MTRAP\_TSR (MT), RFP-Pfs230-C0 (C0), and RFP-Pfs25 (25). RFP-Pf SPATR\_TSR (ST) exhibited minimal RFP fluorescence (see also **Table 1**, column 7).

FIGURE 2 | … different RFP-fusion constructs under white light (A) and under green light (B).

The soluble RFP fusion proteins were quantified by extracting total soluble protein from leaf discs taken from infiltrated leaves (three biological replicates) and taking spectroscopic fluorescence measurements before and after heat treatment (5 min at 70 °C). The concentrations were determined relative to the RFP calibration curve as summarized in **Figure 3** and also **Table 1** (column 8). These results indicated that all the RFP-fusion proteins were heat stable because no significant reduction in the fluorescence signal was observed after heat treatment (**Figure 3**, red columns). A strong difference between the visual appearance (**Figure 2B**) and measured fluorescence values was observed for RFP-Pf Msp10\_EGF1, RFP-Pf Msp10\_EGF2, RFP-Pf TRAP\_TSR, and RFP-ZenH (**Figure 3**).
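The before/after comparison can be summarized per construct as a mean, a standard deviation, and the fraction of signal retained after heating. The sketch below uses invented triplicate values and hypothetical function names; it illustrates the bookkeeping only and does not reproduce the analysis behind Figure 3.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Mean and sample standard deviation of biological replicates."""
    return mean(replicates), stdev(replicates)


def heat_retention(crude, heated):
    """Fraction of the mean signal that remains after heat treatment."""
    return mean(heated) / mean(crude)


# Hypothetical triplicate measurements (µg/g FLW) for a single construct.
crude = [100.0, 110.0, 90.0]
heated = [98.0, 105.0, 97.0]

crude_mean, crude_sd = summarize(crude)
retained = heat_retention(crude, heated)
```

A retained fraction close to 1, with overlapping standard deviations, corresponds to the situation described above in which heat treatment causes no significant loss of fluorescence.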

To investigate the possibility that insoluble aggregates may form during expression, we analyzed six representative samples (non-infiltrated, RFP, RFP-ZenH, RFP-Pf Msp1-19\_EGF1, RFP-Pf Msp10\_EGF1, and RFP-Pf SPATR\_TSR) by microscopy after capturing the recombinant protein and/or aggregates from the crude extract on Ni-NTA magnetic agarose beads. The images (**Figure 4**, left and right panels) clearly illustrate the differences between predominantly soluble proteins like the ER-targeted RFP and the RFP-Pf Msp1-19\_EGF1 fusion (even distribution of fluorescence signal), and insoluble or aggregated proteins (fluorescent berry-like structures) like the protein body-targeted RFP-ZenH and apparently the RFP-Pf Msp10\_EGF1 fusion. For the low-yielding RFP-Pf SPATR\_TSR protein, we observed the soluble phenotype with no indication of insoluble aggregates.

Finally, the expression, solubility, integrity, and heat stability of all RFP-fusion proteins were analyzed by reducing SDS-PAGE and western blot (**Figures 5A,B**) using the supernatant of heat-treated soluble extracts. Most of the recombinant proteins migrated in the region corresponding to their expected molecular weight (**Table 1**). Both methods also revealed the presence of covalently linked dimers of the RFP-fusion proteins, an observation we have made with almost all RFP-fusion proteins we have generated and tested under comparable conditions. In accordance with these observations, the RFP-Pf SPATR\_TSR fusion protein (**Figures 5A,B**, lane 14) and also the insoluble RFP-Pf Msp10\_EGF1 fusion protein (**Figures 5A,B**, lane 7) could not be detected. For the RFP-PfCelTOS fusion protein (**Figure 5B**, lane 12), a western blot to detect the His<sup>6</sup> tag revealed the presence of small amounts of degradation products in the regions around 40 kDa (D3, **Figure 5B**) and 18 kDa (D1, D2, **Figure 5B**), potentially representing proteolytic cleavage. The band intensities in Coomassie-stained gels and on western blots did not always correlate well (e.g., RFP-PfCSP\_TSR, **Figure 5B**, lane 12, and Pfs230-C0, **Figure 5B**, lane 16), possibly indicating differences in the accessibility of the His<sup>6</sup> tag for the different P. falciparum proteins. The pronounced difference between the in-gel and western blot signals representing Pfs230-C0 (**Figures 5A,B**, lane 16) may reflect the proteolytic cleavage of a C-terminal fragment comprising the His<sup>6</sup> tag (visible at >15 kDa on the western blot, D4, **Figure 5B**).

#### Construction of the Pre-erythrocytic Vaccine Candidate P3

Based on the screening results for protein domains fused to RFP, we selected PfCelTOS (Ce), PfCSP\_TSR (CT), and Pf TRAP\_TSR (TT) for the construction of a pre-erythrocytic vaccine candidate. The pre-erythrocytic antigen Pf SPATR\_TSR was rejected due to its poor expression as an RFP-fusion protein, while the apparently poorly soluble Pf TRAP\_TSR (TT) was included because of its clinical relevance and in the hope that its solubility would improve in the different context of a new fusion protein.

To investigate the integrity, expression, and heat stability of the domains subsequently combined in the context of a fusion protein, we generated three expression constructs (P1, P2, and P3, **Figure 6A**) and tested them by transient expression followed by heat precipitation, SDS-PAGE and western blot. As shown in **Figure 6B**, all three variants, including the artificial fusion proteins P2 and P3, could be produced at satisfactory levels (>100 µg/g fresh leaf weight (FLW), judged from in-gel band intensities after Coomassie staining on the basis of our previous experience) and remained in the soluble fraction after heat treatment. Bands corresponding to P1 (calculated size 19.3 kDa, observed size 19–20 kDa), P2 (calculated size 27.7 kDa, observed size 32–34 kDa), and P3 (calculated size 33.9 kDa, observed size 38–40 kDa) were clearly represented on the stained gel and the corresponding His<sup>6</sup> tag-specific western blot representing extracts after heat treatment, indicating that the combination of PfCelTOS with the TSR domains of PfCSP\_TSR and Pf TRAP\_TSR within a fusion construct yields an intact and heat-stable protein that combines important complementary P. falciparum pre-erythrocytic antigens. For the fusion proteins P2 and P3, the western blot revealed the formation of reduction-insensitive higher-molecular-weight aggregates with an apparent molecular weight of the corresponding dimers (64–68 kDa for the P2 dimer and 76–80 kDa for the P3 dimer), whereas only a small amount of the C-terminal (His6-tagged) degradation product was observed at 11 kDa, suggesting that the partial proteolytic degradation of PfCelTOS as observed in the expression screening of single domain RFP-fusion proteins (**Figure 5**) does not occur to the same extent in the context of the P3 fusion protein.

FIGURE 3 | Quantification of RFP-fusion proteins in plant extracts before and after heat treatment by fluorescence detection. The accumulation of RFP-fusion proteins was determined by fluorescence quantification compared to an affinity-purified RFP-derived calibration curve. Values are expressed as the mean of three biological replicates including standard deviations. Gray columns, crude extract; red columns, extract after heat treatment; lanes 3–16 contain the samples identified in Table 1. FLW, fresh leaf weight.

FIGURE 4 | Visualization of insoluble aggregate formation. Analysis of selected RFP-fusion proteins after purification from plant crude extracts using Ni-NTA magnetic agarose beads, showing the difference between soluble and insoluble proteins. Wt, non-infiltrated leaf extract; RFP, RFP construct; 19\_1, RFP-PfMsp1-19\_EGF1 construct; ST, RFP-PfSPATR\_TSR; RFP-ZenH; 10\_1, RFP-PfMsp10\_EGF1. Left panel, transmission image; right panel, fluorescence image.

#### Construction, Expression, Purification, and Characterization of the Multi-EGF Dual-Stage Vaccine Candidate E5

EGF-like domains are present in several P. falciparum surface proteins, predominantly on the merozoites and schizonts (blood stage) but also on the zygotes, macrogametes, and ookinetes (sexual stage). Whereas GPI-anchored merozoite surface proteins such as Pf Msp4 and Pf Msp5 feature single EGF-like domains, Pf Msp1, Pf Msp8, and Pf Msp10 feature a tandem array of two EGF-like domains, and the sexual stage antigen Pfs25 is composed of four EGF-like domains. The merozoite surface antigen Pf Ripr contains a total of 10 EGF-like domains arranged in a tandem followed by a separate cluster of eight EGF-like domains. It has been shown that EGF-like domains are the target of parasite growth inhibitory antibodies in humans (Egan et al., 1999; O'Donnell et al., 2001; Maskus et al., 2015) and/or immunized animals (Chappel and Holder, 1993; Chen et al., 2011) and improve the antigenicity of the antigens (Wang et al., 1999). The high expression levels and thermal stability of most of the EGF-like domain constructs in our panel prompted us to combine a number of such domains from P. falciparum merozoites with the sexual stage antigen Pfs25 in the context of a multi-EGF dual-stage vaccine candidate antigen. Taking into account the expression levels and solubility data obtained in the RFP-fusion expression screening experiments, we omitted Pf Msp10\_EGF1/2. Using a domain stacking approach, a series of five sequentially elongated constructs (E1–E5) was generated, the largest of which was E5 comprising Pf Msp1-19\_EGF1, Pf Msp8\_EGF1/2, Pf Msp4\_EGF, and Pfs25 (**Figure 7A**). Heat-treated extracts from syringe-infiltrated leaf material at 5 dpi were analyzed by SDS-PAGE and western blot. As shown in **Figure 7B**, the smallest expression construct (construct E1, featuring Pf Msp1-19\_EGF1 alone) was not detected in the heat-treated extract by a His<sup>6</sup> tag-specific western blot, but the presence of the protein was confirmed, alone and in the context of the different successor fusion proteins, using a conformational, reduction-sensitive Pf Msp1-19\_EGF1-specific monoclonal antibody in a dot-blot under native conditions (**Figure 7C**). Although construct E2 combining Pf Msp1-19\_EGF1 and Pf Msp4\_EGF showed weak expression, the larger variants E3, E4, and E5 accumulated to satisfactory levels that were easily detected by Coomassie staining. A 15 kDa C-terminal degradation fragment comprising the His<sup>6</sup> tag observed in the case of construct E3 and a weaker 25 kDa His<sup>6</sup>-tagged degradation fragment observed for construct E5 indicate that the fusion proteins are not completely resistant to proteolytic cleavage.

To provide material for immunization studies, E5 was produced and purified from 200 g of vacuum-infiltrated N. benthamiana leaves. Purification by IMAC and IEX yielded a final 4 mg preparation of highly pure (>90%) protein (**Figure 8A**) corresponding to 15 µg of purified E5 per gram of fresh leaf material.

The purified E5 fusion antigen was used to immunize mice and the resulting immune sera were analyzed for E5 titers as well as single domain-specific titers by ELISA. As shown in **Figure 8B**, the immune response was directed against all domains and the average titer against the single domains ranged from around 70,000 for Pf Msp1-19\_EGF1 to around 450,000 for Pfs25. The titer measured against the full fusion antigen was >500,000. For crude ELISA data, refer to Data Sheet 1. Additionally, the ability of the E5 fusion antigen to induce antibodies that recognize P. falciparum antigens in their native context was determined by immunofluorescence analysis involving different parasite preparations from the blood and sexual stages. **Figures 8C,D** show that the E5-specific mouse immune sera bind to parasite surface proteins in their native context (schizonts, blood stage parasites, **Figure 8C**, and macrogametes, sexual stage parasites, **Figure 8D**) and provide an indication that Pf EGFs are correctly folded when combined within the E5 fusion protein.

#### Construction and Expression of the Multi-Domain, Multi-Stage Vaccine Candidate M8

After gaining promising results with the P3 and E5 fusion proteins, we next used the transient expression platform for the stepwise generation of the multi-domain, multi-stage vaccine candidate fusion protein M8. Starting from a four-EGF-domain variant featuring Pf Msp1-19\_EGF1, Pf Msp8\_EGF1/2, and Pf Msp4\_EGF, we subsequently added seven additional previously screened domains (**Figure 9A**) representing all three main parasite stages, and analyzed the accumulation and heat stability of the fusion proteins by SDS-PAGE and western blot. **Figure 9B** shows that all eight multi-domain fusion proteins were detected in heat-treated crude extracts when the proteins were separated by SDS-PAGE under reducing conditions and stained with Coomassie (**Figure 9B**, left side). The proteins were also detected by His<sup>6</sup> tag-specific western blot (**Figure 9B**, right side). In both cases, the proteins migrated as expected for their molecular weights. The western blot revealed a number of higher-molecular-weight aggregates, especially for the larger fusion proteins (**Figure 9B**, right side, lanes 3–8), as well as small amounts of (His6-tagged) degradation products. Two of the degradation products previously observed for RFP-PfCelTOS (D1, D2, **Figure 5B**) were also observed for the PfCelTOS-containing fusion proteins (**Figure 9B**, right side, lanes 3, 7, and 8) with increasing size correlating with the size of the fusion partners distal to the PfCelTOS component.

FIGURE 8 | Initial characterization of E5-specific immune responses. (A) SDS-PAGE analysis of purified E5. M, PageRuler™ pre-stained protein ladder (Fermentas); E5, 15 µg of purified E5 under reducing conditions. (B) Domain-specific titer analysis of E5-specific murine immune sera. Titers were derived by direct-coating ELISA against purified single-domain RFP-fusions (data not shown) and are defined as the dilution that gives more than twice the value of pre-immune serum. Titers for the individual animals are given (M1–M3) as well as the geometric mean (horizontal solid black line). For domain identification refer to Table 1. (C,D) Immunofluorescence assay of P. falciparum NF54 parasites at two different stages. (C) Schizonts (blood stage) and (D) macrogametes (sexual stage) were fixed with methanol on the surface of a slide. Detection with IgGs from mice immunized with E5 is shown as a representative example. Rabbit antisera against PfMsp1-19 (schizonts) and Pfs25 (macrogametes) were used as positive controls. Rabbit controls were visualized with an anti-rabbit secondary antibody labeled with Alexa Fluor 594 (red) whereas murine immune IgG was visualized with a secondary Alexa Fluor 488-labeled anti-murine antibody (green). (I) murine immune IgG; (II) counterstaining with stage-specific rabbit antiserum; (III) neutral mouse serum; (IV) counterstaining with stage-specific rabbit antiserum.

#### DISCUSSION

#### RFP-Fusion Protein Based Expression Screening as Vaccine Development Tool

Over the last few decades, a large number of P. falciparum antigens from different lifecycle stages have been proposed as potential malaria vaccine candidates (Moorthy and Hill, 2001; Todryk and Hill, 2007). Many of them have been tested in preclinical animal studies, some yielding promising results, others yielding ambiguous, or negative outcomes. Taken together, the available data suggest that one route toward a more efficacious malaria vaccine could be the combination of different antigens or antigen domains to broaden the immune response against multiple targets, even targets from different stages to achieve multi-stage protective efficacy (Hill, 2011).

Based on these considerations, we implemented a malaria vaccine development program using our well-established plant transient expression platform based on A. tumefaciens. The screening of RFP-fusion protein expression provided an efficient tool to rapidly assess the expression and solubility of large numbers of vaccine antigens. RFP supports both N-terminal and C-terminal fusions, accumulates to high levels in different subcellular compartments and has been used to analyze protein targeting and/or localization (Pasare et al., 2013), as a visual marker for transgene expression (Rademacher et al., 2008; Sack et al., 2015), to address and compare the efficiency of transfection methodologies (Leuzinger et al., 2013), and to develop predictive models for transient gene expression in tobacco (Buyel and Fischer, 2012). The strong fluorescence of the mature homotetrameric structure is simple to detect and quantify, making it an ideal reporter for expression screening.

The syringe infiltration of single leaves allows the straightforward comparative testing of up to six constructs in a single plant, using an A. tumefaciens suspension prepared from a 5-mL culture grown in standard test tubes. Visual inspection of RFP fluorescence in planta provides the first readout of construct functionality, whereas solubility and heat stability can be assessed visually at the macroscopic or microscopic levels in planta, and after extract preparation, processing, and centrifugation. Heat stability is favorable because it facilitates the removal of HCPs during downstream processing (Buyel et al., 2014; Voepel et al., 2014; Beiss et al., 2015) and potentially translates into favorable storage and shelf-life properties. In the context of our panel of 14 candidates, this screen quickly identified poorly expressed proteins such as Pf SPATR\_TSR and Pf Msp10\_EGF2, as well as insoluble proteins such as Pf Msp10\_EGF1 and Pf TRAP\_TSR.

In addition to immunological and clinical considerations, information about domain folding and the number of disulfide bridges was an important criterion during the selection of our panel of P. falciparum antigens. In contrast to the large, often redundant and unstructured characteristics of P. falciparum surface antigens (Feng et al., 2006), both EGF-like and TSR domains are highly structured and rich in disulfide bridges. This pre-selection process explains why we did not find any heat-sensitive proteins during the screen. The use of RFP as a fusion partner provided an additional advantage because it enabled the analysis of small proteins (like single EGF-like domains) which sometimes do not accumulate to detectable levels when expressed alone. One example is Pf Msp1-19\_EGF1, which was subsequently introduced successfully into multi-domain fusion proteins (Boes et al., 2015). Furthermore, RFP-domain fusion proteins are also needed for the deconvolution of antibody responses against vaccine candidates comprising multiple domains and components.

#### Assembly of the Pre-erythrocytic Stage Candidate P3

After selecting soluble, heat-stable pre-erythrocytic stage antigens using the transient expression platform to screen RFP-fusion proteins, we generated sequentially elongated fusion protein variants and developed the novel pre-erythrocytic multidomain vaccine candidate P3. We confirmed that P3, the largest variant composed of three different antigens or antigen domains, was expressed as an intact fusion protein at satisfactory levels and remained soluble after heat treatment to remove HCPs.

P3 was thus regarded as a promising pre-erythrocytic vaccine candidate and was selected for further investigations. The production, purification, and characterization of P3 (renamed CCT in later studies) have been described in detail elsewhere, including results confirming the in vitro inhibition of pre-erythrocytic parasite stages by P3-specific mouse immune sera (Voepel et al., 2014). P3 was also selected as part of a plant-derived multi-stage vaccine cocktail that underwent detailed characterization in rabbit immunization studies and in vitro parasite inhibition assays at different life cycle stages (Boes et al., 2015). These studies confirmed its pre-erythrocytic stage efficacy (up to 80% inhibition of pre-erythrocytic parasites in vitro) following the immunization of rabbits. Upstream production and downstream process development for the large-scale transient expression of P3 in N. benthamiana are presented in a separate publication in this issue, including the optimization of buffer conditions and the heat treatment process (Menzel et al., in preparation).

#### Construction, Expression, Purification, and Characterization of the Multi-EGF Dual-Stage Vaccine Candidate E5

By analogy to the construction of the single-stage (pre-erythrocytic stage) multi-domain vaccine candidate P3, we generated the dual-stage (blood stage and sexual stage) multi-domain vaccine candidate E5 as an assembly of EGF-like domains from three different blood stage antigens and one sexual stage antigen. The EGF-like domain is a small (30–40 amino acids), well-conserved fold found in many proteins from diverse species and typically features three intradomain disulfide bonds (Wouters et al., 2005). Even though the EGF-like domain is stable, our initial experiments with the EGF-like domains from different P. falciparum antigens revealed large differences in expression and accumulation (data not shown). The RFP-fusion expression screening approach made it possible to reject problematic EGF-like domains such as Pf Msp10\_EGF1 and Pf Msp10\_EGF2 and to combine only strongly expressed, soluble domains to construct the E5 vaccine candidate antigen.

The purified protein was used in an initial mouse immunization study to determine the titers against the full-size fusion protein and its individual domains. The overall titers of >500,000 indicated robust immunogenicity, although the small number of animals (three) used in this study prevented a thorough statistical evaluation. The strong immune response against the largest component (Pfs25, four EGF-like domains) compared to the other components (Pf Msp1-19\_EGF1, Pf Msp8\_EGF1, Pf Msp8\_EGF2, and Pf Msp4\_EGF, each featuring one EGF-like domain) matches our observations concerning multi-domain, multi-component vaccines in previous studies (Boes et al., 2015; Spiegel et al., 2015). This supports the positive correlation between component size and immune response and thereby provides useful guidelines for the development of a multi-domain, multi-stage, multi-allele vaccine cocktail.

#### Construction and Expression of the Multi-Domain, Multi-Stage Vaccine Candidate M8

We also used the workflow described above for the stepwise construction and expression of a large multi-domain, multi-stage vaccine candidate named M8. Based on the selection of P. falciparum antigen domains with confirmed strong expression, integrity, heat stability, and suitability as components of fusion antigens, we were able to produce a complex fusion protein consisting of 11 different promising vaccine candidate antigens and antigen domains in the context of a heat-stable, moderately expressed, and proteolytically stable protein. Following the optimization of downstream processing, the M8 vaccine candidate will be produced, purified, and tested in animal immunization studies and parasite growth inhibition assays using the three main stages of the P. falciparum life cycle. These results will be particularly interesting in light of the conclusions drawn from our earlier immunization experiments with the pre-erythrocytic vaccine candidate P3 (Voepel et al., 2014), a sexual stage vaccine candidate (F0) containing Pfs25 and Pfs230\_Co (Beiss et al., 2015), a malaria multi-component vaccine cocktail (including, among others, P3 and F0; Boes et al., 2015), and multi-stage fusion proteins (Spiegel et al., 2015). These studies showed that combining candidates that individually provide good in vitro efficacy may lead to an unwanted reduction of titers against single components due to antigenic competition. On the other hand, poorly immunogenic components such as Pf Msp1-19 seem to profit from presentation in the context of fusion proteins that provide additional T-cell epitopes. Taken together, the available data make it clear that these effects are poorly predictable. We observed a disproportionately high immune response against PfCSP\_TSR in the context of P3, whereas in the studies with larger fusion proteins and cocktails the distribution of component-specific titers roughly correlated with the molar quantity of the component within the mixture and/or fusion. It will be interesting to see what advantages and disadvantages can be observed in the case of M8, which combines interesting pre-erythrocytic, sexual, and blood stage antigens in the context of a large fusion protein.

#### CONCLUSIONS AND FUTURE PERSPECTIVES

In this study we have demonstrated the successful application of a plant-based transient expression platform as an essential tool for the development of multi-domain, multi-stage malaria vaccine candidates. The simple and robust workflow based on syringe infiltration requires only a small number of plants and small culture volumes. This allows the convenient analysis of several samples in parallel, with a time frame from gene to protein of <10 days. The method can easily be scaled up from the 100-µg scale (leaf discs), realized by syringe infiltration of leaves, to the 100-g scale (whole leaves), realized by vacuum infiltration of whole plants, enabling the convenient generation of mg amounts of target protein for advanced analytics or animal immunization studies. Because it provides a generic toolbox, the established workflow and methodologies can be readily adapted to other vaccine candidate identification and development tasks.

#### AUTHOR CONTRIBUTIONS

HS participated in the design of the experiments, performed the experiments, performed data interpretation, and drafted the manuscript. AB participated in the design of the experiments, performed the experiments, performed data interpretation, and drafted the manuscript. NV participated in the work related to the pre-erythrocytic stage vaccine candidate P3 and helped in drafting the manuscript. VB was involved in antigen selection and performed the cloning of sexual stage antigen constructs and helped in drafting the manuscript. GE was involved in antigen selection and performed the cloning of TSR-domain fusion constructs and helped in drafting the manuscript. TR participated in the design of the experiments and performed the cloning and testing of RFP-control constructs. MS participated in the study design and the data interpretation and revised the manuscript. SS participated in study design and coordination, and helped to draft the manuscript. AR participated in the study design and the data interpretation, and revised the manuscript. RF conceived of the study, participated in its design and revised the manuscript.

#### FUNDING

This work was funded by the Fraunhofer-Zukunftsstiftung (Fraunhofer Future Foundation).

#### ACKNOWLEDGMENTS

MRA-94 (mAb 5.2) deposited by David C. Kaslow was obtained through the MR4 as part of the BEI Resources Repository, NIAID, NIH. The authors are grateful to Ibrahim Al Amedi (RWTH Aachen University, Germany) for cultivating the tobacco plants used in this study. We thank Matthias Scheuermayer (University of Wuerzburg) for performing the immunofluorescence assays and Thomas Schmelter (Fraunhofer IME, Aachen, Germany) for assistance with image acquisition. We also thank Leonie Fritsch for assistance with the Agilent 2100 Bioanalyzer and Julia Zischewski for providing genomic DNA. We thank Alessandro Vitale (Institute of Agricultural Biology and Biotechnology, Milano) for providing a plasmid featuring the protein body targeting sequence. We also thank Dr. Richard M Twyman (Twyman Research Management Ltd., York, UK) for manuscript editing.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01169

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Spiegel, Boes, Voepel, Beiss, Edgue, Rademacher, Sack, Schillberg, Reimann and Fischer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# N-Glycosylation of Cholera Toxin B Subunit: Serendipity for Novel Plant-Made Vaccines?

#### Nobuyuki Matoba\*

*Department of Pharmacology and Toxicology and Owensboro Cancer Research Program of James Graham Brown Cancer Center, University of Louisville School of Medicine, Owensboro, KY, USA*

The non-toxic B subunit of cholera toxin (CTB) has attracted considerable interest from vaccinologists due to its strong mucosal immunomodulatory effects and potential utility as a vaccine scaffold for heterologous antigens. Along with other conventional protein expression systems, various plant species have been used as production hosts for CTB and its fusion proteins. However, it has recently become clear that the protein is *N*-glycosylated within the endoplasmic reticulum of plant cells, a eukaryotic post-translational modification that is not present in native CTB. While functionally active aglycosylated variants have been successfully engineered to circumvent potential safety and regulatory issues related to glycosylation, this modification may actually provide advantageous characteristics to the protein as a vaccine platform. Based on data from our recent studies, I discuss the unique features of *N*-glycosylated CTB produced in plants for the development of novel vaccines.

#### Edited by:

*Edward Rybicki, University of Cape Town, South Africa*

#### Reviewed by:

*Markus Sack, RWTH Aachen University, Germany; Ann Meyers, University of Cape Town, South Africa*

> \*Correspondence: *Nobuyuki Matoba n.matoba@louisville.edu*

#### Specialty section:

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

Received: *16 October 2015* Accepted: *29 November 2015* Published: *22 December 2015*

#### Citation:

*Matoba N (2015) N-Glycosylation of Cholera Toxin B Subunit: Serendipity for Novel Plant-Made Vaccines? Front. Plant Sci. 6:1132. doi: 10.3389/fpls.2015.01132*

Keywords: Cholera toxin B subunit, N-glycosylation, plant-made pharmaceutical, subunit vaccine, C-type lectin receptors

#### INTRODUCTION

Cholera toxin B subunit (CTB) is a non-toxic component of cholera holotoxin, the virulence factor of Vibrio cholerae (Baldauf et al., 2015). The subunit non-covalently assembles into a homopentamer structure, which allows for high-affinity interaction with its receptor GM1 ganglioside present on the surface of mammalian cells. A recombinant, bacterial fermentation-derived CTB is included in an oral cholera vaccine (Dukoral®), which has been used in Sweden since 1991 and was granted a marketing authorization throughout the European Union by the European Commission in 2004 (European Medicines Agency, 2014). Accordingly, CTB represents one of a few recombinant subunit vaccines currently approved for human use, and it is the only one that is capable of eliciting an effective immune response via oral delivery. Upon oral administration, CTB induces a robust antibody response in systemic and mucosal compartments, thereby neutralizing the holotoxin secreted by the bacteria. Such strong oral immunogenicity makes CTB among the most potent mucosal immunogens described to date (Lycke, 2012), and therefore the protein provides an attractive vaccine platform for the induction of a protective antibody response to heterologous antigens. Meanwhile, recent studies have shown that CTB has unique anti-inflammatory activity against immunopathological conditions in allergy and inflammatory diseases (reviewed in Sun et al., 2010; Baldauf et al., 2015). For example, oral administration of CTB was shown to mitigate Crohn's disease in humans (Stål et al., 2010). A human 60 kD heat-shock protein (HSP60)-derived peptide, p336–351, was chemically linked to CTB, and this CTB conjugate protein (p336–351-CTB) was shown to prevent relapses of uveitis in Behcet's disease in a phase I/II clinical trial (Stanford et al., 2004). Collectively, CTB is a multifunctional mucosal immunomodulatory protein that serves not only as a cholera vaccine antigen, but also as a molecular scaffold for novel mucosal vaccines and immunotherapeutics. Numerous studies have explored such possibilities for various diseases, which are reviewed elsewhere (Baldauf et al., 2015; Stratmann, 2015).

Since the late 1990s, a variety of plant species have been used to constitutively or transiently express CTB and CTB-antigen fusion proteins, including tobacco (Nicotiana tabacum and N. benthamiana), potato, rice and tomato, among others (reviewed in Baldauf et al., 2015). These studies have shown that plant-expressed CTB proteins formed pentameric structures, retained binding affinity to GM1-ganglioside and induced relevant antibody responses upon mucosal immunization. However, we and others have recently shown that plant-expressed CTB (with the exception of chloroplast-targeted expression, e.g., Daniell et al., 2001) is N-glycosylated within the endoplasmic reticulum (ER) of plant cells, a eukaryotic post-translational modification not present in the original protein (Mishra et al., 2006; Matoba et al., 2009; Yuki et al., 2013). While this modification does not appear to compromise CTB's principal bioactivity, i.e., mucosal immunogenicity, plant-specific glycoforms may lead to potential safety issues such as hypersensitivity or allergy (Dicker and Strasser, 2015). It should be noted that plant-specific glycosylation per se does not necessarily pose an additional regulatory risk in biopharmaceutical development unless there is evidence for product-specific safety and/or efficacy issues found in preclinical or clinical studies. In fact, no major adverse event associated with plant-specific glycosylation has been reported for plant-made biopharmaceuticals that have obtained regulatory approval for marketing or emergency use [e.g., carrot cell-produced β-glucocerebrosidase (Grabowski et al., 2014; Pastores et al., 2014) and an N. benthamiana-produced H5N1 avian influenza virus-like particle vaccine (Landry et al., 2010; Ward et al., 2014), respectively]. Nevertheless, glycosylation would add a regulatory complication because of glycan heterogeneity. As a consequence of these theoretical concerns, N-glycosylated CTB might be viewed as inferior to the non-glycosylated counterpart unless there is a good reason to keep the modification. Based on our recent findings, potential advantages of CTB glycosylation for vaccine development are discussed below.

#### N-GLYCOSYLATION OF CTB IN PLANTS

The first experimental evidence for CTB glycosylation in planta was reported by Mishra et al. (2006). The authors showed that CTB expressed in transgenic tobacco was modified with a ∼3 kD glycan (per monomer), which was demonstrated by Schiff's test, concanavalin A binding, as well as chemical and enzymatic deglycosylation. Subsequently, Asn4 of CTB and a CTB-fusion protein were shown to be glycosylated in transgenic N. benthamiana (Matoba et al., 2009; Hamorsky et al., 2013) and transgenic rice (Yuki et al., 2013). Of the two potential N-glycosylation sites in the amino acid sequence of CTB, the one near the C-terminus (Asn90-Lys91-Thr92; **Figure 1A**) was not glycosylated because the sequon is immediately followed by Pro, which is known to abolish N-glycosylation (Jones et al., 2005). **Figure 1B** shows Asn4-linked glycans modeled in the context of a CTB crystal structure. It is apparent that the glycosylation site is exposed and located away from the GM1-ganglioside-binding pocket, suggesting that the oligosaccharides would not affect the protein's receptor binding affinity. This was experimentally demonstrated in our previous studies based on competitive GM1-ganglioside-capture enzyme-linked immunosorbent assays (ELISA) and surface plasmon resonance. Additionally, N. benthamiana-expressed Asn4-glycosylated CTB (gCTB) showed acid and thermal stabilities as well as oral vaccine efficacy for the induction of immunoglobulins (Igs) against cholera holotoxin that were comparable to those of the non-glycosylated original protein (Hamorsky et al., 2015).
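The sequon logic described above (Asn-X-Thr/Ser, with a Pro immediately after the sequon disfavoring glycosylation) can be illustrated with a short script. This is only a minimal sketch: the function name `find_sequons` and the toy peptide are hypothetical and are not taken from the studies cited, and the script flags candidate sites by sequence pattern alone, which cannot replace the experimental confirmation described in the text.

```python
import re
from typing import List, Tuple

# Asn-X-Ser/Thr, where X is any residue except Pro.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq: str) -> List[Tuple[int, str, bool]]:
    """Return (1-based Asn position, tripeptide, followed_by_pro) per sequon.

    followed_by_pro marks sequons whose Ser/Thr is immediately followed by
    Pro, the situation the text describes for the Asn90 site of CTB.
    """
    seq = seq.upper()
    hits = []
    for match in SEQUON.finditer(seq):
        start = match.start()
        tripeptide = seq[start:start + 3]
        followed_by_pro = seq[start + 3:start + 4] == "P"
        hits.append((start + 1, tripeptide, followed_by_pro))
    return hits


if __name__ == "__main__":
    # Hypothetical toy peptide for illustration only (not the CTB sequence).
    toy_peptide = "AAQNITDLGGNKTPHAA"
    for position, tripeptide, followed_by_pro in find_sequons(toy_peptide):
        note = "followed by Pro, likely unused" if followed_by_pro else "candidate site"
        print(f"Asn{position} ({tripeptide}): {note}")
```

In the toy peptide, the first sequon is reported as a candidate site and the second is flagged because a Pro follows it, mirroring the contrast between the two sites discussed above.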

FIGURE 1 | N-Glycosylation of CTB. (A) Amino acid sequence of CTB from *V. cholerae* 569B strain (Protein Data Bank ID: 1FGB). *N*-Glycosylation sequons (Asn-X-Thr/Ser) are boxed. (B) Hypothetical structure images showing CTB homopentamer with high-mannose-type glycans attached to Asn4 positions. Images, top view on the left and side view on the right, were created by the Glyprot *in silico* protein glycosylation tool (http://www.glycosciences.de/modeling/glyprot/) based on the crystal structure of CTB (Protein Data Bank ID: 1FGB) and visualized by RasWin Molecular Graphics (ver. 2.7.5.2). Gray arrows show the positions of GM1-ganglioside-binding pockets. (C) Percent compositions of glycoforms attached to Asn4 of CTB, CTB-KDEL, and CTB-MPR-KDEL expressed in *N. benthamiana*. Data derived from Matoba et al. (2009), Hamorsky et al. (2013, 2015).

**Figure 1C** shows the overall N-glycan profiles of four gCTB proteins that we have expressed in N. benthamiana, either constitutively via nuclear transformation or transiently using a viral vector. Among these, three contained an ER retention signal (KDEL) at the C-terminus. Unlike high-mannose-rich gCTB-MPR-KDEL expressed in transgenic N. benthamiana, gCTB-KDEL produced in transgenic N. benthamiana showed a different glycan profile, with >80% being plant-specific α(1,3)-fucose and/or β(1,2)-xylose glycoforms (Hamorsky et al., 2013). The distinct glycan profiles of these transgenic plant-expressed gCTB proteins likely reflect their difference in subcellular distribution in planta; although both contained a KDEL signal sequence, the fusion protein had a longer extension on the C-terminus due to the 36-amino-acid MPR domain, which might have facilitated KDEL receptor recognition for more efficient ER retention. It is of interest to note that gCTB-KDEL, when transiently overexpressed using a tobamovirus vector, showed an overall similar glycan profile to that of the same protein expressed in transgenic plants (**Figure 1C**), even though the two expression systems had completely different production rates and yields; ∼3 g of gCTB-KDEL were obtained per kg of leaf material in five days in the transient system (Hamorsky et al., 2015), whereas ∼0.1 g of the protein were constitutively expressed per kg of transgenic leaves (Hamorsky et al., 2013). This suggests that the ER retention efficiency of gCTB-KDEL is similar regardless of the speed of protein biosynthesis. Meanwhile, gCTB devoid of KDEL showed a markedly distinct glycan profile with large fractions of complex glycoforms.
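To put the two yields quoted above side by side, the following back-of-the-envelope sketch computes the fold difference and the leaf biomass needed for a given amount of protein. The yields are the approximate values stated in the text; the 1 g target and the function names are assumptions chosen purely for illustration.

```python
# Approximate yields quoted in the text (g of gCTB-KDEL per kg of leaf material).
TRANSIENT_YIELD_G_PER_KG = 3.0    # tobamovirus vector (Hamorsky et al., 2015)
TRANSGENIC_YIELD_G_PER_KG = 0.1   # transgenic plants (Hamorsky et al., 2013)


def fold_difference(high_g_per_kg: float, low_g_per_kg: float) -> float:
    """Ratio between two yields expressed in the same units."""
    return high_g_per_kg / low_g_per_kg


def biomass_needed_kg(target_g: float, yield_g_per_kg: float) -> float:
    """Leaf biomass (kg) needed to obtain target_g of protein at a given yield."""
    return target_g / yield_g_per_kg


if __name__ == "__main__":
    fold = fold_difference(TRANSIENT_YIELD_G_PER_KG, TRANSGENIC_YIELD_G_PER_KG)
    print(f"Transient vs. transgenic yield: about {fold:.0f}-fold")

    target_g = 1.0  # hypothetical target amount, for illustration only
    for label, y in (("transient", TRANSIENT_YIELD_G_PER_KG),
                     ("transgenic", TRANSGENIC_YIELD_G_PER_KG)):
        print(f"{label}: about {biomass_needed_kg(target_g, y):.2f} kg of leaf per {target_g} g")
```

These figures ignore purification losses and batch-to-batch variation, so they indicate only the order of magnitude of the difference between the two systems.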

Altogether, the above findings showed the substantial heterogeneity of N-glycans attached to CTB expressed in plants, which in turn revealed the limitation of the KDEL-based ER retention strategy to control such heterogeneity. Cognizant of potential safety concerns and regulatory complications related to glycan heterogeneity and/or plant-specific glycoforms, we and others have developed aglycosylated CTB mutants for vaccine development; we mutated Asn4 of CTB-KDEL to Ser because the closely related E. coli heat-labile enterotoxin B subunit has Ser at the corresponding position (Hamorsky et al., 2013), while Yuki et al. changed Asn4 of CTB (no KDEL) to Gln (Yuki et al., 2013). Both of these CTB variants were shown in animal models to efficiently elicit cholera holotoxin-neutralizing antibodies upon oral immunization, demonstrating that Asn4 mutations did not affect the protein's vaccine efficacy. These results underscore that plant-made aglycosylated CTB variants can serve as an alternative to the bacteria-produced recombinant protein currently used in an oral cholera vaccine product.

#### POTENTIAL ADVANTAGES OF N-GLYCANS ATTACHED TO CTB

Given that functionally active aglycosylated CTB variants can be produced in plants, why bother considering the protein's N-glycosylation any further? In this regard, we recently found an interesting function of N-glycosylation for transient overexpression of CTB in N. benthamiana. When an aglycosylated CTB variant (N4S-CTB; no C-terminal KDEL) was expressed using a plant virus vector, the protein induced strong ER stress and massive tissue damage, resulting in a poor yield (<10 mg of GM1-ganglioside-binding CTB per kg of leaf material; Hamorsky et al., 2015). In sharp contrast, gCTB (no KDEL) did not show any significant stress response, either at the gene expression (PDI, BiP, and bZIP60) or macroscopic level. Moreover, the protein was very efficiently expressed and accumulated in a functional pentamer form in leaf tissue. The expression level reached up to 3 g of gCTB per kg of leaf biomass, which is among the highest yields for recombinant protein production in plants reported thus far (Hamorsky et al., 2015). Based on data obtained by gene expression and protein ubiquitination analyses, we concluded that the efficient "nursing" of nascent gCTB polypeptides by lectin chaperones (Molinari, 2007; Aebi, 2013) facilitated the assembly of the pentameric protein in the ER, thereby mitigating the unfolded protein response that would otherwise lead to strong ER stress and tissue necrosis. Although the critical role of N-glycosylation in the quality control of newly synthesized proteins has been well known (Helenius and Aebi, 2004; Braakman and Bulleid, 2011), the above study highlighted the significance of such a role for the efficient bioproduction of recombinant glycoproteins in plant-based transient overexpression systems.

Thus, a proven advantage of CTB N-glycosylation is the significant improvement of production yield in plants. Although the aforementioned complications around glycans still need to be addressed, these issues have been under extensive investigation in recent years (Dicker and Strasser, 2015). It is expected that glycoengineering of host plants will soon generate a superior expression platform that can provide recombinant glycoproteins with more uniform, mammalian cell-like glycans (Strasser et al., 2014). In parallel, advances are being made in refining and simplifying glycan analysis technologies that can meet regulatory requirements for biopharmaceutical production (Higgins, 2010; Shubhakar et al., 2015). Given these efforts on two fronts against glycosylation issues, it is the author's opinion that the currently perceived inferiority of N-glycosylated CTB is not an insurmountable challenge. Nevertheless, a higher recombinant production yield alone may not sufficiently justify the development of N-glycosylated CTB for pharmaceutical use when the non-glycosylated counterpart with comparable efficacy and safety profiles can be produced in a different production platform. Two possible scenarios that could represent additional advantages of N-glycosylated CTB over the non-glycosylated counterpart are discussed below.

#### N-Glycans May Alter the B Cell Antigenicity Profile of CTB

Glycans are generally resistant to humoral immune recognition due to poor immunogenicity (i.e., lack of T cell epitopes) and low antigenicity (i.e., high conformational flexibility; Heimburg-Molinaro et al., 2011; Peri, 2013; Amon et al., 2014). This is particularly true for "self" sugar structures that are found in humans; glycosylation of proteins with conserved mammalian sugars generally diminishes product immunogenicity, as discussed in a recent U.S. Food and Drug Administration (FDA) guidance document for immunogenicity assessment of therapeutic proteins (FDA, 2014). Many enveloped viruses take advantage of this unique immunological feature of carbohydrate molecules by using envelope glycans as a "shield" to escape from humoral immunity (Vigerust and Shepherd, 2007). For example, studies have shown that influenza A viruses exploit N-glycans on the globular head of hemagglutinin, where the sialic acid-binding site is located, to mask the critical epitopes recognized by neutralizing antibodies (Tate et al., 2014). Human immunodeficiency virus type-1 (HIV-1) generates neutralization escape mutants in each infected individual by changing its envelope N-glycosylation pattern (Wei et al., 2003). Hepatitis C virus also uses the glycan shield strategy to reduce the humoral immunogenicity of envelope proteins and mask neutralizing epitopes (Helle et al., 2011). These observations point to the possibility of utilizing N-glycosylation to modify CTB's antigenic profile. In line with this notion, a recent study has shown that N-glycosylation of a malaria antigen (PfAMA1) produced in N. benthamiana modified the protein's antigenicity by shielding multiple amino acid epitopes from humoral immune recognition (Boes et al., 2015).

**Figure 2A** shows the reactivity of a commercial anti-CTB antiserum to varying concentrations of gCTB and the bacteria-produced non-glycosylated counterpart. The results clearly show the masking of a significant portion of CTB's surface epitopes that are recognized by the polyclonal antibodies, illustrating the alteration of the protein's antigenicity by Asn4-attached glycans. It is noteworthy that, despite such an antigenic masking effect, gCTB still raised anti-cholera holotoxin IgA and IgG responses comparable to those of native CTB upon oral administration in mice (**Figure 2B**; Hamorsky et al., 2015). These results indicate that Asn4-linked glycans modify the B cell antigenicity profile of CTB without affecting the protein's overall immunogenicity. Accordingly, one testable hypothesis based on these findings is that the glycans may redirect antibodies to recognize CTB's structural domains that are away from the glycosylation site, such as the foreign antigen moiety in the case of CTB-antigen fusions. We have previously observed that N. benthamiana-expressed CTB-MPR, an Asn4-glycosylated CTB-fusion vaccine candidate against HIV-1, could generate a measurable vaginal IgA response to the HIV-1 antigen in intranasally immunized mice, which seemed to be more effective than that induced by E. coli-derived CTB-MPR, although immunization regimens and immunogen qualities in those studies were not comparable (Matoba et al., 2004, 2006, 2009). If this observation is confirmed in a side-by-side comparison study, it will have an important implication for CTB fusion-based vaccine development because the CTB domain tends to be more immunodominant than bystander antigens fused to the scaffold (Matoba et al., 2006). Hence, N-glycosylated CTB may serve as a superior vaccine platform to the non-glycosylated counterpart for the induction of a better antibody response to genetically or chemically fused antigens.

FIGURE 2 | Immune-related effects of N-glycans attached to CTB. The potential impacts of Asn4-glycans on humoral immunity (A, B) and dendritic cell-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) binding (C, D) are shown. (A) Reactivity of a commercial polyclonal anti-CTB antibody product (List Biological Laboratories) to native CTB (Sigma-Aldrich) and Asn4-glycosylated CTB expressed in *N. benthamiana* (gCTB; no C-terminal KDEL attached). An ELISA plate was coated with GM1-ganglioside, to which varying concentrations of native CTB or gCTB were added. The receptor-bound CTB or gCTB was detected by incubation with the polyclonal antibodies followed by anti-goat IgG secondary antibodies, as described previously (Hamorsky et al., 2015). Native CTB and gCTB have a comparable affinity to GM1-ganglioside (Hamorsky et al., 2015). The anti-CTB antibodies recognized native CTB significantly better than gCTB, suggesting antigenic masking or a "glycan shield" effect by Asn4 glycans. \*\**P* < 0.01; \*\*\**P* < 0.001; Two-way repeated measures analysis of variance (ANOVA) with Bonferroni's posttest (GraphPad Prism 5). (B) Serum and fecal anti-cholera holotoxin antibody titers of C57BL/6 mice orally immunized with native CTB or gCTB (3 µg per mouse, twice at a 2-week interval; graphs adapted from Hamorsky et al., 2015, under the Creative Commons Attribution License). (C) DC-SIGN-binding activity of gCTB and an aglycosylated plant-made CTB (N4S-CTB). An ELISA plate was coated with varying concentrations of gCTB, gCTB produced in plants treated with kifunensine (Kif) or N4S-CTB, to which a human DC-SIGN-Fc fusion (Sino Biological) was added. The bound DC-SIGN was detected with an anti-human IgG Fc secondary antibody. (D) gCTB's binding to cell-surface DC-SIGN. Raji cells expressing DC-SIGN were incubated with Alexa Fluor® 488-labeled N4S-CTB-KDEL, gCTB, or gCTB (Kif) at a final concentration of 10 µg/ml, and analyzed by flow cytometry. \*\**P* < 0.01, \*\*\**P* < 0.001; One-way ANOVA with Bonferroni's multiple comparison test (GraphPad Prism 5). Graphs adapted from Hamorsky et al. (2015), under the Creative Commons Attribution License.

#### N-Glycans May Enhance the Antigen-Targeting Ability of CTB via Interaction with C-Type Lectin Receptors

Complex sugars present on microorganisms, cell surfaces and glycoconjugates are capable of eliciting unique signals in the immune system by interacting with C-type lectin receptors. These carbohydrate-binding receptors are abundantly expressed on innate immune cell membranes, most notably on antigen presenting cells such as dendritic cells and macrophages (Drickamer and Taylor, 2015). Since C-type lectin receptors are endocytic, glycosylated antigens are internalized after binding to the receptors and subsequently presented on major histocompatibility complex (MHC) class I and II molecules. Antigen presenting cells can then activate effector or regulatory T cell responses in cooperation with other co-stimulatory signals. An early study showed that mannosylated peptides and proteins were efficiently taken up by dendritic cells via mannose receptors, a type of C-type lectin receptor, resulting in 200–10,000 times more efficient antigen presentation to T cells than non-mannosylated counterparts (Tan et al., 1997). Given this, a number of studies have attempted to exploit C-type lectin receptors to efficiently deliver and present antigens to T cells (Apostolopoulos et al., 2013; Lepenies et al., 2013; van Kooyk et al., 2013; Sedaghat et al., 2014). For instance, oligomannose-coated liposomes were shown to be capable of delivering encapsulated protein antigens to the MHC class I and class II pathways in antigen presenting cells, thereby generating antigen-specific cytotoxic T cells and type 1 helper T cells (Kojima et al., 2013). These findings have implications for the potential use of N-glycosylated CTB.

Among different C-type lectin receptors expressed on dendritic cells, dendritic cell-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) is a major receptor that recognizes mannose-containing glycans. We tested if gCTB could bind to DC-SIGN using ELISA and flow cytometry. The results demonstrated that gCTB is capable of binding to recombinant and cell surface-expressed DC-SIGN (**Figures 2C,D**; Hamorsky et al., 2015). Notably, gCTB's binding affinity to DC-SIGN was significantly enhanced to a nanomolar level when the protein was produced in plants treated with the class I α-mannosidase inhibitor kifunensine, which restricts N-glycans to high-mannose types (**Figures 2C,D**; Hamorsky et al., 2015). Such a high affinity to DC-SIGN is considered sufficient for dendritic cell targeting, internalization and cross presentation (Srinivas et al., 2007; Singh et al., 2009). Taken together, these findings lead to another testable hypothesis that N-glycosylation may enhance the antigen-targeting capabilities of CTB fusion proteins via DC-SIGN and other C-type lectin receptors. In particular, the ability of C-type lectin receptors to cross-present antigens on the MHC class I molecule will broaden the potential utility of gCTB for vaccine development.

# CONCLUDING REMARKS

The above two proposed scenarios highlight how N-glycosylation of CTB may facilitate the protein's utility as a vaccine scaffold. Glycoengineering of N-glycans by genetic or chemical approaches may enhance such potentials, especially by focusing on high-mannose-type glycans since these glycoforms per se are generally not immunogenic in mammals. Hence, for mucosal antibody induction these glycans may effectively guide B cells to recognize critical epitopes of CTB-antigen fusion proteins. On the other hand, high-mannose glycans may facilitate the targeting of CTB-antigen fusions to C-type lectin receptors on antigen presenting cells, providing a new strategy to induce antigen-specific T cell responses. However, an important question remains to be addressed for the C-type lectin-targeting strategy; that is, whether glycosylation of CTB modifies the protein's intrinsic immunomodulatory activity. As described above, CTB was shown to exhibit anti-inflammatory and immunosuppressive activities under certain conditions. Depending on how antigen presenting cells respond upon stimulation with N-glycosylated CTB, vaccine development based on the glycosylated molecular scaffold should be aimed at either effector (e.g., for cancer and infectious diseases) or regulatory (e.g., for allergy and autoimmune disorders) T cell responses, perhaps in combination with appropriate co-stimulatory molecules such as cytokines and toll-like receptor ligands. Because C-type lectin-mediated signaling is not fully understood (Drickamer and Taylor, 2015), the above question needs to be carefully addressed for each vaccine construct. Immunization experiments using N-glycosylated CTB antigens and corresponding non-glycosylated counterparts will be particularly useful in addressing these questions. Regardless of how N-glycosylated CTB instructs the immune system to respond, the protein seems to open new avenues for subunit vaccine development; the bottom line is that N-glycosylated CTB is highly bioproducible in plants, a trait that can maximize the long-discussed advantages of plant-made vaccines.

# AUTHOR CONTRIBUTIONS

NM solely conceived and wrote the manuscript.

# ACKNOWLEDGMENTS

This work was funded by DoD/USMRAA/TATRC/W81XWH-10-2-0082-CLIN1; W81XWH-10-2-0082-CLIN2 and the Helmsley Charitable Trust.

# REFERENCES


Yuki, Y., Mejima, M., Kurokawa, S., Hiroiwa, T., Takahashi, Y., Tokuhara, D., et al. (2013). Induction of toxin-specific neutralizing immunity by molecularly uniform rice-based oral cholera toxin B subunit vaccine without plant-associated sugar modification. Plant Biotechnol. J. 11, 799–808. doi: 10.1111/pbi.12071

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author has filed a patent application concerning the concepts described in this manuscript (U.S. Patent Application serial no. 14/005,388).

Copyright © 2015 Matoba. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Depth Filters Containing Diatomite Achieve More Efficient Particle Retention than Filters Solely Containing Cellulose Fibers

#### Johannes F. Buyel<sup>1,2</sup>\*, Hannah M. Gruchow<sup>1,2</sup> and Rainer Fischer<sup>1,2</sup>

*<sup>1</sup> Integrated Production Platforms, Fraunhofer-Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany, <sup>2</sup> Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany*

The clarification of biological feed stocks during the production of biopharmaceutical proteins is challenging when large quantities of particles must be removed, e.g., when processing crude plant extracts. Single-use depth filters are often preferred for clarification because they are simple to integrate and have a good safety profile. However, the combination of filter layers must be optimized in terms of nominal retention ratings to account for the unique particle size distribution in each feed stock. We have recently shown that predictive models can facilitate filter screening and the selection of appropriate filter layers. Here we expand our previous study by testing several filters with different retention ratings. The filters typically contain diatomite to facilitate the removal of fine particles. However, diatomite can interfere with the recovery of large biopharmaceutical molecules such as virus-like particles and aggregated proteins. Therefore, we also tested filtration devices composed solely of cellulose fibers and cohesive resin. The capacities of both filter types varied from 10 to 50 L m<sup>−2</sup> when challenged with tobacco leaf extracts, but the filtrate turbidity was ∼500-fold lower (∼3.5 NTU) when diatomite filters were used. We also tested pre-coat filtration with dispersed diatomite, which achieved capacities of up to 120 L m<sup>−2</sup> with turbidities of ∼100 NTU using bulk plant extracts, and in contrast to the other depth filters did not require an upstream bag filter. Single pre-coat filtration devices can thus replace combinations of bag and depth filters to simplify the processing of plant extracts, potentially saving on time, labor and consumables. The protein concentrations of TSP, DsRed and antibody 2G12 were not affected by pre-coat filtration, indicating its general applicability during the manufacture of plant-derived biopharmaceutical proteins.

Keywords: bioprocess costs, clarification and filtration, design of experiments, model building, plant-derived biopharmaceuticals, pre-coat filtration

# INTRODUCTION

The successful launch of Elelyso in 2012 by Protalix Biotherapeutics (Carmiel, Israel) (Tekoah et al., 2015) showed that plants and plant cells are competitive expression systems for biopharmaceutical proteins (Buyel, 2015; Mor, 2015). The production of biopharmaceutical proteins in plants offers distinct benefits such as a low pathogen burden (Commandeur and Twyman, 2005) but also challenges such as low expression levels (2–50 mg kg<sup>−1</sup> biomass) (Twyman et al., 2005). Cost-effective production can be difficult in plants compared to established platforms based on mammalian cells, and this makes it harder to achieve commercialization. The costs associated with upstream production (USP) in plants are often low, especially for open-field cultivation (Stoger et al., 2002), but up to 80% of the total production costs are attributed to downstream processing (DSP) (Wilken and Nikolov, 2012; Buyel et al., 2015).

#### Edited by:

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### Reviewed by:

*Udo Conrad, IPK Gatersleben, Germany Sylvain Marcel, Caliber Biotherapeutics, USA*

\*Correspondence: *Johannes F. Buyel johannes.buyel@rwth-aachen.de*

#### Specialty section:

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

Received: *07 October 2015* Accepted: *30 November 2015* Published: *21 December 2015*

#### Citation:

*Buyel JF, Gruchow HM and Fischer R (2015) Depth Filters Containing Diatomite Achieve More Efficient Particle Retention than Filters Solely Containing Cellulose Fibers. Front. Plant Sci. 6:1134. doi: 10.3389/fpls.2015.01134*

Traditionally, DSP is divided into primary recovery and purification (Menkhaus et al., 2004). As with all other platforms, the purification of plant-derived proteins is based on the intrinsic properties of the target and can be achieved with standard operations such as chromatography (Buyel and Fischer, 2014a). In contrast, primary recovery requires specific clarification steps such as depth filtration or flocculation to address issues that are specific to plants, such as the high particle burden in the feed stream (Buyel and Fischer, 2014b). The particle burden can be reduced if the product is secreted or specialized extraction methods are used, e.g., guttation (Komarnytsky et al., 2000), rhizosecretion (Drake et al., 2009) and infiltration-centrifugation (Turpen, 1999), but these techniques are limited to secreted proteins. Typically, the product accumulates in the plant tissue and must be released from intracellular compartments by homogenization (Hassan et al., 2014; Buyel and Fischer, 2014c,d). The use of blade-based homogenizers releases large amounts of dispersed particles, producing extracts with turbidities exceeding 5000 nephelometric turbidity units (NTU) (Buyel and Fischer, 2014e).

Centrifuges can be used for clarification but single-use filters are preferred because these are less expensive, more scalable and do not require cleaning validation (Roush and Lu, 2008; Pegel et al., 2011; O'Brien et al., 2012). Filters can also remove host cell proteins (HCPs) and pigments (Yigzaw et al., 2006; Naik et al., 2012). Single-use depth filters have been identified as the major consumables cost-driver during DSP (Buyel et al., 2014) so additives such as flocculants and filter aids have been tested to improve filter capacity and thus reduce costs (Buyel and Fischer, 2014f). Although these additives are effective, they can also encourage the precipitation of the target protein (Holler et al., 2007), they may be incompatible with subsequent downstream operations (Buyel and Fischer, 2014b), and they may increase safety risks (Buyel and Fischer, 2014e). Therefore, single-use filters that increase filter capacity but use only harmless and easily removed additives are preferred, and once identified they can reduce DSP costs and thus improve the economic competitiveness of molecular pharming.

Here we compare the performance of 24 different depth filters containing diatomaceous earth (DE) to a reference filter train used during the manufacture of a plant-derived monoclonal antibody in an 800-L scale process (**Figure 1**) compliant with good manufacturing practice (GMP). We also tested seven filter combinations lacking DE, which can be beneficial if target proteins bind to this charged filter aid. We investigated the scalability of small filtration devices, and finally used a design of experiments (DoE) approach to characterize a DE pre-coat filtration technology that can potentially simplify tandem filtration systems consisting of a bag filter and a depth filter in series, reducing this to a single clarification unit (**Figure 1**). The performance of all filters was tested using plant extracts containing two model target proteins: the fluorescent protein DsRed and monoclonal antibody 2G12.

# MATERIALS AND METHODS

#### Biological Materials

Seeds of transgenic tobacco (Nicotiana tabacum) line pGFD (Buyel and Fischer, 2014c) were germinated in soil and cultivated in a greenhouse at 25/22 °C day/night temperature with 70% relative humidity. The plants were irrigated with 0.1% (w/v) Ferty 2 Mega (Kammlott GmbH, Germany) for 15 min h<sup>−1</sup> during a 16-h photoperiod (180 µmol s<sup>−1</sup> m<sup>−2</sup>; λ = 400–700 nm) and were grown for 50–53 days prior to harvest.

#### Extraction and Filtration

Three volumes (1500 mL) of extraction buffer (50 mM sodium phosphate, 500 mM sodium chloride, 10 mM sodium disulfite, pH 8.0) were added to 500 g of plant material and homogenized in a PT 6100 (Kinematica, Switzerland) customized with a blade tool and a 3-L vessel. Prior to depth filtration, the extract was pre-clarified using a BP-410 bag filter (Fuhr, Klein-Winternheim, Germany) with a nominal retention rating of 1 µm. The capacity and efficiency of different depth filters (**Table 1**) with an area of 22–23 cm<sup>2</sup> were tested at a constant volumetric loading flow rate of 12 mL min<sup>−1</sup> using a PDF4 filter (Pall, Dreieich, Germany) as a reference (Buyel and Fischer, 2014c). Turbidities were determined at each process step using a 2100P turbidimeter (Hach, Loveland, CO, USA) with samples diluted 1:40 (homogenate), diluted 1:10 (bag filtrate, depth filters after particle breakthrough) or undiluted (regular depth filtrates, pre-coat filtrates). Conductivity and pH were also monitored at each process step. We also tested filters PDH4 and PDF4 in a Supracap format with authentic filter layer geometry under the same conditions. Depth filters were also compared with single-use Sartolab DY (DY) pre-coat filters with a filter area of 22 cm<sup>2</sup> (Sartorius, Göttingen, Germany) which were operated at 1500 Pa vacuum using a N816.3KN.18 membrane pump (KNF, Freiburg, Germany) and 150 mL of extract or bag filtrate, with or without 2.0 g L<sup>−1</sup> of the flocculant Polymin P (BASF, Ludwigshafen, Germany). In these initial tests 140 g L<sup>−1</sup> DE Celpure C300 (Sartorius) was added to the feed before filtration. The results were confirmed using an I-optimal DoE consisting of 14 runs using a scalable custom filter housing (Sartoclear Dynamics (SD), Sartorius) equipped with a Purex filter layer (Sartorius, nominal retention rating of 7–12 µm, 12.5 cm<sup>2</sup> filter area) and operated with a feed flow rate of 6.3 mL min<sup>−1</sup>. In the DoE setup, DE concentrations of 25–60 g L<sup>−1</sup>, pre-filtration incubation times of 10–90 min and either one or two additions of DE were investigated using bulk plant extract at all times. Particle size distributions were determined with a Zetasizer NanoZS (Malvern, Malvern, UK) using undiluted samples.

**Abbreviations:** DE, diatomaceous earth; DoE, design of experiments; DSP, downstream processing; DY, Sartolab DY; HCP, host cell protein; NTU, nephelometric turbidity units; RN, retention number; SD, Sartoclear dynamics; SPR, surface plasmon resonance; TSP, total soluble protein; USP, upstream production.

#### Protein Quantitation

Samples from extracts and filtrates were centrifuged twice (16,000 × g, 20 min, 4 °C) and the quantity of total soluble protein (TSP) in the supernatants was determined using the Bradford method (Simonian and Smith, 2006) adapted to a microtiter plate format (Buyel and Fischer, 2014f) with a triplicate standard curve of eight dilutions of bovine serum albumin in the range 0–2000 µg mL<sup>−1</sup>. The absorbance at 595 nm was measured for technical triplicates of each sample using a Synergy HT plate reader (BioTek Instruments, Vermont, USA). The same reader fitted with a 530/25 nm (excitation) and 590/35 nm (emission) filter set was used to quantify the concentrations of DsRed in the supernatants. A standard curve was generated with six dilutions of purified DsRed in the range of 0–225 µg mL<sup>−1</sup>. Concentrations of antibody 2G12 were determined by surface plasmon resonance (SPR) spectroscopy using an SPR2 instrument (Sierra Sensors, Hamburg, Germany). For each sample, the concentration of 2G12 was measured by binding to protein A, which was immobilized on the surface of a high capacity amine chip (Sierra Sensors) by EDC/NHS coupling, and comparison to a 1.0 µg mL<sup>−1</sup> reference solution of 2G12 used for one-point calibration (Buyel and Fischer, 2014c). HBS-EP+ (10 mM HEPES, pH 7.4, 150 mM NaCl, 3 mM EDTA, 0.05% v/v Tween-20) was used as the running buffer.

# RESULTS

#### Clarification of Plant Extracts Using Conventional Depth Filters

We tested 24 conventional depth filter setups containing DE, 19 of which consisted of a single filter (two filter layers) and five of which were tandem filters (four filter layers). The test was conducted in five runs and capacities were normalized to a PDF4 reference filter (**Figure 1B**) which achieved a capacity of 35 ± 6 L m<sup>−2</sup> (n = 10; two per run; **Figure 2A**). Filters F6 and F15 outperformed PDF4 in terms of capacity by 25 and 13%, respectively. However, the turbidities of the F6 and F15 filtrates were 14 and 23 NTU, respectively, compared to 4 ± 2 NTU (n = 10) for PDF4. None of the tandem filters we tested achieved capacities greater than that of PDF4.
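Filter capacities in this study are reported as filtrate volume per unit filter area. The following is a minimal sketch of that unit conversion for the small-scale devices described in the Materials and Methods (22 cm<sup>2</sup> filter area, 12 mL min<sup>−1</sup> loading flow rate). The function names and the 77 mL example volume are illustrative assumptions, not values or code from the original workflow.

```python
def capacity_l_per_m2(volume_ml: float, area_cm2: float) -> float:
    """Filtrate volume per unit filter area, in L m^-2."""
    volume_l = volume_ml / 1000.0      # mL -> L
    area_m2 = area_cm2 / 10_000.0      # cm^2 -> m^2
    return volume_l / area_m2


def run_time_min(volume_ml: float, flow_ml_per_min: float) -> float:
    """Time needed to load a given volume at constant volumetric flow."""
    return volume_ml / flow_ml_per_min


# Hypothetical run: 77 mL of filtrate collected on a 22 cm^2 filter
# before blocking, loaded at 12 mL min^-1.
print(f"{capacity_l_per_m2(77, 22):.1f} L m^-2")   # 35.0 L m^-2
print(f"{run_time_min(77, 12):.1f} min")           # 6.4 min
```

Under these assumptions, a capacity in the range of the PDF4 reference corresponds to only a few minutes of loading on a small-scale device, which illustrates why many filters can be screened from a single batch of extract.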

*Filters without diatomaceous earth are shown in italics, and filters with scalable geometry are shown in bold. The filter used in our standard process is bold underlined. Filters achieving a similar capacity are underlined. The filter code is used in all subsequent figures for brevity.*

A dimensionless retention number (RN) has previously been used to pre-select filter layer combinations that are likely to achieve high capacities (Equation 1) (Buyel and Fischer, 2014c).

$$RN = \frac{\frac{r_1}{r_2} + \frac{r_2}{r_3} + \dots + \frac{r_i}{r_j} + \dots + \frac{r_{n-1}}{r_n}}{(n-1) \times n} \tag{1}$$

where n is the total number of layers, r<sub>i</sub> is the nominal retention rating of the more porous layer in each pair of consecutive layers, and r<sub>j</sub> is the nominal retention rating of the finer layer in the pair. We calculated the RN for the 24 filters tested here and found that the average RN of the five new filters with the highest normalized capacity, i.e., ≥1.0, was 5.9 ± 2.7 (n = 5), whereas PDF4 has an RN of 5.0 (**Figure 2B**). We also fitted the normalized filter capacity over the RN data using non-linear peak functions in Origin v9.1 (OriginLab, Northampton, MA). Maximum filter capacities were predicted for RN values of 5.40 (Gaussian), 5.49 (Lorentz), and 4.71 (Giddings). Adjusted R<sup>2</sup> values of 0.47–0.49 indicated that all fits were in fair agreement with the data. A cubic fit to a previously published data set (adjusted R<sup>2</sup> = 0.76) predicted an optimal RN of 3.37.
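Equation 1 can be sketched in a few lines of Python. The function name is an illustrative choice; the example uses the 2.25 + 0.65 µm layer combination that the Discussion gives for filter F6, for which an RN of 1.73 is reported.

```python
def retention_number(ratings_um):
    """Retention number (Equation 1) for filter layers ordered coarse to fine.

    ratings_um: nominal retention ratings of consecutive layers, in µm.
    """
    n = len(ratings_um)
    if n < 2:
        raise ValueError("RN needs at least two filter layers")
    # Sum of coarse/fine ratios over each pair of consecutive layers
    ratio_sum = sum(coarse / fine
                    for coarse, fine in zip(ratings_um, ratings_um[1:]))
    return ratio_sum / ((n - 1) * n)


# Filter F6: 2.25 µm first layer, 0.65 µm second layer
print(round(retention_number([2.25, 0.65]), 2))  # 1.73
```

For a two-layer filter the denominator is 2, so RN is simply half the ratio of the two retention ratings.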

For filters with a normalized capacity ≥0.5, the concentrations of TSP (**Figure 2C**), DsRed and 2G12 (**Figure 2D**) fell within one standard deviation around the average observed for the PDF4 reference with F18 as the only exception, which contained less protein. Lower protein concentrations were observed in filtrates if the normalized capacity was below 0.5.

#### Performance of Small-Scale Devices with Authentic Filter Geometry

Small-scale filtration devices can have a layer geometry that differs from that used in process-scale equipment (**Figure 3A**). For depth filters PDF4 and PDH4, we compared the effect of such different geometries on the filter capacity and protein binding efficiency using regular small-scale equipment (direct flow, regular) and devices mimicking the large-scale layer assembly (indirect flow, Supracap). The Supracap geometry increased the capacity of filter PDF4 significantly by 26% (two-sided t-test with 5% alpha level; **Figure 3B**), whereas an 18% increase was observed for filter PDH4, but this was not significant according to a two-sided t-test with a 5% significance level. There was no significant difference in TSP, DsRed or 2G12 concentrations among any of the four types of filter and geometry combinations (**Figure 3B**).

#### Testing Depth Filters without Diatomaceous Earth

The DE-free filter P3 had the smallest retention rating of all the filters we tested here (**Table 1**) and also showed the lowest capacity of only 8 ± 1 L m<sup>−2</sup> (n = 3). The other DE-free filters did not show any relevant increase in back pressure over the first 35 L m<sup>−2</sup>, i.e., the pressure level was ∼0.02 MPa (0.2 bar; data not shown). Among these filters, M3 and M8 most effectively retained dispersed particles but filtrations were stopped after 35 L m<sup>−2</sup> because in all cases turbidity was reduced only by a factor of 2–5 compared to the feed, a bag-filtered plant extract (**Figure 4A**). In contrast, P3 reduced the turbidity to 2 ± 1 NTU (n = 3), which corresponds to a 1000-fold reduction and is similar to the turbidity achieved with the reference filter PDF4.

The particle size distribution revealed that the P3 and PDF4 filtrates almost exclusively contained particles in the 10–20 nm range, similar to that observed for a 0.2 µm filtrate (**Figure 4B**). In contrast, the majority of particles in the M3 filtrate had a size of 1000–2000 nm, similar to the dominant particle populations in the bulk plant homogenate and bag filtrate.

We did not observe significant differences (two-sided t-test with 5% alpha level) in the TSP and DsRed concentrations of the different depth filtrates regardless of the presence or absence of DE (**Figure 4C**). The concentrations of antibody 2G12 were on average 38 ± 11% (n = 18) higher in filtrates from the DE-free filters compared to the DE-containing PDF4 reference, but this difference was only significant when comparing M4 and PDF4 (two-sided t-test with 5% alpha level). Furthermore, when we expressed the 2G12 concentration in the depth filtrate as a percentage recovery of the bag filtrate (the feed stream to the depth filter), we found that there was no difference between PDF4 and the DE-free filters given the recovery values of 85 ± 10% (n = 10) and 83 ± 9% (n = 18), respectively.

#### Clarification of Crude Plant Extracts by Pre-Coat Filtration

Pre-coat filtration has the potential to replace bag and depth filter trains with a single unit operation (**Figure 1C**). Experiments were conducted with bottle-top devices for initial screening (DY) and then small-scale units with a head space geometry and filter pore size that matched those used in large-scale operations (SD). When bulk plant homogenate was loaded onto the DY filters (H-DY) a capacity of ∼50 L m<sup>−2</sup> was achieved, which matched the capacity of the PDF4 reference filter (B-PDF4) combined with the upstream bag filter. When DY filters were fed with plant extract that had already passed through the bag filter, the capacity dropped to ∼10 L m<sup>−2</sup>. This feed-dependent difference in capacity was reversed if Polymin P was added to the homogenate, i.e., DY filters challenged with the flocculated homogenate (HF) exhibited a lower capacity than the same filters challenged with flocculated bag filtrate (BF) or a PDH4 depth filter control (**Figure 5A**). Using bag filtrate instead of homogenate, or including a flocculant, reduced the filtrate turbidity after DY, but none of these setups achieved the reduction in turbidity possible with conventional depth filters (4 NTU).

The use of pre-coat filtration directly after homogenization is most advantageous from a process design point of view as discussed below (**Figure 1C**). We therefore used a DoE approach to investigate the performance of pre-coat filters at this process step in more detail, using filter housings with authentic geometry and under varying process conditions. Increasing the concentration of DE in the feed stream increased the thickness of the resulting filter cake (**Figure 5B**) but did not accelerate filter blocking at any of the DE concentrations we tested. DE concentrations of 50–60 g L<sup>−1</sup> resulted in filter capacities of ∼125 L m<sup>−2</sup>, which was 2.5-fold higher than that of the PDF4 reference (**Figures 5A,C**). Reducing the DE concentration resulted in lower filter capacities. However, it was possible to compensate for the lower DE concentration by increasing the pre-filtration incubation times for DE and the homogenate to more than 60 min. Adding DE in steps to achieve the final DE concentration in the homogenate did not affect the filter capacity (data not shown).

The turbidity of the SD filtrates was reduced by 80% compared to the homogenate by adding 25–40 g L<sup>−1</sup> DE. In contrast, we observed turbidities of 114 ± 6 NTU (n = 3) if we added 60 g L<sup>−1</sup> DE to the homogenate, corresponding to a 50-fold reduction compared to the homogenate alone (**Figure 5D**). The turbidity declined during filtration from initial values of ∼500 to 30 NTU at the end of the process.
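Because the two turbidity results above are stated on different scales (a percentage and a fold reduction), a short conversion makes them directly comparable. This is a minimal sketch with illustrative function names; the homogenate turbidity printed in the last line is only back-calculated from the reported 114 NTU and 50-fold values and is not a measured figure.

```python
def percent_to_fold(percent_reduction: float) -> float:
    """Convert a percentage reduction into a fold reduction."""
    return 1.0 / (1.0 - percent_reduction / 100.0)


def fold_to_percent(fold_reduction: float) -> float:
    """Convert a fold reduction into a percentage reduction."""
    return (1.0 - 1.0 / fold_reduction) * 100.0


print(f"{percent_to_fold(80):.1f}-fold")   # 5.0-fold  (25-40 g/L DE)
print(f"{fold_to_percent(50):.1f} %")      # 98.0 %    (60 g/L DE)

# Homogenate turbidity implied by 114 NTU after a 50-fold reduction
print(114 * 50, "NTU")                     # 5700 NTU
```

On a common scale, the lower DE concentrations therefore correspond to a roughly fivefold reduction, compared to the 50-fold reduction at 60 g L<sup>−1</sup>, and the implied homogenate turbidity is consistent with the >5000 NTU quoted in the Introduction for blade-homogenized extracts.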

The use of DE during pre-coat filtration did not affect protein concentrations and recoveries of 103 ± 17, 112 ± 26, and 97 ± 15% (n = 14 in all cases) were observed for TSP, DsRed, and antibody 2G12, respectively.

**Figure 4 (continued).** … (homogenate, bag filtrate) and subsequent (0.2 µm filtration) process steps. All measurements were taken with a Zetasizer NanoZS. (C) TSP, DsRed and 2G12 concentrations in different depth filtrates as determined by the Bradford method, fluorescence spectroscopy and SPR spectroscopy, respectively. Error bars indicate standard deviations (*n* ≥ 3). (D) Particle retention by depth filters selected based on actual particle size distributions (F6, left), prediction by *RN* (PDF4, middle) and empirical data (F14, right). Particles retained by the first and second filter layers are colored dark green and green, respectively. Particles in the filtrate are colored blue and those not retained on the first filter layer due to suboptimal selection of the retention rate are colored orange.

# DISCUSSION

#### Particle Size Distribution is More Accurate than RN for the Prediction of Filter Capacity

The capacity of 36 ± 6 L m<sup>−2</sup> we observed for the PDF4 reference filter was in good agreement with the 37 ± 3 L m<sup>−2</sup> reported in a previous publication that established this filter as a standard for plant extract clarification (**Figure 1B**; Buyel and Fischer, 2014c), confirming the comparability of the results presented here with preceding studies. Five of the depth filters we tested here showed similar or higher filter capacities compared to the PDF4 reference. The average RN of these filters was 5.9 ± 2.7 (n = 5), apparently confirming that selecting PDF4 with an RN of 5.0 as the standard depth filter was a reasonable choice to achieve high filter capacity given the data available from a series of filtration experiments. However, the highest capacity was achieved with a filter that had an RN of only 1.73 (F6), showing that predictions based on RN calculations can be inaccurate. This reflects the fact that predictions based on RN are only descriptive in nature, even though they are based on empirical results. In contrast, particle size analysis can reveal the actual distribution of dispersed species in a feed solution and allow the selection of filter layers based on mechanistic considerations. For example, solutions with a bimodal particle distribution can be clarified using two filter layers that have nominal retention ratings corresponding to the lower end of each mode. The use of filter layers with the coarsest applicable retention rating in each mode avoids premature pore blocking. We have observed such bimodal particle distributions in bag-filtered plant extracts with peaks at 0.95 and 7.50 µm. **Figure 4D** illustrates how filter F6 improved particle retention compared to the standard PDF4 filter and a suboptimal alternative, like F14, based on such a mechanistic description. This also explains why each of the best-performing depth filters had in common a second layer with a retention rating of ∼0.6 µm, which effectively removed particles representing the peak at 0.95 µm. Particle size distributions can therefore be used to facilitate filter layer selection in a rapid and cost-effective manner, but RN can function as a substitute if particle size distribution data are unavailable.

Depth filter capacities of up to 1000 L m<sup>−2</sup> have been reported recently (Buyel et al., 2014) but the performance in those experiments depended on two additives, a polyethylenimine-based flocculant and a cellulose-based filter aid. Although these improve filter efficiency, the flocculant can be incompatible with subsequent DSP steps (Buyel and Fischer, 2014b) and the cellulose-based filter aid can generate unacceptable amounts of dust (U.S. Department of Health and Human Services, 1995). It is therefore better to optimize depth filter capacities without additives if possible, simply by selecting filter layers with nominal retention ratings matching the particle size distribution in the feed stream. This also helps to reduce production costs and improve compatibility with subsequent DSP steps. There was no significant difference among the five filters with the highest capacities in terms of TSP, DsRed or 2G12 concentrations, confirming that filters from different vendors perform equally well in this respect and process optimization can focus on filter capacities. The reduced protein concentrations observed in the filtrate from other filters with low capacities can be attributed to a dilution effect resulting from residual rinse buffer that is retained in the filter layers after mandatory initial flushing. Based on our experience, this holdup buffer volume is 8–10 L m<sup>−2</sup> for PDF4 and can thus reduce protein concentrations by 50% if the filter capacity is only ∼10 L m<sup>−2</sup>, e.g., for filter F1. These data agree with the 53% reduction in TSP we observed in the filtrate produced by filter F1 compared to PDF4.
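The dilution argument above can be made explicit with a simple mixing calculation. This is a minimal sketch that assumes the holdup buffer mixes completely with the filtrate and that the stated capacity refers to extract volume only; both are simplifying assumptions, and the function names are illustrative.

```python
def remaining_fraction(capacity_l_m2: float, holdup_l_m2: float) -> float:
    """Fraction of the feed protein concentration left in the pooled filtrate.

    Assumes complete mixing of the retained rinse buffer (holdup) with the
    extract volume that passes the filter before it blocks.
    """
    return capacity_l_m2 / (capacity_l_m2 + holdup_l_m2)


def concentration_loss_percent(capacity_l_m2: float, holdup_l_m2: float) -> float:
    """Percentage drop in protein concentration caused by holdup dilution."""
    return (1.0 - remaining_fraction(capacity_l_m2, holdup_l_m2)) * 100.0


# Low-capacity filter (~10 L m^-2) with the 8-10 L m^-2 holdup range
for holdup in (8, 10):
    print(holdup, f"{concentration_loss_percent(10, holdup):.0f} %")
# 8 44 %
# 10 50 %

# Reference-like capacity (35 L m^-2) with 10 L m^-2 holdup
print(f"{concentration_loss_percent(35, 10):.0f} %")  # 22 %
```

Under these assumptions, the same holdup volume matters far less at higher capacities, which is in line with the observation that only low-capacity filters showed markedly reduced protein concentrations.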

#### Filter Capsules with Authentic Geometry Yield Scalable Filter Capacity Values

Filter capsules with authentic geometry channeled the feed stream onto the filter layers indirectly (**Figure 3A**), which probably delayed pore blocking and explained the ∼20% capacity increase compared to the direct stream in conventional small-scale devices. We have used filter PDF4 in a GMP-compliant 800-L scale production process for monoclonal antibody 2G12 and found that the normalized filter capacity was 1.25 ± 0.17 (n = 3) compared to regular small-scale filters. This is in good agreement with the value of 1.26 ± 0.06 (n = 3) we calculated for the small-scale devices with authentic geometry (Supracap) reported above. Therefore, small-scale filtration devices with authentic geometry can achieve scalable depth filter capacities when filters are challenged with plant extracts during early process development. This will facilitate the estimation of production costs based on small-scale data and thus allow the economic evaluation of different process alternatives during the early development phase, ultimately improving the competitiveness of plant-based protein expression systems.

#### Depth Filters Lacking Diatomaceous Earth Do Not Reduce the Turbidity of Plant Extracts

Among the DE-free filters we tested, only P3 reduced turbidity by the same amount as the PDF4 reference but this device also offered the lowest filter capacity. The turbidity in the other filtrates was >1000 NTU and thus not compatible with subsequent DSP steps including 0.2-µm filtration and chromatography. One major issue was that green pigments and particles of ∼1 µm size passed through all of the DE-free filters except P3. The second layer of these filters had a nominal retention rating of 0.85 µm, which is 30–40% wider than the coarsest second layers of DE-containing filters, which yielded turbidities of 16 ± 7 NTU (n = 5; 0.65 µm second layer) or 4 ± 2 NTU (n = 10; 0.60 µm second layer). The average fourfold difference in the filtrate turbidity of DE-containing filters with second layers of 0.65 and 0.60 µm indicates that this size range marks a limit above which large numbers of particles begin to pass through the filter layers. These data support the particle size distributions we observed for the bag-filtered plant extract loaded onto DE-free and DE-containing filters, which had a first peak of dispersed particles with an average size of 0.95 µm (**Figures 4B,D**). It may still be possible to achieve low filtrate turbidity in combination with high filter capacity if new combinations of DE-free filter layers are used. For example, a combination of CE40 or CE30 as a first layer with Bio20 as a second layer would yield DE-free filters with nominal retention ratings of 2.0 + 0.7 µm and 3.75 + 0.7 µm, respectively, which is close to the 2.25 + 0.65 µm combination of the best performing DE-containing filter, F6. However, the CE and Bio filter series are produced by different manufacturers and it is thus unlikely that a filter containing both types of layers will become commercially available for pharmaceutical-grade applications in the near future. We therefore did not test this combination.

There was no difference in the TSP and DsRed concentrations of filtrates that passed either DE-free or DE-containing filters. There was also no difference in the recovery of antibody 2G12 if concentrations were normalized to the corresponding plant batch, indicating that DE-free filters offer no improvement in yield for most tobacco HCPs and the two target proteins we tested. However, other target proteins may unexpectedly bind to DE-containing filters (our unpublished data) and the development of an effective clarification strategy using DE-free filters can thus be a worthwhile investment for future processes.

#### Pre-Coat Filtration with Diatomaceous Earth Simplifies the Clarification Process

In the absence of flocculants, DY pre-coat filters achieved higher capacities when fed with bulk plant homogenate instead of bag-filtered extract, even though the latter has a turbidity of 3000-6000 NTU (Buyel and Fischer, 2014e), which is 30-80% less than that of the bulk homogenate (Buyel and Fischer, 2014f). This observation appears counterintuitive, but is probably explained by the presence of cellulose fibers and coarse cell debris in the bulk homogenate with a size range of 1000-4000µm (Buyel and Fischer, 2014b), which help to form a filter cake and thus increase the filter capacity (Buyel et al., 2014).

The presence of flocculants increased the DY filter capacity when bulk homogenate was used as feed, but the effect was more impressive when bag filtrate was used instead. The 10 µm aggregates that are typically found in flocculated bag filtrate (Buyel and Fischer, 2014b) may help to form a more effective filter cake when combined with the Celpure C300 diatomite compared to the 1-8µm particles in the untreated bag filtrate or the 1000-4000µm particles found in the (flocculated) homogenate.

These results indicated that pre-coat filtration is most effective when applied directly after homogenization in the absence of flocculants because under these conditions the filter capacity was comparable to a reference depth filter but required only a single filtration step instead of bag and depth filters in series (**Figure 1**). This can simplify the clarification procedure and thus improve process control. It can also reduce the time and labor required for process setup and the cost of consumables. Furthermore, if a transient expression system is used, a single-use pre-coat filtration step would eliminate the need for reusable bag filter housing, eliminating the need for cleaning validation and the risk of product carryover (**Figure 1B**).

To confirm these anticipated advantages, we used SD filters that have the same geometry as large-scale production modules. When bulk plant homogenate was used as the feed, SD filtration increased the filter capacity by 2.5-fold compared to DY filters because the SD setup allowed a defined volumetric feed flow rate resulting in a gradual and thus more effective buildup of filter cake. In contrast, DY filtration requires the single-step application of DE-homogenate slurry to the filter and the flow rate is defined by the combined effect of the applied vacuum and the degree of filter blocking. Therefore, DY filters are useful for initial screening purposes, i.e., to investigate the general compatibility of pre-coat filtration with a given clarification requirement, but SD filters can be used to optimize filtration conditions and yield potentially scalable capacities due to their authentic geometry, e.g., they feature the same base filter porosity and head space as production modules.

Even so, the turbidity of the SD filtrates was higher than that observed for DY filters. This may reflect the lower base filter porosity of DY (0.2µm) compared to SD (7-12µm). Based on these retention ratings and the particle size distribution of the bulk homogenate (**Figures 4B,D**) a high turbidity in SD filtrates is expected. However, the turbidity of SD filtrate was still 10-fold lower than that of DE-free depth filters, which had retention ratings of 0.85µm. This shows that the filter cake formed during SD filtration can effectively retain particles even in the 0.5-1.0 µm range. We found that turbidity declined during the course of SD filtration, suggesting that particle retention improved as the thickness of the filter cake increased, as previously reported (Cain, 1984; Smith, 1998). We have found that turbidities of up to 50 NTU are compatible with 0.2µm filtration, achieving capacities exceeding 350 L m<sup>−2</sup> (our unpublished data), but a turbidity of ∼100 NTU as observed for SD filtration may interfere with subsequent DSP steps. One potential way to reduce the turbidity of the SD filtrate is to cycle the feed until a sufficient thickness of cake has accumulated. An initial cake build-up phase during clarification has been included in other processes for this purpose (Cain, 1984; Smith, 1998). The protein concentrations of TSP, DsRed and antibody 2G12 were not affected by pre-coat filtration, indicating its general applicability during the manufacture of plant-derived biopharmaceutical proteins.
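To illustrate what a capacity of 350 L m<sup>−2</sup> means in practice, the sketch below estimates the 0.2µm filter area needed for a given filtrate volume. This is a simple sizing calculation under our own assumptions: the function name is hypothetical, the 800 L volume is used only as an example batch size, and because the reported capacity is a lower bound ("exceeding 350 L m<sup>−2</sup>") the result should be read as an upper estimate of the required area.

```python
def filter_area_m2(filtrate_volume_l: float, capacity_l_per_m2: float) -> float:
    """Estimate the filter area (m^2) needed to process a filtrate volume (L)
    at a given filter capacity (L per m^2)."""
    if filtrate_volume_l < 0:
        raise ValueError("filtrate volume must not be negative")
    if capacity_l_per_m2 <= 0:
        raise ValueError("filter capacity must be positive")
    return filtrate_volume_l / capacity_l_per_m2


# Capacity quoted in the text for filtrates of up to 50 NTU (lower bound).
CAPACITY_L_PER_M2 = 350.0

# Example volumes in litres; these are assumptions chosen for illustration.
for volume_l in (100.0, 800.0):
    area = filter_area_m2(volume_l, CAPACITY_L_PER_M2)
    print(f"{volume_l:.0f} L filtrate -> at most about {area:.2f} m^2 of 0.2 um filter")
```

The estimate does not apply to filtrates with turbidities near 100 NTU, for which the text reports no capacity value.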

#### CONCLUSIONS

In this study we show that analyzing the particle size distribution of plant extracts allowed the rational selection of depth filter layers achieving higher capacities than layers selected based on the previously suggested descriptive RN model. We also demonstrate for the first time that small-scale filters with authentic geometry provide reliable capacity data for filtration scale-up of plant-based processes using an 800-L GMP-compliant production scale process as a reference. Furthermore, we highlight that new DE-free depth filters hold the potential to improve product recoveries during the clarification of plant extract due to reduced protein binding and provide recommendations for the selection of the corresponding filter layers. Finally, we underline that implementing a new pre-coat filtration strategy in the clarification procedure simplifies the process stream, reduces the number of unit operations and increases the compatibility of the primary processing with single-use technologies. These findings will help to increase the economic competitiveness of plant-based processes compared to traditional fermentation and cell culture approaches.

#### FUNDING

This work was funded in part by the European Research Council Advanced Grant "Future-Pharma," proposal number 269110, the Fraunhofer-Zukunftsstiftung (Fraunhofer Future Foundation) and the Fraunhofer-Gesellschaft Internal Programs under Grant No. Attract 125-600164.

#### ACKNOWLEDGMENTS

The authors are grateful to Ibrahim Al Amedi and Dr. Thomas Rademacher for cultivating the tobacco plants used in this study. We wish to thank Dr. Richard M. Twyman for editorial assistance.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Buyel, Gruchow and Fischer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Production of H5N1 Influenza Virus Matrix Protein 2 Ectodomain Protein Bodies in Tobacco Plants and in Insect Cells as a Candidate Universal Influenza Vaccine

*Sandiswa Mbewana<sup>1</sup>, Elizabeth Mortimer<sup>1</sup>, Francisco F. P. G. Pêra<sup>1</sup>, Inga Isabel Hitzeroth<sup>1</sup>\* and Edward P. Rybicki<sup>1,2</sup>*

*<sup>1</sup>Biopharming Research Unit, Department of Molecular and Cell Biology, University of Cape Town, Rondebosch, South Africa, <sup>2</sup>Institute of Infectious Disease and Molecular Medicine, Faculty of Health Science, University of Cape Town, Cape Town, South Africa*

#### *Edited by:*

*Joachim Hermann Schiemann, Julius Kühn-Institut, Germany*

#### *Reviewed by:*

*Inge Broer, University of Rostock, Germany Basavaprabhu L. Patil, ICAR-National Research Centre on Plant Biotechnology, India*

*\*Correspondence:*

*Inga Isabel Hitzeroth inga.hitzeroth@uct.ac.za*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Bioengineering and Biotechnology*

*Received: 28 September 2015 Accepted: 23 November 2015 Published: 08 December 2015*

#### *Citation:*

*Mbewana S, Mortimer E, Pêra FFPG, Hitzeroth II and Rybicki EP (2015) Production of H5N1 Influenza Virus Matrix Protein 2 Ectodomain Protein Bodies in Tobacco Plants and in Insect Cells as a Candidate Universal Influenza Vaccine. Front. Bioeng. Biotechnol. 3:197. doi: 10.3389/fbioe.2015.00197*

The spread of influenza A viruses is partially controlled and prevented by vaccination. The matrix protein 2 ectodomain (M2e) is the most conserved sequence in influenza A viruses, and is therefore a good potential target for a vaccine to protect against multiple virus subtypes. We explored the feasibility of an M2e-based universal influenza A vaccine candidate based on the highly pathogenic avian influenza A virus, H5N1. A synthetic M2e gene was human- and plant-codon optimized and fused in-frame with a sequence encoding the N-terminal proline-rich domain (Zera®) of the γ-zein protein of maize. Zera®M2e was expressed transiently in *Nicotiana benthamiana* and *Sf*21 baculovirus/insect cell expression systems, and Zera®M2e protein bodies (PBs) were successfully produced in both expression systems. The plant-produced Zera®M2e PBs were purified and injected into Balb/c mice. Western blot analysis using insect cell-produced Zera®M2e PBs and multiple tandem M2e sequences (5xM2e) fused with the avian influenza H5N1 transmembrane and cytosolic tail (5xM2e\_tHA) confirmed the presence of M2e-specific antibodies in immunized mice sera. The immunogenicity of the Zera®M2e indicates that our plant-produced protein has potential as an inexpensive universal influenza A vaccine.

Keywords: influenza A virus, M2e, plant expression, insect cell expression, vaccine

# INTRODUCTION

Influenza A viruses can be highly contagious, causing acute viral respiratory diseases seasonally in the human population (Cox and Subbarao, 1999; Thompson et al., 2003). Currently, vaccination with selected inactivated influenza virus strains is the most effective way of reducing the morbidity and mortality caused by these viruses (Cox and Subbarao, 1999). Influenza A viruses are divided into subtypes by their two surface glycoproteins, the hemagglutinin (HA) and neuraminidase (NA). HA and NA are the primary targets for vaccine development as they elicit neutralizing immune responses (Johansson et al., 1989; Fiore et al., 2009). Unfortunately, due to the high mutation rate of these glycoproteins, influenza vaccines need to be manufactured seasonally in order to be effective against the current circulating strains (Webster et al., 1992). It would therefore be ideal to develop a universal vaccine that is cross-protective against multiple influenza A virus strains as well as against subtypes (Price et al., 2010; Andersson et al., 2012).

There is a type III integral membrane protein (M2) present on the surface of the influenza A virus particle (Lamb et al., 1985). It functions as a pH-activated ion channel (viroporin) and is required for viral infection (Black et al., 1993). It also prevents low pH-induced structural changes in HA during maturation (Sugrue and Hay, 1991), and thus plays a role in viral assembly (Chen et al., 2008; Rossman et al., 2010). M2 (97 amino acids) consists of an N-terminal ectodomain (M2e) (23 amino acids), a lipid bilayer spanning single transmembrane domain (19 amino acids) and a C-terminal cytosolic tail (54 amino acids), and polymerizes into homotetramers in the virion envelope (Pinto and Lamb, 2006). The M2e sequence is highly conserved in all influenza A viruses (Black et al., 1993; Betakova, 2007). It also plays an important role in the incorporation of M2 into virions (Park et al., 1998), and it can elicit antibodies that can neutralize virion infectivity (Fiers et al., 2004; Feng et al., 2006). Thus, this domain has the potential to be used as an influenza A virus universal vaccine. However, M2e is covered by the HA and NA proteins in intact virions, and it is therefore unable to react effectively with immune effector cells, making it poorly immunogenic (Lamb et al., 1985; Jegerlehner et al., 2004; Feng et al., 2006). In attempts to enhance its immunogenicity, M2e has been linked to different carrier molecules, such as the TLR5 ligand flagellin (Huleatt et al., 2007; Mardanova et al., 2015), the surface of virus-like particles (VLPs) (Matic et al., 2011), and as a fusion peptide on β-glucuronidase (Firsov et al., 2015) and HA (Stanekova et al., 2011). De Filette et al. (2005) used hepatitis B virus core protein (HBc) as a carrier, and either fused M2e to the C-terminus of HBc or inserted it into the immunodominant loop of HBc. Immunization of mice with this HBc-M2e candidate vaccine resulted in 100% protection against lethal challenge (Fiers et al., 2004; De Filette et al., 2005, 2008).

Influenza antigens have been successfully produced using insect cell expression systems. Baculovirus-expressed influenza vaccines can be produced rapidly, which is necessary when taking into account that currently circulating strains need to be assessed annually. The resultant insect cell-expressed product is considered to be both safe and of a high standard (Safdar and Cox, 2007; Cox, 2008). Producing M2e tetramers in baculovirus insect cell expression systems and accumulating them into nanoclusters results in increased humoral and cellular immunogenicity (Wang et al., 2014).

As an alternative approach, numerous researchers have successfully produced influenza antigens in tobacco plants – and in particular HA (D'Aoust et al., 2008; Shoji et al., 2008; Mortimer et al., 2012). Plant expression systems are advantageous due to their ability to carry out post-translational modifications similar to other eukaryotes, and in rapidly producing large quantities of antigen. This system is more economical since plants do not need expensive material for growth and maintenance, and it reduces concerns over human pathogens contaminating vaccine preparations (Nemchinov and Natilla, 2007; Gomez et al., 2009; Rybicki, 2009). Nemchinov and Natilla (2007) developed a candidate plant-based universal influenza vaccine by displaying the M2e epitope on the capsid protein (CP) of cucumber mosaic virus Ixora strain (CMV-Ix) in a potato virus X (PVX)-based vector. The resulting plant-based chimeric CMV capsids reacted specifically to antibodies raised against the synthetic M2e, indicating the potential of this system.

This study forms part of an ongoing initiative to investigate and establish a rapid-response vaccine production platform to deal with future influenza pandemics in South Africa. The highly pathogenic H5N1 influenza A virus, with a mortality rate of up to 60% in humans (http://www.who.int/), was chosen for this purpose. To date, human-to-human transmissions are limited but the likelihood of H5N1 mutating into a strain that facilitates transfer necessitates efficient pandemic vaccination preparedness strategies and awareness (Webster and Govorkova, 2006; Imai et al., 2013; Kaplan and Webby, 2013). To date, potential plant-produced subunit HA vaccines (Mortimer et al., 2012) as well as HA DNA vaccine candidates (Mortimer et al., 2013) have been created as part of this South African initiative.

Fusion of small or soluble proteins to a signal sequence that drives the assembly and sequestration of the protein bodies (PBs) (Torrent et al., 2009) can significantly increase the immunogenicity of the protein. Accordingly, for this study, we investigated the fusion of a consensus M2e sequence to a signal tag (Zera®, ERA Biotech) that targets the recombinant protein to form PBs. This tag has previously been shown to dramatically improve yields of non-structural papillomavirus protein (E7SH) in plants as well as to have adjuvant properties (Whitehead et al., 2014). Zera® had adjuvant activity, whether fused to E7SH or simply added to it, which could be highly advantageous in a vaccine candidate. The N-terminal proline-rich domain of maize γ-zein (Zera® tag) is characterized by 4 domains: these are a 19 amino acid signal peptide, a repeat domain containing 8 repeats of the sequence PPPVHL, a Pro-X domain including numerous proline residues as well as a hydrophobic cysteine rich C-terminal domain, and lastly, a sequence that retains it in the endoplasmic reticulum (ER). This allows for the formation of membrane-bound PBs, thereby protecting the recombinant protein from proteolytic degradation inside the host cells, and concentrating and sequestering the recombinant protein. PBs are easily concentrated and partially purified by simple centrifugation, and the polypeptide is generally water soluble in the presence of reducing agents, which greatly facilitates and simplifies recombinant protein purification (Torrent et al., 2009).

For this study, we determined the feasibility of creating an immunogenic M2e candidate vaccine by transiently expressing Zera®M2e PBs in tobacco plants and in insect cells. The protein expressed in *Sf*21 insect cells by recombinant baculovirus was used as an experimental control reagent. M2e was fused to Zera® to enable protein purification and to increase the immunogenicity of the protein (Torrent et al., 2009; Whitehead et al., 2014). Codon optimization has been widely used to enhance protein expression in heterologous systems (Gouy and Gautier, 1982). The Zera®M2e gene was codon optimized for this study such that it either displayed characteristics of abundantly expressed plant genes (*Nicotiana benthamiana* codon optimized) or human genes (human-codon optimized), as we have found it necessary to empirically determine codon preferences in other studies (Maclean et al., 2007). Immunogenicity of the PBs isolated from plants was established by immunization of mice, and analysis of the immune sera for the presence of antibodies against M2e.

#### MATERIALS AND METHODS

#### Identification and Synthesis of Zera**®**M2e Peptide

Multiple avian and human influenza A H5N1 virus M2e sequences were retrieved from GenBank and aligned using Clustal X (Larkin et al., 2007). From these, four sequences were selected (EU590690, EU590684, EU146698, and EU263984) to create a consensus sequence, SLLTEVETPTRNEWECRCSDSSD, which corresponded exactly to the EU263984 sequence [A/human/China/GD02/2006(H5N1)] (**Figure 1**). To create the Zera®M2e sequence, the Zera® sequence (ERA Biotech), including an enterokinase cleavage site (DDDDK) (Whitehead et al., 2014), was synthesized and inserted upstream of the M2e consensus sequence. The Zera®M2e nucleotide sequence was both plant- and human-codon optimized, and synthesized by GeneArt (Germany).
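As an illustration of how a consensus can be derived from pre-aligned sequences, the sketch below applies a simple column-wise majority rule. This is an assumption on our part and not necessarily the procedure used with Clustal X in this study. Only the consensus sequence is taken from the text; the two variant sequences are hypothetical placeholders carrying single substitutions and do not represent the actual GenBank entries.

```python
from collections import Counter
from typing import Sequence

# Consensus reported in the text (23 amino acids).
REPORTED_CONSENSUS = "SLLTEVETPTRNEWECRCSDSSD"


def majority_consensus(aligned: Sequence[str]) -> str:
    """Return the most frequent residue in each column of equal-length,
    pre-aligned sequences. Ties go to the residue seen first in the column."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


# HYPOTHETICAL toy alignment: two copies of the reported consensus plus two
# made-up variants with one substitution each (not the real GenBank data).
toy_alignment = [
    REPORTED_CONSENSUS,
    "SLLTEVETPTRNGWECRCSDSSD",  # hypothetical E13G variant
    "SLLTEVETLTRNEWECRCSDSSD",  # hypothetical P9L variant
    REPORTED_CONSENSUS,
]

consensus = majority_consensus(toy_alignment)
print(consensus, len(consensus))
```

With real data, the four selected GenBank sequences would first be aligned and then passed to the function in place of the toy alignment.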

#### Construction of Plant Recombinant Vector

For plant expression, both plant- and human-codon optimized Zera®M2e were cloned into the plant expression vector pTRAc (GenBank ID: AY027531) using *Afl*III and *Xho*I restriction enzyme sites (pTRAc-Zera®M2e). The pTRAc vector allows protein expression in the cytoplasm (Maclean et al., 2007), with subsequent targeting to the ER by the Zera® sequence. The plasmids were transformed into *E. coli* DH5α and recombinant bacterial colonies were confirmed by PCR using Zera®M2e primers (Fw: 5′-ATGCGGGTGCTGCTGGTC-3′ and Rev: 5′-TGGGTGTCTCCACCTCGGTC-3′). The integrity of the plasmids was confirmed by restriction digest mapping with *Xho*I and *Afl*III restriction enzymes as well as sequencing. The pTRAc-Zera®M2e plasmids were subsequently transformed into *Agrobacterium tumefaciens GV3101* via electroporation (Maclean et al., 2007).
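Basic properties of the screening primers listed above can be checked programmatically. The following sketch computes primer length, GC content and the reverse complement; the primer sequences are taken from the text, whereas the helper names and dictionary keys are our own.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Return the GC content of a DNA sequence in percent."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences (5' to 3') as given in the text; labels are illustrative.
PRIMERS = {
    "ZeraM2e_Fw": "ATGCGGGTGCTGCTGGTC",
    "ZeraM2e_Rev": "TGGGTGTCTCCACCTCGGTC",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_percent(seq):.1f}%, "
          f"reverse complement = {reverse_complement(seq)}")
```

The reverse complement of the reverse primer gives the sequence as it would appear on the coding strand of the amplified fragment, which is convenient when locating the primer within the construct.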

#### Expression and Purification of Zera®M2e in *Nicotiana benthamiana*

*Agrobacterium tumefaciens*-mediated transient expression in *N. benthamiana* plants was performed according to Mortimer et al. (2012). In short, recombinant plant- and human-codon optimized pTRAc-Zera®M2e plasmids were vacuum infiltrated into 6-week-old plants, with co-infiltration of *A. tumefaciens* LBA4404 (pBIN-NSs) containing the NSs gene silencing suppressor of tomato-spotted wilt virus (TSWV) (Marcel Prins, Laboratory of Virology, Wageningen, The Netherlands); this enhances gene expression by suppressing post-transcriptional gene silencing (Takeda et al., 2002).

Infiltrated plant tissue was harvested 8 days post infiltration (dpi), followed by grinding in liquid nitrogen with a mortar and pestle, after which the extract was homogenized in the Zera® extraction buffer [100 mM Tris (pH 8), 0.5 M NaCl, 50 mM MgCl<sub>2</sub>, and 10 mM EDTA]. The homogenate was filtered through two layers of Miracloth (Merck) and purified by ultracentrifugation (Beckman SW32Ti rotor) at 21,600 × *g* for 2 h through a 60% sucrose cushion.

Protein expression was assessed by western blot analysis, with proteins resolved on 15% SDS-PAGE gels. The primary antibody, rabbit anti-Zera® polyclonal antibody (provided by ERA Biotech, Spain), was used at a dilution of 1:7000 together with a secondary goat anti-rabbit antibody (Sigma, Steinheim, Germany) at 1:7000 dilution. Nitro blue tetrazolium chloride/5-bromo-4-chloro-3-indolyl phosphate (NBT/BCIP) phosphatase substrate (KPL, Gaithersburg, MD, USA) was used for detection. Plant-produced Zera®M2e was quantified by comparing band intensities of the Zera®M2e to known bovine serum albumin (BSA) concentrations by gel densitometry (Gene Genius Bioediting system, Syngene).
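Quantification against BSA standards is commonly done by fitting a standard curve to the band intensities of the standards and interpolating the sample band. The sketch below shows one way to do this with an ordinary least-squares line. It is not the software workflow used in this study, and all numerical values in it are made-up placeholders, not measured data.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_intensity(intensity: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the protein amount in a band."""
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (intensity - intercept) / slope


# HYPOTHETICAL example values (not data from this study):
bsa_ng = [250.0, 500.0, 1000.0, 2000.0]          # BSA loaded per lane (ng)
bsa_intensity = [1300.0, 2500.0, 5200.0, 9900.0]  # band intensity (arbitrary units)
sample_intensity = 3600.0                         # monomer band of the sample

slope, intercept = fit_line(bsa_ng, bsa_intensity)
estimate_ng = amount_from_intensity(sample_intensity, slope, intercept)
print(f"Estimated protein in band: {estimate_ng:.0f} ng")
```

The estimate is only meaningful if the sample band intensity lies within the range covered by the standards. Converting it to mg/kg fresh weight additionally requires the loaded volume, the total extract volume and the mass of leaf tissue extracted.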

#### Construction and Expression of Zera**®**M2e in Insect Cells

For insect cell expression, plant- and human-codon optimized Zera®M2e was cloned into the pFastBac Dual vector (Invitrogen, Carlsbad, CA, USA) between the polyhedrin (PPH) promoter and Tn7L terminator using *Eco*RI and *Pst*I restriction sites, resulting in pFastBac-Zera®M2e. Recombinant plasmids were screened by PCR with pFastBac primers (Fw: 5′-GATGGTGGGACGGTATGAATAATCC-3′ and Rev: 5′-GGTATTGTCTCCTTCCGTGTTTGA-3′). The integrity of the plasmids was confirmed by plasmid mapping with *Eco*RI and *Pst*I restriction enzymes and sequencing. Recombinant bacmid DNA was obtained by transposition of pFastBac-Zera®M2e into *E. coli* DH10Bac according to the manufacturer's instructions (Invitrogen, Carlsbad, CA, USA).

Recombinant baculoviruses (rBV) containing plant- and human-codon optimized Zera®M2e were generated in *Sf*21 cells, and plaque assays to determine rBV titers were performed according to the Bac-to-Bac© baculovirus expression system manufacturer's protocols (Invitrogen, Carlsbad, CA, USA). TC plates were stained with 1 g/ml neutral red solution (Sigma, Steinheim, Germany) to visualize individual plaques. Protein expression and purification analyses were as described for the plant-produced proteins.

FIGURE 1 | Avian and human influenza A H5N1 virus M2e sequences retrieved from GenBank and aligned using Clustal X. EU590690 turkey, EU590684 houbara bustard, EU263984 human, and EU146698 human. The 23 amino acid ectodomain is indicated by the red square. Differences in the amino acid sequence are indicated in different colors.

#### Animal Trials and Serum Analysis

Only the plant-produced Zera®M2e PB yields were judged to be sufficient for animal trials. Accordingly, 20 female Balb/c mice (7 weeks old) were divided into two groups: (a) plant-produced Zera®M2e PB and (b) PBS negative control group. A dose of 4.5 μg Zera®M2e PB was administered intramuscularly (I.M.) to mice, into each anterior *tibialis* muscle. Four doses were administered at 2-week intervals on days 0, 14, 28, and 31. Pre-vaccination serum was collected 3 days prior to vaccination. Following vaccination, sera were collected before each boost (on days 14, 28, and 31), and stored at −20°C until further analysis. Eleven days after the final dose, animals were euthanized (day 42). The animal experiments were approved by the University of Cape Town's (UCT) Animal Ethics Committee (HSFAEC 009/001).

Western blots were performed to determine the presence of Zera®M2e-specific antibodies in the mouse sera. Our insect cell- and plant-produced Zera®M2e PB samples were resolved separately on 15% SDS-PAGE gels, followed by transfer onto a nylon membrane (Amersham Biosciences, UK) by semi-dry blotting (Bio-Rad, Hercules, CA, USA). The membranes were cut into individual strips and were incubated in 1:5000 dilutions of serum from mice injected with pTRAc-Zera®M2e PB and PBS, respectively. The secondary goat anti-mouse antibody (Sigma, Steinheim, Germany) was used at a 1:7000 dilution. As a positive control, the commercial rabbit polyclonal anti-M2 antibody (ab65086) (Abcam, Cambridge, UK) was used at a 1:5000 dilution followed by the secondary goat anti-rabbit antibody (Sigma), at a 1:7000 dilution.

To assess whether the sera bound not only Zera® but also M2e, western blots were performed with a construct encoding multiple M2e copies (5xM2e) fused with the avian influenza H5N1 (A/Vietnam/1194/2004 H5N1) transmembrane and cytosolic tail (5xM2e\_tHA). Crude plant extract containing the 5xM2e\_tHA protein was resolved on 12% SDS-PAGE gels, and the protein was probed on blots with a 1:100 sera dilution. As a positive control, 5xM2e\_tHA was probed with 1:5000 anti-M2 monoclonal antibody (14C2). Alkaline phosphatase-conjugated goat anti-mouse IgG was used as a secondary antibody at a 1:10,000 dilution.

#### RESULTS

#### Expression of Recombinant Protein in *N. benthamiana*

Gene-codon optimization has been shown to significantly enhance gene expression in plants, and especially if the genes have a high GC content. The Zera®M2e gene was plant- and human-codon optimized and successfully cloned into plant expression vector pTRAc. Protein expression analysis revealed that both plant- and human-codon optimized Zera®M2e PB were successfully expressed in *N. benthamiana* 8 dpi. In western blots, an expected band of 17 kDa corresponding to the M2e epitope fused with Zera® tag was observed. There were no differences in the expression levels between the plant- and human-codon optimized Zera®M2e, indicating that codon optimization did not influence gene translation and expression in the plant expression system (**Figure 2A**).

#### Expression of Recombinant Protein in Insect Cells

For insect cell expression, plant- and human-codon optimized Zera®M2e were successfully cloned into pFastBac Dual vector under the control of the polyhedrin (PPH) promoter and expressed in *Sf*21 insect cells 72 h post infection (hpi) (**Figure 2B**). The plant-codon optimized gene expression was weak as assessed by western blot compared to the human-codon optimized gene. In this case, human-codon usage was more favorable for the insect cell expression system.

#### Purification of Recombinant Protein

The PBs can be easily purified by centrifugation through a sucrose cushion (Torrent et al., 2009). Ultracentrifugal concentration through a sucrose cushion showed that the Zera®M2e PBs formed insoluble pellets. The pelleted insect cell-produced Zera®M2e was detectable on western blots but was undetectable on Coomassie-stained SDS-PAGE gels, indicating low protein concentration. The concentrated and partially purified plant-produced Zera®M2e PB was visible on Coomassie-stained SDS-PAGE gels and was quantified by comparing it to known BSA concentrations run on stained gels (**Figure 3**). Bands corresponding to the monomeric (17 kDa), dimeric (34 kDa), and tetrameric (51 kDa) forms of the Zera®M2e PB were observed (**Figure 3**). Only the band corresponding to the putative monomeric form was used for quantitation, to give estimated expression levels ranging from 125 to 205 mg Zera®M2e PB/kg fresh weight (FW) as measured by densitometry. The higher yielding plant-produced Zera®M2e PB was used for animal trials.

FIGURE 2 | Western blots of human- and plant-codon optimized recombinant Zera®M2e protein bodies (PBs). Protein expression was detected with anti-Zera® antibody and goat anti-rabbit antibody. (A) Transient expression in *N. benthamiana*. Plants were harvested 8 days post infiltration. Lane 1 contained PageRuler™ Prestained protein ladder (Fermentas), lane 2 contained the Zera®M2e plant-codon optimized protein, lane 3 contained the Zera®M2e human-codon optimized, and lane 4 contained non-infiltrated control plant. (B) Expression of recombinant baculovirus in *Sf*21 insect cells, 72 h post infection (hpi), using the Bac-to-Bac© baculovirus expression system. Lane 1 contained the PageRuler™ Prestained protein ladder (Fermentas), lane 2 contained the recombinant human-codon-optimized Zera®M2e PB, lane 3 is the recombinant plant-codon-optimized Zera®M2e, and lane 4 contained the negative cell lysate transfection control.

#### Animal Serum Analysis

Ten mice were immunized with the plant-produced Zera®M2e PB. No clinical manifestation was observed after the injection of the PBs in any of the mice. After the fourth immunization, the sera were analyzed for the presence of Zera®M2e-specific antibodies. Western blots indicated that the immune sera successfully reacted with the plant-produced Zera®M2e PB at a dilution of 1:5000, indicating a high Zera®M2e-specific antibody titer. Sera from PBS-inoculated control mice did not bind Zera®M2e PB (Figure S1 in Supplementary Material).

When the mouse sera were tested with our plant-produced Zera®M2e PB, a high level of background was observed on the western blots even when the sera were diluted 1:40,000, which made it difficult to identify the expected band sizes (Figure S1 in Supplementary Material). This is because our candidate vaccine was produced in plants, and therefore the sera also reacted with high specificity to plant proteins contaminating the preparations. In an attempt to lower the background for more accurate results, the sera were then analyzed using our insect cell-produced Zera®M2e PB (**Figure 4**). The immune sera reacted far more specifically, with a distinct band corresponding to the monomeric Zera®M2e PB.

To determine whether the mouse sera reacted specifically with the M2e peptide, the sera were tested by western blot against the plant-produced 5xM2e\_tHA, which contains no Zera® sequence (**Figure 5**). The serum strongly reacted with the 35-kDa multiple protein dimer and a 70-kDa trimeric form, affirming that the mouse sera contained antibodies not only against the Zera® peptide but also against the M2e sequence.

FIGURE 4 | Detection of Zera®M2e-specific antibodies in plant-produced Zera®M2e immunized mice sera. Insect cell-produced Zera®M2e protein bodies (PBs) were loaded in each lane, and then the membrane was cut into strips and probed with individual mouse serum. Lane 1 contains PageRuler™ Prestained protein ladder (Fermentas), lane 2 contains the positive control, i.e., Zera®M2e PB detected with a commercial M2 primary antibody (1:5000) (ab65086, Abcam, Cambridge, UK). Lanes 3–6 were detected with mice sera from mice immunized with plant-produced Zera®M2e PB (1:5000) and Lane 7 contains negative control sera: mice immunized with PBS.

FIGURE 5 | Confirmation of specificity of the Zera®M2e mice sera against the fused M2e plant-produced protein. Equal volumes of plant-produced 5xM2e\_tHA proteins were loaded in each lane. The membranes were cut in the middle and each half was probed either with the commercial M2 antibody or mice serum from mice immunized with plant-produced Zera®M2e PB. Lanes 1 and 5 contain *Agrobacterium*-infiltrated crude plant extract (negative control), Lanes 2 and 4 contain the 5xM2e\_tHA plant crude extract, and Lane 3 contains PageRuler™ Prestained protein ladder (Fermentas). Lanes 1 and 2 were detected with commercial M2 (14C2) primary antibody (1:5000). Lanes 4 and 5 were detected with mice sera from mice immunized with plant-produced Zera®M2e PB (1:100).

#### DISCUSSION

In this study, we attempted to overcome the limitations of the current influenza vaccines with regards to antigenic shift and drift associated with HA and NA (Webster et al., 1992), by focusing on the influenza A virus M2e peptide as a universal vaccine candidate, due to its high degree of conservation since the emergence of the highly virulent Spanish flu pandemic strain of 1918 (Fiers et al., 2004). Generally, M2e vaccine candidates are produced either as chemically conjugated molecules or by genetic fusion to a variety of carrier proteins, such as virion- or VLP-forming proteins (Mozdzanowska et al., 2003; Ionescu et al., 2006; Tompkins et al., 2007; Denis et al., 2008; Matic et al., 2011; Stanekova et al., 2011; Ravin et al., 2012). In our study, we fused the M2e with a self-aggregating signal tag, Zera®. The Zera® fusion tag has the ability to segregate the protein from the plant secretory pathway into membrane-delimited organelles in the ER: the retention and the self-assembly of γ-zein lead to PB formation, stabilizing the protein inside vesicles formed by invagination of the ER lumen (Mainieri et al., 2004; Torrent et al., 2009). This is known to facilitate protein purification, as shown by the result of ultracentrifugation of the plant-produced Zera®M2e PBs through a 60% sucrose cushion: PBs were concentrated in the pellet fraction, with very little soluble protein present.

As the 2009 H1N1 "swine flu" pandemic illustrated, South Africa will have to rely on developed countries for vaccine supplies during an outbreak (Mortimer et al., 2012, 2013). We are therefore systematically investigating the feasibility of establishing rapid-response platforms to produce influenza virus vaccine candidates in South Africa, by implementing novel strategies that are both cost-effective and can be more readily up-scaled than traditional egg-based vaccines (Mortimer et al., 2012).

In the present work, we were able to express Zera®M2e in both plants and insect cells. The protein was expressed as monomers, dimers, and tetramers. This is in line with previous work, where M2e was expressed in a baculovirus expression system (Holsinger and Lamb, 1991; Sugrue and Hay, 1991). In plants, we achieved a yield of 125–205 mg/kg FW for the Zera®M2e. These yields are generally higher than those achieved by other researchers. Nemchinov and Natilla (2007) previously reported yields of avian influenza A CMV-M2e fusion protein expression in *N. benthamiana* of 6–8 mg/kg leaf tissue. Matic et al. (2011) expressed influenza M2e epitopes on chimeric HPV VLPs in plants and obtained 78–120 mg/kg plant material. Most recently, Firsov et al. (2015) expressed M2e fused with β-glucuronidase in transgenic duckweed, and obtained yields from 90 to 970 mg/kg plant FW.

When the serum from mice vaccinated with plant-produced Zera®M2e PB was used to detect the same protein produced in insect cells on western blots, the serum specifically detected only the Zera®M2e protein (**Figure 4**), indicating that the plant-produced antigen was immunogenic. Use of the plant-produced protein, however, showed that the sera also reacted with other plant proteins that co-purified with the Zera®M2e PBs. Zera® fusions are known to form large polymers that are resistant to degradation (Torrent et al., 2009). The contaminating proteins could include chloroplastic, ribosomal, cytoplasmic, cytoskeleton, and mitochondrial plant proteins (Joseph et al., 2012). It is clear that using an expression system for antibody detection different from the one used for antigen production allowed more efficient and clearer detection of the protein. Future work will include ultracentrifugation on a sucrose density step gradient (Whitehead et al., 2014) to remove unwanted plant protein: this should then lead to more specific immune responses only to the Zera®M2e PB, with reduced reaction against other plant proteins. Testing the Zera®M2e antigen for protection against multiple strains of influenza would be advantageous but was not possible in this investigation.

While we successfully produced Zera®M2e PB in insect cells in this work, and this was valuable as a reagent, our plant-produced Zera®M2e PB had by far the highest yield, with the different soluble forms (monomeric, dimeric, trimeric, and tetrameric) probably contributing significantly to the immunogenicity of the candidate vaccine. Bands corresponding to the monomeric, dimeric, and tetrameric forms have also been detected in previous M2e studies (Holsinger and Lamb, 1991; Sugrue and Hay, 1991).

The production of M2e as a fusion product in plants is not new; Nemchinov and Natilla (2007) expressed M2e in *N. benthamiana* via a plant viral vector as an internal fusion in the CP of Cucumber mosaic virus; Ravin et al. (2012) used a similar vector to express M2e fused to the HBc and showed protection against lethal challenge in mice. Matic et al. (2011) used the pEAQ-*HT* vector to transiently express M2e and a shortened version (M2e2–9) as fusions to HPV-16 L1 protein; the longer peptide was presented on capsomers and VLPs, and reacted with anti-M2e antibodies. Petukhova et al. (2013) produced recombinant tobacco mosaic virus particles presenting M2e on their surfaces and showed these were highly immunogenic and protective.

However, our use of the Zera® peptide is novel; moreover, the high yield and the immunogenicity of the product, coupled with what is almost certainly a far easier purification protocol than for any of the fusions detailed above, make it a valuable addition to our rapid-response armory against pandemic influenza. Our work is therefore further proof that plants are a viable vehicle for high-level expression of a peptide vaccine known to elicit broad-spectrum protection against influenza A viruses.

To conclude, we successfully expressed Zera®M2e PB in both insect cell and plant expression systems. Our plant-produced Zera®M2e elicited M2e-specific antibodies in mice, which indicates that it has potential as a candidate universal influenza vaccine. Future work will determine the efficacy of these antibodies and their potential for broad-spectrum protection against influenza strains.

#### AUTHOR CONTRIBUTIONS

SM created the expression constructs, carried out transient expression experiments, and drafted the manuscript; LM carried out insect cell expression, supervised the work, and participated in drafting of the manuscript; FP created and expressed 5xM2e\_tHA in plants; IH designed, coordinated, and supervised the study and participated in drafting of the manuscript; ER initiated the study and participated in drafting of the manuscript.

#### ACKNOWLEDGMENTS

We would like to thank Rodney Lucas and Noel Markgraaff at UCT for their assistance in the animal trial and Pau Marzabal for critical reading of the manuscript. Rainer Fischer (Fraunhofer Institute, Germany) is thanked for kindly providing the pTRA vectors, and ERA Biotech, Spain, for the Zera® sequence and anti-Zera® antibody. Owen Karimanzira is thanked for producing the recombinant insect cell expression vectors.

#### FUNDING

This research was funded by the Poliomyelitis Research Foundation (PRF 07/02) and Medical Research Foundation (MRC), South Africa. EM was funded by the National Research Foundation.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fbioe.2015.00197


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Mbewana, Mortimer, Pêra, Hitzeroth and Rybicki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Production of the Main Celiac Disease Autoantigen by Transient Expression in *Nicotiana benthamiana*

*Vanesa S. Marín Viegas<sup>1</sup>, Gonzalo R. Acevedo<sup>2</sup>, Mariela P. Bayardo<sup>3</sup>, Fernando G. Chirdo<sup>3</sup> and Silvana Petruccelli<sup>1</sup>\**

*<sup>1</sup> Centro de Investigación y Desarrollo en Criotecnología de Alimentos (CIDCA), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) – Departamento de Ciencias Biológicas, Facultad de Ciencias Exactas, Universidad Nacional de La Plata (UNLP), La Plata, Argentina, <sup>2</sup> Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina, <sup>3</sup> Instituto de Estudios Inmunológicos y Fisiopatológicos (IIFP), Consejo Nacional de Investigaciones Científicas y Técnicas – Departamento de Ciencias Biológicas, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Argentina*

#### *Edited by:*

*Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Chiara Lico, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy Linda Avesani, University of Verona, Italy*

> *\*Correspondence: Silvana Petruccelli silvana@biol.unlp.edu.ar*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 15 October 2015 Accepted: 16 November 2015 Published: 01 December 2015*

#### *Citation:*

*Marín Viegas VS, Acevedo GR, Bayardo MP, Chirdo FG and Petruccelli S (2015) Production of the Main Celiac Disease Autoantigen by Transient Expression in Nicotiana benthamiana. Front. Plant Sci. 6:1067. doi: 10.3389/fpls.2015.01067*

Celiac Disease (CD) is a gluten-sensitive enteropathy that remains widely undiagnosed, and implementation of massive screening tests is needed to reduce the long-term complications associated with untreated CD. The main CD autoantigen, human tissue transglutaminase (TG2), is a challenge for the different expression systems available since its cross-linking activity affects cellular processes. Plant-based transient expression systems can be an alternative for the production of this protein. In this work, a transient expression system for the production of human TG2 in *Nicotiana benthamiana* leaves was optimized and the reactivity of plant-produced TG2 in a CD screening test was evaluated. First, a subcellular targeting strategy was tested. Cytosolic, secretory, endoplasmic reticulum (C-terminal SEKDEL fusion) and vacuolar (C-terminal KISIA fusion) TG2 versions were transiently expressed in leaves and recombinant protein yields were measured. ER-TG2 and vac-TG2 levels were 9- to 16-fold higher than their cytosolic and secretory counterparts. As a second strategy, TG2 variants were co-expressed with a hydrophobic elastin-like polymer (ELP) construct encoding 36 repeats of the pentapeptide VPGXG, in which the guest residues X were V and F in an 8:1 ratio. Protein bodies (PB) were induced by the ELP, with a consequent twofold increase in accumulation of both ER-TG2 and vac-TG2. Subsequently, ER-TG2 and vac-TG2 were produced and purified using immobilized metal ion affinity chromatography. Plant-purified ER-TG2 and vac-TG2 were recognized by three anti-TG2 monoclonal antibodies that bind different epitopes, showing that the plant-produced antigen has immunochemical characteristics similar to those of human TG2. Lastly, an ELISA was performed with sera of CD patients and healthy controls. Both vac-TG2 and ER-TG2 were positively recognized by IgA of CD patients while they were not recognized by serum from non-celiac controls. These results confirmed the usefulness of plant-produced TG2 to develop screening assays. In conclusion, the combination of a subcellular sorting strategy with co-expression of a PB-inducing construct was sufficient to increase TG2 protein yields. This type of approach could be extended to other problematic proteins, highlighting the advantages of plant-based production platforms.

Keywords: human tissue transglutaminase, celiac disease, secretory pathway, vacuolar sorting, elastin-like polymer

#### INTRODUCTION

Celiac disease is a chronic disorder caused by the ingestion of prolamins from wheat, barley, rye, or oats, which affects around 1% of the general population (Abadie et al., 2011). Although CD presents an extremely heterogeneous clinical spectrum, it is estimated that only 1 out of 7 patients is actually diagnosed (Rubio-Tapia et al., 2009); massive serological screening could therefore increase the detection of CD and improve celiac patients' quality of life. Among diagnostic serological tests, detection of anti-TG2 IgA has the highest sensitivity and specificity and is therefore the single most efficient serological test for screening at-risk populations (Husby and Murray, 2014). Cost considerations are important for the implementation of widespread screening programs (Viljamaa et al., 2005); for that reason, the availability of a low-cost, high-quality source of TG2 antigen for massive CD serological screening assays is attractive.

Different human TG2 production systems have been assayed, such as *Escherichia coli* (Shi et al., 2002), insect cells (Osman et al., 2002), human embryonic kidney cells (Sardy et al., 1999), and plant cells (Sorrentino et al., 2005, 2009). Low recombinant protein yields are generally obtained since TG2 cross-linking activity has toxic effects on cell growth and development (Griffin et al., 2002). In tobacco Bright Yellow 2 (BY-2) cells, TG2 accumulation was higher when the protein was targeted to the apoplast (apo) than when it was fused to a chloroplast (chl) sorting signal, and partial degradation of both apo-TG2 and chl-TG2 was detected (Sorrentino et al., 2005). No transgenic BY-2 clones were obtained for the cytosolic TG2 construct, probably due to a toxic effect of this enzyme, which might prevent regeneration and growth of the transformed BY-2 cells (Sorrentino et al., 2005). In transgenic tobacco plants, apo-TG2 accumulated at higher levels than TG2 sorted to the cytosol and chl compartments (Sorrentino et al., 2009). Although plant-produced TG2 was recognized by serum IgA of celiac patients (Sorrentino et al., 2005), no further efforts to produce TG2 in plants were reported. Plants are a cost-effective platform for the production of high-value recombinant proteins for industrial and clinical applications (Gleba and Giritch, 2014; Makhzoum et al., 2014). Several plant-produced recombinant proteins are commercially available, including glucocerebrosidase (the first plant-made biologic approved by the US Food and Drug Administration), veterinary pharmaceuticals, technical enzymes, research reagents, media ingredients, and cosmetic products (Sack et al., 2015). Plant-based platforms are diverse in terms of the plant species, the cells or organs used for production, and the technology used to achieve over-expression of the gene of interest (Sack et al., 2015).
Numerous factors have a profound impact on protein accumulation levels, among them protein stability (Egelkrout et al., 2012). Stability can be increased using different subcellular targeting strategies such as accumulating proteins in the apo, ER, vacuoles, chl, or on the surface of oil bodies, as well as by expression in different organs such as seeds, leaves, and hairy roots (Hood et al., 2014) or by fusion to insoluble tags such as ELPs, hydrophobins, and zeins (Floss et al., 2010; Joensuu et al., 2010). Within the endomembrane system, several strategies improve foreign protein accumulation, such as ER retention (Wandelt et al., 1992; Fiedler et al., 1997), vacuolar sorting (Stoger et al., 2005; Shaaltiel et al., 2007) and inhibition of apoplast protease activity (Benchabane et al., 2008; Goulet et al., 2012). The results obtained using subcellular targeting strategies or fusion to insoluble tags depend on the nature of the heterologous protein. These stabilizing strategies have not yet been tested to increase TG2 accumulation in plants.

In this work, two strategies were evaluated to increase TG2 accumulation in tobacco leaves. The first was compartmentalization of TG2 inside the secretory pathway to avoid both its cytosolic toxicity and degradation in the apoplast. TG2 was fused to a SEKDEL ER retention sequence or to the KISIA C-terminal VSS (Petruccelli et al., 2007). The second strategy was the induction of protein body formation in the ER by co-expression of TG2 with a novel, highly hydrophobic elastin-like polymer. TG2 expressed using these strategies was purified and its performance as an antigen in serological assays was evaluated.

#### MATERIALS AND METHODS

#### Antibodies

Three anti-TG2 monoclonal antibodies (mAb) named 2G3, 5G7, and 4E1 were produced by Dr. F Chirdo. The mAbs recognize different epitopes: 2G3 (aa 314–329), 5G7 (aa 548–558), and 4E1 (aa 637–648) (Di Niro et al., 2005). Human serum samples were obtained using the conventional procedure for CD diagnosis. Patients signed a written consent. The study was approved by the Ethical Committees of the Hospital Interzonal General de Agudos (HIGA), General San Martin de La Plata, Buenos Aires, Argentina. Celiac patients were diagnosed on the basis of the clinical findings, histological examination, and positive serology. Negative control sera were taken from healthy non-celiac volunteers. Serum samples from CD patients and controls belong to a serum bank characterized at the Instituto de Estudios Inmunológicos y Fisiopatológicos (IIFP). Other antibodies utilized for this study were mouse anti-GFP antibody (# G1546, Sigma–Aldrich, St. Louis, MO, USA), rabbit anti-RFP antibody (# R10367, Thermo Scientific Pierce, Rockford, IL, USA), goat anti-mouse IgG (H+L) secondary antibody biotin conjugate (# 31802, Thermo Scientific Pierce, Rockford, IL, USA), goat anti-rabbit IgG (H+L) secondary antibody biotin conjugate (# 31820, Thermo Scientific Pierce, Rockford, IL, USA), high sensitivity streptavidin HRP conjugate (# 21130, Thermo Scientific Pierce, Rockford, IL, USA) and mouse anti-human IgA secondary antibody HRP conjugate (#SA135467, Thermo Scientific Pierce, Rockford, IL, USA).

**Abbreviations:** TG2, tissue transglutaminase 2; CD, Celiac disease; CLSM, confocal laser scanning microscopy; d.p.i., days post-infiltration; ELP, elastin-like polymer; ER, endoplasmic reticulum; IgA, immunoglobulin A; IM, infiltration medium; GFP, green fluorescent protein; mAb, monoclonal antibody; RFP, monomeric red fluorescent protein; RLS, Rubisco large subunit; RT, room temperature; VSS, vacuolar sorting signal; CT-VSS, carboxyl-terminal VSS; ssVSS, sequence specific VSS; TSP, total soluble protein.

#### Plants

*Nicotiana benthamiana* plants were grown in a growth chamber at 22°C under 16-h-light/8-h-dark cycles. Six- to eight-week-old plants were used for each set of experiments, and infiltrations were performed in the third and fourth leaves, counting from the top down and starting with the youngest mature leaf.

#### TG2 Constructs

The cDNA encoding TG2 (GenBank Accession Number GI 50593093) from the Caco 2 (human colonic carcinoma) cell line (Bayardo et al., 2012) was amplified with the oligonucleotide primers forward F-SP-TG2 (GTGGGTACCCAATGGCCGAGGAGCTGGTC) and reverse R-TG2HisSal (CCCGTCGACGTGGTGGTGGTGGTGGTGGGCGGGGCCAATGATGAC), designed to place TG2 in frame with the sequence encoding a mouse immunoglobulin heavy chain signal peptide (SP; MGWSWIFLFLLSGAAGGY) from pRTL202 (Restrepo et al., 1990) and to introduce the sequence encoding a six-histidine purification tag at the 3′ end of the TG2 sequence. The PCR product was digested with *Kpn* I and *Sal* I and cloned into pRTL-G-KDEL and pRTL-G-KISIA (Petruccelli et al., 2007) to produce p-ER-TG2 and p-vac-TG2, respectively. To generate a secretory version, TG2 was amplified with F-SP-TG2 and R-HISSTOP (CCCGTCGACTCAGTGGTGGTGGTGGTGGTGGGC) and cloned into p-secG (Petruccelli et al., 2007). The cassettes CaMV35S promoter::SP-TG2-His (STOP or KDEL)::Nos transcription terminator signal were released from these vectors by digestion with *Hind* III and *Sal* I and subcloned into the binary vector pBLTi-121 (Pagny et al., 2000). To produce the cytosolic version, the SP-encoding sequence was removed by releasing TG2 from p-sec-TG2-His with *Kpn* I and *Sal* I and subcloning it into pBLTi-121 digested with the same enzymes.
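The primer design described above can be sanity-checked in a few lines. The following is a minimal sketch, not part of the original methods: it takes the three primer sequences as printed (with line-break spaces removed), looks for the standard *Kpn* I (GGTACC) and *Sal* I (GTCGAC) recognition sites, and checks that the reverse primers carry the antisense of six consecutive His codons. The function names are illustrative, and the check ignores reading frame.

```python
# Primers as printed in the text (5'->3'); adjacent string literals are
# concatenated by Python and only mark the segments for readability.
PRIMERS = {
    "F-SP-TG2": "GTGGGTACCCAATGGCCGAGGAGCTGGTC",
    "R-TG2HisSal": "CCCGTCGAC" "GTGGTGGTGGTGGTGGTG" "GGCGGGGCCAATGATGAC",
    "R-HISSTOP": "CCCGTCGAC" "TCA" "GTGGTGGTGGTGGTGGTG" "GGC",
}

# Standard recognition sequences of the two cloning enzymes.
SITES = {"KpnI": "GGTACC", "SalI": "GTCGAC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sites_in(seq: str) -> list:
    """List the names of the recognition sites present in seq."""
    return [name for name, site in SITES.items() if site in seq]


def encodes_his6(reverse_primer: str) -> bool:
    """True if the sense strand contains six consecutive CAC (His) codons.

    Illustrative check only: it does not verify the reading frame.
    """
    return "CAC" * 6 in reverse_complement(reverse_primer)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, sites_in(seq), reverse_complement(seq))
```

Read on the sense strand, R-HISSTOP places a TGA stop codon directly after the His codons, whereas R-TG2HisSal has none, which is consistent with the KDEL or KISIA signal being supplied in frame by the recipient vector.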

The vacuolar version of TG2 was amplified with the primers F-SP (CACCATGGGCTGGAGCTGGATC) and R-Ter (CTAGGCGGGGCCAATGATGAC) and the PCR product was directionally cloned into pENTR/D TOPO (Life Technologies, S.A. Buenos Aires, Argentina) to introduce attL1 and attL2 recombination sites, and was finally transferred to the binary destination vector pGWB2 (Nakagawa et al., 2007) using LR site-specific clonase. To fuse ER-TG2 to a fluorescent protein, the sequence encoding mCherry from ER-Cherry (Nelson et al., 2007) was amplified with F-SP-Cherry (CACCCTCGAGCCGACCTCGACCTAGAAAGAGAAGGAGGACAGTCCTTCGACGTCCATGGTGAGCAAGGGCGAGGAG) and R-Cherry (TATTAAGCTTGGTACCCAGGTGGACCTGGAGGCCATGCCGCCGGTGGAGTG) and the PCR product was cloned into pENTR/D TOPO. The ER-TG2 sequence was then released from p-TG2 with *Kpn* I and *Hind* III and introduced into pENTR-SP-RFP cut with the same enzymes. Finally, the LR recombination reaction was performed between pENTR-SP-ER-RFP-TG2 and pGWB2 (Nakagawa et al., 2007) to obtain ER-RFP-TG2 in a binary vector.

#### Elastin-like Polymer Constructs

A novel synthetic ELP gene encoding 36 repeats of the pentapeptide VPGXG, in which the guest residues X were V and F in an 8:1 ratio [V8F1], was purchased from GenScript Corp (Piscataway, USA). This synthetic ELP gene was designed to be expressed in the plant secretory pathway by introducing the sequences encoding a mouse immunoglobulin G1 SP (Petruccelli et al., 2006) and the ER retention sequence SEKDEL upstream and downstream, respectively (Supplementary Figure S1). To facilitate purification, the sequence encoding a hexahistidine tag was introduced between those encoding ELP[V8F1] and SEKDEL. To allow further multimerization, *Pfl* MI and *Bgl* I restriction sites were incorporated at the beginning and end of the ELP[V8F1]-encoding sequence. Codon use was optimized for *Nicotiana benthamiana*<sup>1</sup>. Potential splice sites and inverted repeats were reduced and GC content was adjusted by a GenScript in-house algorithm. The ELP construct was introduced into the plant binary expression vector pEAQ-HT-DEST1 (Sainsbury et al., 2009) using LR site-specific clonase (Life Technologies, S.A. Buenos Aires, Argentina).
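To make the ELP[V8F1] composition concrete, the sketch below builds a 36-repeat VPGXG polypeptide with guest residues V and F in an 8:1 ratio, i.e., 32 V and 4 F guests. The order of the guest residues is an assumption made purely for illustration (eight V repeats followed by one F repeat, four times over); the actual arrangement is the one given in Supplementary Figure S1, and the signal peptide, hexahistidine tag and SEKDEL are not included.

```python
def elp_sequence(n_repeats: int = 36, guest_block: str = "VVVVVVVVF") -> str:
    """Build an ELP of n_repeats VPGXG pentapeptides.

    guest_block gives the guest residues (X) of one repeating block; the
    default encodes the 8:1 V:F ratio in a hypothetical order.
    """
    if n_repeats % len(guest_block) != 0:
        raise ValueError("n_repeats must be a multiple of the guest block length")
    guests = guest_block * (n_repeats // len(guest_block))
    return "".join(f"VPG{x}G" for x in guests)


if __name__ == "__main__":
    elp = elp_sequence()
    print(len(elp), "residues;", elp.count("F"), "Phe guest residues")
    print(elp[:45])  # first nine pentapeptides
```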

#### *Agrobacterium* Infiltration

*Agrobacterium tumefaciens* strain GV3101 harboring the pGWB2-sec-TG2, pGWB2-cyto-TG2, pGWB2-ER-TG2, pGWB2-vac-TG2, pGWB2-ER-RFP-TG2, or Tomato Bushy Stunt Virus P19 (Voinnet et al., 2003) binary plasmids was grown in YEB medium (5 g/L beef extract, 1 g/L yeast extract, 5 g/L peptone, 5 g/L sucrose, 2 mM MgSO4) at 28°C overnight. Cells were then centrifuged at 5,000 × *g* and resuspended in IM [10 mM MgCl2, 10 mM 2-(*N*-morpholino)ethanesulfonic acid (MES) pH 5.7, 200 μM acetosyringone], adjusting the *Agrobacterium* OD600 to 0.3 for TG2 constructs, 0.2 for pEAQ1-ELP, and 0.1 for ER-GFP (Haseloff et al., 1997), sec-RFP (Scabone et al., 2011), and P19. The bacterial suspensions were incubated for at least 3 h at 28°C prior to infiltration. Leaf infiltration was performed manually using disposable, needleless 1 mL syringes, with which pressure was applied between ribs on the abaxial face of the leaf.
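The OD600 adjustment described above is a simple dilution (C1·V1 = C2·V2). As a minimal sketch, the helper below computes how much resuspended culture and how much infiltration medium to combine for a given target; the target values are those listed in the text, while the stock OD600 and final volume in the example are hypothetical.

```python
# Target OD600 values taken from the text.
TARGET_OD600 = {
    "TG2 constructs": 0.3,
    "pEAQ1-ELP": 0.2,
    "ER-GFP": 0.1,
    "sec-RFP": 0.1,
    "P19": 0.1,
}


def dilution_volumes(stock_od: float, target_od: float, final_ml: float):
    """Return (mL of stock suspension, mL of infiltration medium).

    Uses C1*V1 = C2*V2 and assumes OD600 scales linearly with cell density
    over the range involved.
    """
    if stock_od <= 0 or target_od <= 0 or final_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_od > stock_od:
        raise ValueError("stock is too dilute for the requested target")
    stock_ml = target_od * final_ml / stock_od
    return stock_ml, final_ml - stock_ml


if __name__ == "__main__":
    # Hypothetical example: a resuspension measured at OD600 = 1.5, 10 mL needed.
    for construct, target in TARGET_OD600.items():
        stock_ml, medium_ml = dilution_volumes(1.5, target, 10.0)
        print(f"{construct}: {stock_ml:.2f} mL culture + {medium_ml:.2f} mL IM")
```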

#### Enzyme-linked Immunosorbent Assay (ELISA)

*Nicotiana benthamiana* leaf samples were collected at 5 d.p.i., since maximum TG2 levels were detected at this time in an expression kinetics experiment. At least three biological replicates per sample were performed. Each replicate contained five leaf pieces of the infiltrated tissue from different plants. Each sample was analyzed by ELISA in triplicate. Leaves were frozen in liquid nitrogen and ground. The powder was suspended in extraction buffer (20 mM sodium phosphate, 0.5 M sodium chloride, pH 7.5) for 15 min at 4°C. After centrifugation at 10,000 × *g*, protein concentration in the supernatant was measured by Bradford assay (Bradford, 1976) using bovine serum albumin as standard. Plastic wells were coated with the same amount of total leaf extract (∼100 μg TSP) or 1 μg of leaf-purified TG2 in PBS with 5 mM CaCl2 by passive adsorption at 4°C overnight (Sulkanen et al., 1998). Plates were then blocked with 5% non-fat milk solution for 1 h at 37°C. Then 100 μL of a 1:500 dilution of TG2 mAb 2G3 (Di Niro et al., 2005) or of a 1:50 diluted pool of 12 patient sera was added and incubated for 16 h at 4°C as primary antibody. Biotin-conjugated anti-mouse IgG followed by HRP-conjugated streptavidin, or HRP-conjugated anti-human IgA, was then applied for 1 h at 37°C as the secondary antibody. Color was developed with tetramethylbenzidine (TMB)–peroxidase substrate (Kirkegaard and Perry Laboratories, Gaithersburg, MD, USA) and the optical density was measured at a wavelength of 630 nm.

<sup>1</sup>http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=4100

#### Western Blotting

Total soluble proteins were extracted from *N. benthamiana* agroinfiltrated leaves by grinding 500 mg of leaves in 0.5 mL SDS-PAGE sample buffer (72 mM Tris-HCl, 2% SDS, 10% glycerol, 5% β-mercaptoethanol, pH 6.8). The crude extract was boiled for 5 min and centrifuged (20 min, 13,000 rpm, RT). Supernatant samples (20 μL) were separated by electrophoresis on a 10% polyacrylamide gel. For quantitative analysis, the amount of crude extract was adjusted to load the same amount of RLS. After gel electrophoresis, proteins were transferred to nitrocellulose membranes. Blocked membranes (5% non-fat milk solution) were incubated with a 1:500 dilution of anti-TG2 mAb 2G3, 5G7, or 4E1 (Di Niro et al., 2005) overnight at 4°C, followed by incubation with a biotinylated goat anti-mouse IgG antibody (1:20,000) for 1 h at 37°C, and with HRP-conjugated streptavidin (1:20,000) for 30 min at 37°C. Finally, chemiluminescence was generated by addition of a substrate containing 1.25 mM luminol (#A8511 Sigma–Aldrich, St. Louis, MO, USA), 200 μM *p*-coumaric acid (#C9008, Sigma–Aldrich, St. Louis, MO, USA), 0.09% [v/v] H2O2, 0072% [v/v] DMSO, and 100 mM Tris-HCl pH 8.5, and the luminescent signal was captured using X-ray film (Amersham Hyperfilm ECL, GE Healthcare Life Sciences, Argentina). The film was scanned and protein band intensity was measured using ImageJ software<sup>2</sup>.

#### TG2 Purification

Transglutaminase 2 was purified from leaves co-infiltrated with *Agrobacterium* suspensions carrying pGWB2-ER-TG2 or pGWB2-vac-TG2 and pEAQ1-ELP. Tobacco leaves (20 g) were frozen in liquid nitrogen and ground into a fine powder using a mortar and pestle. The powdered tissue was extracted with 20 mL of extraction buffer for 15 min at 4°C. After centrifugation at 10,000 × *g*, the supernatant was incubated for 1 h at 4°C with 50 μL Ni Sepharose (GE Healthcare Life Sciences, Argentina). Proteins bound to Ni Sepharose were retained using Micro Bio-Spin columns (Bio-Rad, Hercules, CA, USA) and washed three times with extraction buffer, and TG2 was eluted with 0.2 M NaH2PO4 pH 4.5 and neutralized with NaHCO3. Protease inhibitor cocktail (Roche Applied Science, Mannheim, Germany) was added to the eluted TG2 fraction. Protein concentration was measured using a NanoDrop 2000 UV/Visible Spectrophotometer (Thermo Scientific, Rockford, IL, USA) and purity was analyzed by SDS-PAGE.

#### Protein Quantification

ER-GFP and sec-RFP were quantified by fluorometry of leaf extracts with a Synergy microplate reader (Biotek Instruments, Inc., Winooski, VT, USA). Two hundred μL/well of each extract was added to a 96-well black plate. GFP fluorescence was measured using excitation at 485 nm and emission at 516 nm, and RFP fluorescence using excitation at 563 nm and emission at 610 nm. One arbitrary unit of fluorescence was assigned to each sample obtained from leaves in the absence of ELP, and samples both with and without ELP were then normalized to these extracts. TG2 was quantified using a calibration curve obtained with the plant-purified TG2. To this end, different amounts of purified TG2 were loaded onto the gel to build the calibration curve, and 20 μL of the total leaf extracts to be quantified were loaded on the same gel. After transfer, an immunoblot was performed as described above. The signal obtained for the 20 μL of total leaf extract was converted into ng TG2 using the calibration curve.
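The calibration-curve step can be illustrated with an ordinary least-squares fit. The sketch below assumes a linear relation between band intensity and the amount of purified TG2 loaded; the standard amounts and intensities are invented for illustration and are not data from this study. The final conversion to μg/g fresh weight additionally assumes that the extract volume equals the 0.5 mL of sample buffer added to 500 mg of leaf tissue.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intensity_to_ng(intensity, slope, intercept):
    """Invert the calibration line to get ng of TG2 in the loaded lane."""
    return (intensity - intercept) / slope


def ng_per_lane_to_ug_per_g(ng, loaded_ul=20.0, extract_ul=500.0, tissue_g=0.5):
    """Scale ng per lane to ug per g fresh leaf.

    Assumption: the whole extract has the volume of the added buffer.
    """
    return ng / loaded_ul * extract_ul / tissue_g / 1000.0


if __name__ == "__main__":
    standards_ng = [25, 50, 100, 200]        # hypothetical purified TG2 loads
    intensities = [1100, 2100, 4100, 8100]   # hypothetical densitometry values
    slope, intercept = fit_line(standards_ng, intensities)
    sample_ng = intensity_to_ng(3300, slope, intercept)
    print(f"{sample_ng:.1f} ng per lane, "
          f"{ng_per_lane_to_ug_per_g(sample_ng):.2f} ug/g fresh leaf")
```

Only sample intensities that fall within the range spanned by the standards should be converted this way, since film-based chemiluminescence is not linear over a wide range.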

#### Confocal Analysis and Image Processing

Abaxial epidermal cells of agroinfiltrated leaves were observed between 3 and 7 d.p.i. with a Confocal Laser Scanning Microscope (CLSM) LEICA TCS SP5 AOBS (Advanced Microscopy Facility, FCE, UNLP), using a 63× oil immersion objective. GFP was excited at 488 nm (Argon 100 mW Laser) and detected in the 496–532 nm range. RFP was excited at 543 nm (HeNe 1.5 mW Laser) and detected in the 570–630 nm range. Simultaneous detection of GFP and RFP was performed by combining the settings indicated above in a sequential scanning set-up, as instructed by the manufacturer. All images shown were acquired using the same photomultiplier gain and offset settings. Post-acquisition image processing was performed with ImageJ software<sup>2</sup>.

#### Statistical Analysis

All statistical analyses were carried out using Prism 6 (GraphPad Software, GraphPad Inc., La Jolla, CA, USA). One-way ANOVA and Tukey's multiple comparisons test were used to identify means with statistically significant differences. Alternatively, Student's *t*-test was performed. A *p*-value < 0.05 was regarded as statistically significant.
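For readers unfamiliar with the test, the one-way ANOVA *F* statistic is the ratio of between-group to within-group mean squares. The sketch below computes it in plain Python for hypothetical replicate values; it is not the Prism procedure used in this study, and it stops at the *F* statistic (the *p*-value and Tukey's post hoc comparisons would normally come from a statistics package).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical ELISA readings, three biological replicates per construct.
    readings = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    f_stat, df_b, df_w = one_way_anova_f(readings)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```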

#### RESULTS

#### Accumulation of TG2 Fused to Different Sorting Signals

Although TG2 from the Caco2 cell line was cloned into an *E. coli* expression vector and different conditions were assayed to produce it, low recovery levels were obtained. For that reason, in this work we attempted to produce TG2 in plant cells. To this end, four versions of TG2 in a plant expression binary vector were obtained: cytosolic (cyto-TG2), secretory (sec-TG2), ER-TG2, and vacuolar (vac-TG2); their schematic representations are shown in **Figure 1A**. For vacuolar sorting, the C-terminal KISIA sequence from the amaranth 11S storage globulin was added to TG2 (Petruccelli et al., 2007). To facilitate purification, a six-histidine tag was also fused to TG2 (**Figure 1A**). The four TG2 constructs were introduced into *A. tumefaciens* GV3101 and leaves of *N. benthamiana* were infiltrated with these agrobacteria. Five days after infiltration, leaves were collected and accumulation of TG2 was measured by ELISA using anti-TG2 mAb 2G3. **Figure 1B** shows that the highest accumulation levels were obtained for ER-TG2 and vac-TG2, and that cyto-TG2 and sec-TG2 levels were approximately 9- to 16-fold lower than those of the ER and vac variants. No significant differences were observed between ER-TG2 and vac-TG2, suggesting that fusion to either the SEKDEL or the KISIA C-terminal signal is equally efficient at increasing TG2 accumulation levels. A kinetic analysis of ER-TG2 and vac-TG2 expression showed that maximum accumulation was reached at 5 d.p.i. (Supplementary Figure S2).

<sup>2</sup>http://rsb.info.nih.gov/ij/

FIGURE 1 | Subcellular targeting strategies tested to stabilize TG2 in leaves. (A) Schematic representation of the TG2 constructs used for *Agrobacterium*-mediated transient expression in *Nicotiana benthamiana* leaves. Cyto-TG2 is a cytosolic form of TG2. Sec-TG2, ER-TG2, and Vac-TG2 are introduced into the secretory pathway with the murine signal peptide (SP) from the gamma 1 antibody chain; SEKDEL is an ER retention signal; KISIA is a CT vacuolar targeting signal of the amaranth 11S globulin. Scheme is not drawn to scale. (B) Enzyme-linked Immunosorbent Assay (ELISA) of TG2 fused to the different sorting signals. Microwells were coated with the same amount of total leaf extract overnight at 4°C. After blocking, anti-TG2 mAb 2G3 was added, followed by incubation with a biotin-conjugated anti-mouse antibody and then with HRP-conjugated streptavidin, and developed with TMB peroxidase substrate. Three biological replicates (each replicate containing five leaf disks of the infiltrated tissue from a different plant) were used for ELISA. Error bars represent the standard error of the mean (SEM). ∗∗∗∗Denotes statistically significant difference by Tukey's multiple comparisons test (*P <* 0.001). (C) Western blot of TG2 fused to the different sorting signals. Expression levels were measured by scanning densitometry of Western blots developed with 2G3 mAb, with a minimum of three independent experiments. The amount of total extract was adjusted using RLS stained with Coomassie Brilliant Blue R-250 as loading control.

Total leaf extracts were also analyzed by Western Blot using mAb 2G3 as detection antibody (**Figure 1C**). Caco2 total extract was also loaded into the gel as a positive control. Cyto-TG2 and sec-TG2 were not detected, while the ER-TG2 (81.4 kDa) and vac-TG2 (81.2 kDa) variants had the expected size, suggesting that both forms accumulated stably in leaves. Quantification by immunoblot followed by densitometry analysis showed no significant differences in the accumulation levels of ER-TG2 and vac-TG2, in good correlation with the ELISA test.

#### Effect of a Novel Hydrophobic ELP on the Accumulation of TG2

A novel ELP construct consisting of 36 repeats of the pentapeptide VPGXG, in which the guest residues X were V and F in an 8:1 ratio (Supplementary Figure S1), was used in this work. It has a theoretical inverse phase transition temperature (T*t*) of 18°C (Urry et al., 1992) and is expected to be insoluble under *N. benthamiana* growing conditions. The ELP was sorted to the ER by means of a secretory SP and the SEKDEL ER retention sequence. The ability of this ELP to induce protein body formation was analyzed by CLSM, using GFP-HDEL (Haseloff et al., 1997) and sec-RFP (Scabone et al., 2011) as fluorescent markers of the secretory pathway. **Figure 2A** shows that ER-GFP had a normal reticular pattern in the absence of ELP and that sec-RFP localized to the borders of the cell with an irregular pattern typical of apoplast accumulation. ER-RFP-TG2 also had a reticular pattern, but its accumulation produced clusters on the borders of the cells (**Figure 2B**, arrows). A partial co-localization was observed between ER-GFP and ER-RFP-TG2 in the merge panel: ER-RFP-TG2 was located mainly in the clusters, while ER-GFP had a uniform distribution (**Figure 2B**). Accumulation of the ER-RFP-TG2 fusion was approximately 8.4 ± 1.8 μg/g fresh leaf tissue. When ER-GFP and sec-RFP were co-expressed with ELP, large ER-PBs were observed, predominantly close to the nuclei and in cortical regions (**Figure 2C**). Co-localization of sec-RFP in transit with ER-GFP was found, as can be observed in yellow in the merge panel (**Figure 2C**). Nevertheless, ELP did not affect the final localization of sec-RFP, since the apoplast pattern was also observed for this construct (Supplementary Figure S3). Some of the ELP-induced PBs were larger than the nucleolus (**Figure 2C**). When ER-RFP-TG2 was co-expressed with ELP, small (less than 1 μm) and large PBs were also observed (**Figure 2D**), but only a partial co-localization with ER-GFP was detected (**Figure 2D**, merge panel). PBs in the nuclear region were heterogeneous in size and composition, since some of them contained only ER-RFP-TG2 and others only ER-GFP. In contrast, in the cortical region, a complete co-localization of green ER-GFP PBs and red ER-RFP-TG2 PBs was observed (Supplementary Figure S4). The integrity of ER-RFP-TG2 was confirmed by immunoblot analysis to ensure that the red fluorescence corresponded to the entire fusion protein (Supplementary Figure S5).

(A,B) and 5 μm (C,D).

statistically significant difference by Student's *t*-test (*P <* 0.01).

To evaluate the impact of ELP on the accumulation of ER-GFP, sec-RFP, ER-TG2, and vac-TG2, a Western Blot was performed. The same amount of total leaf extract was loaded into each lane, and the amount of RLS in each lane is shown as a loading control. The intensities of the GFP, RFP, and TG2 bands were quantified as detailed in Section "Materials and Methods," and the results are shown in **Figure 3**. For ER-GFP, a 2.0-fold increase in the accumulation level was observed upon ELP-induced PB formation, while no significant differences were found in sec-RFP levels (**Figure 3**, upper panel). Expression of ELP raised the accumulation levels of ER-TG2 and vac-TG2 from 9.5 ± 1.5 and 9.9 ± 1.4 to 20.9 ± 2.1 and 24.4 ± 2.3 μg/g fresh leaf tissue, respectively (**Figure 3**, lower panel). In conclusion, ELP induced 2.1- and 2.5-fold increases in the accumulation of ER-TG2 and vac-TG2, respectively.
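As a rough cross-check of the fold increases quoted above, the ratio of the reported mean accumulation levels can be recomputed directly. The sketch below is illustrative only: the function name is hypothetical, and the error propagation assumes that the two measurements are independent and that the reported ± values can be combined in quadrature, which the text does not state. The ratios of the means come out at about 2.2 and 2.5, close to the 2.1- and 2.5-fold values reported from densitometry.

```python
import math

def fold_change(treated, treated_err, control, control_err):
    """Return (ratio, propagated error) for treated/control.

    Assumption (not from the paper): the two measurements are
    independent, so relative errors add in quadrature.
    """
    ratio = treated / control
    rel_err = math.sqrt((treated_err / treated) ** 2
                        + (control_err / control) ** 2)
    return ratio, ratio * rel_err

# Accumulation levels in ug/g fresh leaf tissue, as reported in the text:
# ((mean, error) without ELP, (mean, error) with ELP).
levels = {
    "ER-TG2": ((9.5, 1.5), (20.9, 2.1)),
    "vac-TG2": ((9.9, 1.4), (24.4, 2.3)),
}

results = {}
for name, ((c, c_err), (t, t_err)) in levels.items():
    results[name] = fold_change(t, t_err, c, c_err)
    ratio, err = results[name]
    print(f"{name}: {ratio:.1f}-fold (+/- {err:.1f})")
```

The small gap between the recomputed ER-TG2 ratio and the published figure may simply reflect rounding or a ratio taken per replicate rather than from the means.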

#### ER-TG2 and vac-TG2 as Antigens for CD Diagnosis

ER-TG2 and vac-TG2 were purified from leaves using immobilized metal ion affinity chromatography. To test their usefulness as antigens, their recognition by the mAbs 2G3, 5G7, and 4E1, which recognize different TG2 epitopes, was analyzed by immunoblot. Both vac-TG2 and ER-TG2 were recognized by these antibodies, as shown in **Figure 4A**, confirming that although TG2 is localized in the cytosol in humans, its introduction into the plant secretory pathway does not affect the structure of the epitopes recognized by these mAbs. To test the performance of the plant-purified ER-TG2 and vac-TG2 versions in a CD screening test, an ELISA was performed using a pool of 12 sera from CD patients and from healthy control donors (**Figure 4B**). The pool of CD sera recognized both ER-TG2 and vac-TG2, with a large, significant difference relative to the value obtained for the healthy donors. Recognition of plant-purified TG2 was also assayed by Western Blot (**Figure 4C**), confirming that the full-length ER- and vac-TG2 variants were recognized by CD sera, while no recognition occurred with control sera. Therefore, the plant-produced ER- and vac-TG2 versions conserved the epitopes recognized by IgA from sera of celiac individuals. These results point to the usefulness of plant-produced TG2 for developing CD screening tests.

# DISCUSSION

In this work we showed that TG2 fused to the C-terminal KDEL or KISIA sorting signals accumulated at significantly higher levels than the cytosolic and apoplast versions, confirming the value of testing different subcellular compartments as a strategy to increase accumulation levels. In leaves, the ER is a favorable destination for many proteins, such as vicilin, single-chain and full-length antibodies, and a truncated version of influenza hemagglutinin (Wandelt et al., 1992; Schouten et al., 1996; Fiedler et al., 1997; Petruccelli et al., 2006; Mortimer et al., 2012), and although fusion to KDEL/HDEL signals does not always enhance recombinant protein accumulation, it is frequently used in subcellular targeting strategies (Boothe et al., 2010; Hood et al., 2014). In contrast with ER retention, sorting of foreign proteins to the central vacuole of leaf cells has been less studied as a sorting strategy. The plant vacuole is one of the largest subcellular compartments and stores ions and metabolites (Marty, 1999). Although it is considered a hostile environment for foreign protein accumulation (Hood et al., 2014), some proteins accumulate at high levels in central vacuoles, such as glucocerebrosidase in carrot cells (Shaaltiel et al., 2007), IgG in tobacco BY2 cells (Misaki et al., 2011), human alpha-mannosidase in tobacco leaves (De Marchis et al., 2013), human complement factor C5a in both *N. tabacum* and *N. benthamiana* leaves (Nausch et al., 2012), and human collagen in tobacco leaves (Stein et al., 2009). Other proteins, such as human IgG1 and IgG4, have a higher apoplast yield than the accumulation obtained in the ER and vacuoles in carrot suspension cell cultures (Shaaltiel et al., 2006). For synthetic spider silk, the ER-targeted variant was more abundant than the vacuolar variant in *Arabidopsis* leaves (Yang et al., 2005). In contrast to leaf tissues or suspension cultures, there are numerous examples of foreign proteins that stably accumulate in storage vacuoles in seeds (Stoger et al., 2005; Khan et al., 2012).

Although the nature of the VSS employed to target foreign proteins to vacuoles might have an impact on protein stability, there are examples of enhanced accumulation for heterologous proteins fused to different types of VSSs. For example, stable deposition was obtained for proteins fused to different CT-VSSs, such as tobacco chitinase A CT (DLLVDTM) for glucocerebrosidase (Shaaltiel et al., 2007), phaseolin CT (AFVY) for human complement factor C5a (Nausch et al., 2012), and amaranth 11S globulin CT (KISIA) in this work. Furthermore, ssVSSs also had a positive impact on the accumulation of foreign proteins: aleurain ssVSS (NPIR) and sporamin ssVSS (NPIRL) improved the build-up of human collagen (Stein et al., 2009) and IgG (Misaki et al., 2011), respectively. Increased vacuolar accumulation was also observed for human alpha-mannosidase, whose N-terminal sequence sorted it directly to the vacuole, bypassing the Golgi apparatus (De Marchis et al., 2013). It is believed that VSSs are necessary for post-Golgi sorting to vacuoles and that the traffic pathway can affect foreign protein stability, since the pH varies along the secretory pathway (Neuhaus and Martinoia, 2011; Shen et al., 2013). The different data published for vacuolar-sorted foreign proteins indicate that reaching a stable accumulation in vacuoles depends more on the nature of the foreign protein than on the sorting signal used.

patient (Pool+) and normal healthy donors (Pool–). The protein loaded on the gel: ER-TG2, vac-TG2, caco extract is indicated at the top of the immunoblot.

In leaves, several fusion tags such as ELP, hydrophobins (HFBI), and the N-terminal proline-rich region of gamma zein (Zera) improve accumulation levels of recombinant fusion partners (Conley et al., 2011). These three fusion tags are thought to increase accumulation of recombinant proteins by inducing the formation of leaf PBs, which are similar to the prolamin PBs found in seeds and in which recombinant proteins are protected from proteolytic degradation (Conley et al., 2009). Recently, it has been reported that PB formation is not exclusively promoted by the fusion tags and that protein accumulation level is a critical factor in triggering PB formation (Saberianfar et al., 2015). Both ER-GFP and fungal xylanases unfused to these tags are able to induce PB formation when their accumulation levels are higher than 0.2% of TSP (Saberianfar et al., 2015). In this work, we showed that ER-RFP-TG2 induced PB formation in the cortical region of leaf epidermal cells although its accumulation level was lower than 0.2% of TSP. Even though protein accumulation level is an important factor in PB formation, other characteristics of the heterologous protein, such as aggregation tendency or recruitment of foldases and chaperones, might also be involved in this phenomenon.

A novel, highly hydrophobic ELP, (VPGXG)36 [where X = V:F in ratio 8:1], with a theoretical Tt of 18°C and insoluble under *N. benthamiana* growing conditions, was used in this work. Other synthetic ELPs expressed in plants have the VPGVG repeat motif, which is less hydrophobic and has a higher Tt (Floss et al., 2010). As shown here, ELP[V8F1] induced PB formation and increased yields of TG2. Several reports have described the effect of fusing an ELP tag to foreign proteins on recombinant protein yields (Floss et al., 2010), but the impact of co-expressing an ELP not fused to the protein of interest has scarcely been studied. Here, we showed that the number and size of PBs are increased by ELP[V8F1] co-expression and that the induced PBs are heterogeneous, since co-localization of ER-RFP-TG2 and ER-GFP was complete in the cortical region but partial in the nuclear region of the cells. The existence of PBs with distinct composition could be attributed to different PB dynamics, taking into account that PBs are highly mobile organelles dependent on the actomyosin motility system (Conley et al., 2009, 2011). We showed that co-expression of the hydrophobic ELP[V8F1] increased accumulation of ER-GFP, ER-TG2, and vac-TG2 by 2.0-, 2.1-, and 2.5-fold, respectively. Similar results were obtained for secretory versions of erythropoietin and human interleukin-10 co-infiltrated with GFP-ELP and GFP-Hydrophobin I constructs (Saberianfar et al., 2015). However, sec-RFP accumulation levels were not statistically different in the absence and presence of ELP, although formation of PBs and partial retention of sec-RFP inside these organelles were observed. Taken together, our results indicate that the effect of ELP on protein accumulation depends on the nature of the protein of interest and its final destination in the cell. The combination of subcellular targeting and PB induction by ELP co-expression was sufficient to increase TG2 accumulation levels in transient expression assays and to allow further purification of both vac-TG2 and ER-TG2 using metal ion affinity chromatography.
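To make the (VPGXG)36 composition concrete, the following minimal sketch builds a sequence with that repeat structure and counts the guest residues. The ordering of V and F used here (eight V repeats followed by one F repeat, cycled four times) is an assumption chosen only to satisfy the 8:1 ratio; the actual arrangement is the one given in the authors' Supplementary Figure S1, and the function name is hypothetical.

```python
def build_elp(n_repeats=36, guest_pattern="VVVVVVVVF"):
    """Build a (VPGXG)n sequence, cycling X through guest_pattern.

    guest_pattern is an ASSUMED ordering that yields V:F = 8:1;
    the real order is defined in the paper's supplementary data.
    """
    repeats = []
    for i in range(n_repeats):
        guest = guest_pattern[i % len(guest_pattern)]
        repeats.append(f"VPG{guest}G")
    return "".join(repeats)

elp = build_elp()
guests = elp[3::5]           # residue X of every pentapeptide
n_val = guests.count("V")    # guest valines
n_phe = guests.count("F")    # guest phenylalanines
print(len(elp), n_val, n_phe)
```

Under this assumption the 36 repeats give a 180-residue polypeptide with 32 valine and 4 phenylalanine guest residues, which is the 8:1 ratio described in the text.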

Celiac disease has a high worldwide prevalence and is largely undiagnosed (Garnier-Lengline et al., 2015), since only one out of seven patients is actually diagnosed (Rubio-Tapia et al., 2009). In Argentina, the prevalence is as high as in central Europe (Gomez et al., 2001). No mass screening is performed, since available methods based on detection of TG2 autoantibodies are expensive. Human recombinant TG2 is required for high-sensitivity and high-specificity tests, since it performs better than guinea pig TG2 (Sardy et al., 1999; Rostom et al., 2006). Human TG2 produced in *E. coli* or insect cells is sold at 1,100 and 1,155 Euro/mg, respectively<sup>3</sup>. One of the advantages of plant expression systems is their low manufacturing cost compared to other expression platforms (Tusé et al., 2014). Considering yields of 20 mg/kg for ER-TG2 and vac-TG2, and costs similar to those reported for other plant-produced proteins (Tusé et al., 2014), production of TG2 by transient expression in tobacco should be considerably more economical, which would make this antigen more accessible for the development of mass screening tests. Importantly, in this work we demonstrated that both ER-TG2 and vac-TG2 were recognized by IgA from peripheral blood of CD patients and are therefore useful antigens for CD diagnosis. Further studies are being designed to scale up production of plant TG2 and to develop a local mass screening test.
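The figures in this paragraph can be tied together with a short calculation. The sketch below converts the accumulation levels reported in the Results (μg/g fresh leaf tissue) into mg/kg and expresses the nominal 20 mg/kg yield at the quoted commercial list prices. It is only an illustration: the function names are hypothetical, it assumes complete recovery during purification (which is not realistic), and it gives the list-price value of the antigen, not a production cost estimate.

```python
UG_PER_MG = 1000.0
G_PER_KG = 1000.0

def mg_per_kg(ug_per_g):
    """Convert ug/g to mg/kg (numerically identical units)."""
    return ug_per_g * G_PER_KG / UG_PER_MG

def list_price_equivalent(yield_mg_per_kg, price_eur_per_mg):
    """List-price value (Euro) of the TG2 in 1 kg of leaves.

    Assumes 100% recovery; real purification losses would lower it.
    """
    return yield_mg_per_kg * price_eur_per_mg

# Accumulation with ELP co-expression, as reported in the Results.
yields_ug_per_g = {"ER-TG2": 20.9, "vac-TG2": 24.4}
# Commercial prices quoted in the text (Euro/mg).
price_eur_per_mg = {"E. coli": 1100.0, "insect cells": 1155.0}

for name, value in yields_ug_per_g.items():
    print(f"{name}: {mg_per_kg(value):.1f} mg/kg fresh leaf tissue")

nominal = list_price_equivalent(20.0, price_eur_per_mg["E. coli"])
print(f"20 mg/kg at E. coli list price: {nominal:.0f} Euro per kg of leaves")
```

The conversion shows that the "20 mg/kg" used in the cost argument corresponds to the roughly 21 to 24 μg/g measured in crude extracts.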

# AUTHOR CONTRIBUTIONS

VV designed and performed experiments and analyzed data. GA built the initial ELP constructs; MB cloned the TG2 gene. FC and SP designed experiments, analyzed data, and supervised the project. All the authors have contributed significantly to the design, execution, and discussion of the manuscript.

# FUNDING

This research was supported by Agencia Nacional de Promoción Científica y Tecnológica (grants PICT2007-0049 and PICT2010-2366 to SP), Universidad Nacional de La Plata, Argentina (grant X630 to SP), and Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET) (grant PIP 189).

# ACKNOWLEDGMENTS

We thank Dr. Tsuyoshi Nakagawa (Department of Molecular and Functional Genomics, Center for Integrated Research in Science, Shimane University, Matsue, Japan) for providing pGWB2, Prof. George Lomonossoff (John Innes Centre, Norwich Research Park, Colney Lane, Norwich, UK) for providing pEAQ-DEST1, and Professor David Baulcombe (John Innes Centre, Norwich Research Park, Colney Lane, Norwich, UK) for providing P19.

<sup>3</sup>http://zedira.com/Transglutaminases/Transglutaminase-2\_3

SP and FC are researchers from CONICET and Professors of the Facultad de Ciencias Exactas-UNLP; VV and GA are doctoral fellows at CONICET. MB is a member of the Support Staff Career (CPA) of CONICET.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01067

#### REFERENCES


Marty, F. (1999). Plant vacuoles. *Plant Cell* 11, 587–600. doi: 10.1105/tpc.11.4.587


management of celiac disease. *Gastroenterology* 131, 1981–2002. doi: 10.1053/j.gastro.2006.10.004


serum of coeliac patients. *Int. J. Biochem. Cell Biol.* 37, 842–851. doi: 10.1016/j.biocel.2004.11.001


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Marín Viegas, Acevedo, Bayardo, Chirdo and Petruccelli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A Decade of Molecular Understanding of Withanolide Biosynthesis and *In vitro* Studies in *Withania somnifera* (L.) Dunal: Prospects and Perspectives for Pathway Engineering

Niha Dhar<sup>1</sup>, Sumeer Razdan<sup>1</sup>, Satiander Rana<sup>1†</sup>, Wajid W. Bhat<sup>1†</sup>, Ram Vishwakarma<sup>2</sup> and Surrinder K. Lattoo<sup>1</sup>\*

<sup>1</sup> Plant Biotechnology, CSIR – Indian Institute of Integrative Medicine, Jammu Tawi, India, <sup>2</sup> Medicinal Chemistry, CSIR – Indian Institute of Integrative Medicine, Jammu Tawi, India

Withania somnifera, a multipurpose medicinal plant, is a rich reservoir of pharmaceutically active triterpenoids that are steroidal lactones known as withanolides. Though the plant has been well characterized in terms of phytochemical profiles and pharmaceutical activities, limited attempts have been made to decipher the biosynthetic route and to identify the key regulatory genes involved in withanolide biosynthesis. This scenario limits biotechnological interventions for enhanced production of bioactive compounds. Nevertheless, recent trends in genomic, transcriptomic, proteomic, metabolomic, and in vitro studies have opened new vistas for pathway engineering of withanolide production. In recent years, various strategic pathway genes have been characterized, together with a significant number of regulatory studies, which point toward the development of molecular circuitries for production of key intermediates or end products in heterologous hosts. Another pivotal aspect, the redirection of metabolic flux to channel the precursor pool toward enhanced withanolide production, has also been addressed by deciphering decisive branch point(s) as robust targets for pathway modulation. With these perspectives, the current review provides a detailed overview of various studies undertaken by the authors, together with collated literature, on the molecular and in vitro approaches employed in W. somnifera for understanding its molecular network interactions in their entirety.

Keywords: *Withania somnifera*, withanolides, tissue culture, elicitor, medicinal plant, molecular cloning, secondary metabolites, pathway engineering

#### INTRODUCTION

Plants have long been known to allocate substantial resources toward developing chemical solutions to enhance survival strategies in the form of varied natural products. These natural products have been extensively used as pharmaceuticals, agrochemicals, pesticides, fragrance and flavor ingredients, and food additives. A great majority of these plant-derived natural products have long been the basis of many traditional medicines, and they continue to provide mankind with new remedies. Since time immemorial, medicinal plants and their extracts have been used by humans for the treatment of different ailments and diseases. For instance, oils of Cedrus species (Cedar) and Cupressus sempervirens (Cypress), Glycyrrhiza glabra (Licorice), Commiphora species (Myrrh), and Papaver somniferum (Poppy juice) are all still in use today for the treatment of ailments ranging from coughs and colds to parasitic infections and inflammation. Many modern drugs against various ailments are also based on the chemical structures of such plant-derived chemical products. During the period 2005–2007, the Food and Drug Administration introduced 13 new drugs of natural origin into the market, and more than 100 natural product-based drugs are in clinical studies (Li and Vederas, 2011). Today, of all drugs used in Western medicine, around 40–45% are natural products or compounds derived from them, and of these, 25% are obtained from plants. Moreover, the dominant role of natural products like podophyllotoxin derivatives (etoposide, teniposide, etoposide phosphate) acting against cancer (60%) and infectious diseases (75%) is increasingly recognized (Mander and Liu, 2010). In recent years, from a health perspective, protective dietary constituents in the form of plant-derived natural compounds have become a progressively more significant part of human nutrition research (Pandey and Rizvi, 2009; Choi et al., 2012).

#### *Edited by:*

Rosella Franconi, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### *Reviewed by:*

Antonio Ferrante, Università degli Studi di Milano, Italy Agnieszka Kielbowicz-Matuk, Institute of Plant Genetics Polish Academy of Science, Poland Gianfranco Diretto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy

#### *\*Correspondence:*

Surrinder K. Lattoo sklattoo@iiim.ac.in

#### *†Present Address:*

Satiander Rana, Genetics, Development and Cell Biology, Biorenewables Research Laboratory, NSF Engineering Research Center for Bio-renewable Chemicals, Iowa State University Ames, IA, USA; Wajid W. Bhat, Biotransformation, Scion ResearchNZ-Crown Research Institute, Rotorua, New Zealand

#### *Specialty section:*

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

*Received:* 04 August 2015 *Accepted:* 06 November 2015 *Published:* 27 November 2015

#### *Citation:*

Dhar N, Razdan S, Rana S, Bhat WW, Vishwakarma R and Lattoo SK (2015) A Decade of Molecular Understanding of Withanolide Biosynthesis and In vitro Studies in Withania somnifera (L.) Dunal: Prospects and Perspectives for Pathway Engineering. Front. Plant Sci. 6:1031. doi: 10.3389/fpls.2015.01031

Such a vast range of chemical entities, in the form of natural products that do not contribute directly to the growth and development of a plant, are termed secondary metabolites. Plant secondary metabolites like phenylpropanoids, terpenoids, and alkaloids play a significant role in plant survival under specialized ecological conditions, e.g., biotic and abiotic stresses. In contrast with primary metabolites, secondary metabolites are often restricted in distribution, and in many instances a specific secondary metabolite is associated with a specific taxonomic group or plant species. Medicinal plants have long been the basis of herbal drugs for the prevention and treatment of various ailments, and their medicinal properties are attributed to secondary metabolites (Croteau et al., 2000; Rao and Ravishankar, 2002). These herbal drugs have thus been in use for thousands of years in different cultures due to their potency, efficacy, low cost, and fewer side effects. With an ever-increasing global demand for herbal medicine, there is a requirement not only for large quantities of raw material from medicinal plants, but also for appropriate quality, in which active principles are available in the desired concentrations (Shahid et al., 2013). Additionally, because of their complex chemical structures, it is often difficult to produce natural compounds through synthetic chemistry, as the whole process is economically prohibitive. Thus, plants remain the sole sustainable natural resource for many medicinally important secondary metabolites.

The biosynthesis of plant secondary metabolites is tightly regulated by spatial and temporal cues that limit the levels of targeted secondary metabolites in plant tissues (Dhar et al., 2013). Furthermore, many secondary metabolites are species-specific in distribution, and the plant species in question may be restricted to a specific geographical location. Cumulatively, these issues may limit proper exploitation of plants for large-scale production of economically important secondary metabolites. It is therefore desirable to improve the level of secondary metabolites in native plant species as well as to develop alternative technologies to produce high-value bioactive compounds in microbial or yeast heterologous hosts by using synthetic biology approaches. In this regard, molecular biotechnological interventions and in vitro approaches offer attractive possibilities for metabolic engineering of plant secondary metabolites. However, the biogenesis of several important plant secondary metabolites, at the level of pathway steps and their regulation, is poorly understood. This can be attributed to the lack of functional genomics platforms comprising genome resources, mutants, and transformation systems for medicinal plants. Hence, refinement of tools and techniques to carry out comprehensive studies of medicinal plant secondary metabolism is imperative.

Herein, we provide a comprehensive review of the detailed studies carried out by the authors and other significant contributions collated from the available literature to elucidate different aspects of biosynthesis of steroidal lactone compounds known as withanolides from W. somnifera, a medicinal plant of immense repute. It also entails many important aspects related to enhancement of withanolide production in corroboration with molecular deciphering and in vitro approaches to understand the regulation of withanolide production.

#### *Withania somnifera*

Withania somnifera (Solanaceae) commonly known as ashwagandha or Indian ginseng, is a valued medicinal plant known since antiquity (∼3000 years; Winters, 2006). It is a commended genus described in the Indian Ayurvedic system of medicine and also enlisted as an important herb in Unani and Chinese traditional medicinal systems. It displays an efficient reproductive behavior of mixed mating which enables it to maximize benefits of both selfing as well as outcrossing. Its breeding behavior also guarantees wide chemotypic variability (Lattoo et al., 2007). W. somnifera is widely distributed around the world and is mainly adapted to xeric and drier regions of tropical and subtropical domains, ranging from the Canary Islands, the Mediterranean region and Northern Africa to Southwest Asia (Mirjalili et al., 2009).

Traditionally, W. somnifera is recommended to enhance physiological endurance, overall vitality, strength, and general health. It is frequently equated with Korean ginseng (Panax ginseng) for its restorative bioactivities. It is also used to counter impotency, chronic fatigue, weakness, bone weakness, dehydration, premature aging, muscle tension, and emaciation. All parts of the plant, including leaves, stem, flower, root, seeds, and bark, are used medicinally. The root of Withania is an important ingredient of more than 200 formulations in traditional systems of medicine like Ayurveda, Siddha, and Unani. These systems have been used for ages in the management of numerous physiological ailments (Sukanya et al., 2010). The bitter leaves of the plant have a characteristic odor and are used as an anthelmintic, and an infusion of the leaves is given for fever. In Ayurveda, berries and tender leaves are prescribed for external application to tumors, tubercular glands, carbuncles, and ulcers (Gauttam and Kalia, 2013).

# WITHANOLIDES

W. somnifera is known to synthesize a wide range of low molecular weight secondary metabolites, for example terpenoids, flavonoids, tannins, alkaloids, and resins. It has been extensively studied for its chemical constituents, which include compounds of diverse chemical structures, viz. withanolides, alkaloids, flavonoids, and tannins (Elsakka et al., 1990; Attaurrahman et al., 1991; Arshad Jamal et al., 1995; Choudhary et al., 2010). Of these, withanolides are credited with widely acclaimed curative properties. Withanolides, with as many as 40 reported structures, represent a collection of naturally occurring C-28 steroidal lactone triterpenoids built on an intact or rearranged ergostane structure, in which C-22 and C-26 are oxidized to form a six-membered lactone ring (Glotter, 1991; Ray and Gupta, 1994). The basic structure is designated the withanolide skeleton and is chemically named 22-hydroxy ergostane-26-oic acid 26,22-lactone (Misra et al., 2008). The withanolides are generally polyoxygenated and are believed to be produced via an enzyme system capable of catalyzing oxidation of all carbon atoms in a steroid nucleus (Kirson et al., 1971; **Figure 1**).

The characteristic feature of withanolides and ergostane-type steroids is a C-8 or C-9 side chain with either a six- or five-membered lactone or lactol ring. This lactone ring can be attached to the carbocyclic part of the molecule through a carbon-carbon bond or through an oxygen bridge (Kirson et al., 1971; Glotter, 1991). Various kinds of structural rearrangements involving oxygen substituents, such as bond scission, new bond formation, and ring aromatization, among others, can result in the formation of novel structural variants often described as modified withanolides or ergostane-type steroids (Misico et al., 2011). They are distributed in distinct amounts and ratios in fruits and vegetative parts of the plant (Sangwan et al., 2008). However, withanolides are mainly localized in leaves, and their concentration is generally low, ranging from 0.001 to 0.5% on a dry weight basis (DWB). Many factors such as growth rate and geographical and environmental conditions are known to modulate the content of withanolides (Dhar et al., 2013). Apart from Withania, withanolides also occur in other Solanaceous genera like Iochroma, Acnistus, Deprea, Datura, Lycium, Dunalia, Nicandra, Jaborosa, Physalis, Salpichroa, Tubocapsicum, Discopodium, Trechonaetes, and Witheringia. Moreover, their distribution is not restricted to Solanaceous plants: withanolides have been isolated from the Taccaceae, Leguminosae (Glotter, 1991), Dioscoreaceae (Kim et al., 2011), Myrtaceae, and Lamiaceae families, and there are also reports of isolation from marine organisms (Chao et al., 2011).

#### PHARMACOLOGY

During the last two decades, there has been a notable surge in pharmacological research on this plant, providing evidence for antitumor, anti-arthritic, anti-aging, and neuroprotective properties (Budhiraja and Sudhir, 1987; Ray and Gupta, 1994; Jayaprakasam et al., 2003; Choudhary et al., 2005; Kuboyama et al., 2005; Kaileh et al., 2007; Singh et al., 2011); positive influences on the endocrine, cardiopulmonary, and central nervous systems have also been reported (Mishra et al., 2000). Anabalagan and Sadique (1981) reported efficient anti-inflammatory activity of W. somnifera as compared to hydrocortisone. W. somnifera extracts have shown anti-inflammatory effects in a range of rheumatological conditions (al-Hindawi et al., 1992). W. somnifera preparations are reported to influence cholinergic and GABA-ergic neurotransmission (GABA: γ-amino-butyric acid), which is relevant to various central nervous system-related disorders (Kulkarni and George, 1996; Tohda et al., 2005). The active principles, sitoindosides VII–X and withaferin A (WS-3), significantly reduce lipid peroxidation and increase the levels of superoxide dismutase, catalase, glutathione peroxidase, and ascorbic acid, thus possessing free radical scavenging properties (Bhattacharya et al., 1997; Panda and Kar, 1997). Further, normal and cyclophosphamide-treated mice showed enhanced levels of interferon-γ, interleukin-2, and granulocyte macrophage colony stimulating factor, indicating immunopotentiating and myeloprotective effects (Davis and Kuttan, 1999). Toxicity studies have shown withanolides to be safe compounds with little or no associated toxicity (Mishra et al., 2000). Withanolides have been observed to enhance the ability of macrophages to "eat" pathogens (Davis and Kuttan, 2000). WS-3 protects against certain types of cancer by acting as an inhibitor of angiogenesis (Mohan et al., 2004). Other studies reveal that withanolides possess potent antimicrobial activity against pathogenic bacteria, including Salmonella, an organism associated with food poisoning (Owais et al., 2005). Withanoside VI and withanolide A (WS-1) facilitate the regeneration of axons and dendrites by reconstructing pre- and post-synapses in neurodegenerative diseases and by preventing pathogenesis and neuronal death. Treatment was also found to ameliorate memory deficits in mice (Kuboyama et al., 2005, 2006). Mechanistically, withanolides act as anti-inflammatory agents by inhibiting lymphocyte proliferation, the complement system, and delayed-type hypersensitivity (Rasool and Varalakshmi, 2006).

WS-3 is a promising agent for treating the inflammatory cascade of cardiovascular diseases, as it is a potent inhibitor of pro-inflammatory transcription factors (Kaileh et al., 2007). It exhibits in vivo anti-cancer activity against pancreatic cancer by inhibiting Hsp90 chaperone activity, and another withanolide isolated from roots acts as an effective agent protecting against UV-B-induced skin carcinoma (Mathur et al., 2004; Yu et al., 2010). A withanolide sulfoxide has been isolated and shown to inhibit the COX-2 enzyme in various tumor cell lines and to suppress their proliferation (Mulabagal et al., 2009). Withanolide D can be used with traditional chemotherapeutic agents, as it augments ceramide accumulation by activating neutral sphingomyelinase 2, modulates phosphorylation of JNK and p38MAPK, and induces apoptosis in both myeloid and lymphoid cells as well as in primary cells derived from leukemia patients (Mondal et al., 2010). WS-3 and withanolide D have been demonstrated to inhibit angiogenesis, Notch-1, and NFκB in cancer cells and to trigger apoptosis in breast cancer cells (Kaileh et al., 2007; Koduru et al., 2010; Hahm et al., 2011). Oral administration of withanolides and withanosides reversed behavioral deficits, plaque pathology, and accumulation of β-amyloid peptides (Aβ) in animal models of Alzheimer's disease through higher expression of low-density lipoprotein receptor-related protein in brain micro-vessels and of the Aβ-degrading protease neprilysin (Sehgal et al., 2012). Recently, WS-3 was found to activate the Cdc2 protein in prostate cancer cell lines, which results in cell cycle arrest and leads to cell death (Roy et al., 2013), and the root extract has been observed to have a neuroprotective effect against β-amyloid and HIV-1Ba-L (clade B) induced neuro-pathogenesis (Kurapati et al., 2013).
Other withanolides and alkaloids such as withasomnine, cuscohygrine, and anahygrine remain promising lead compounds for the development of new anti-inflammatory drugs (Mirjalili et al., 2009).

#### BIOSYNTHESIS OF WITHANOLIDES

Chemically, withanolides are 30-carbon compounds called triterpenoids. The triterpenoid backbone, like that of other terpenoid compounds, is biosynthesized by a metabolic pathway requiring isoprene units (isopentenyl pyrophosphate, IPP, and dimethylallyl pyrophosphate, DMAPP) as precursors. Therefore, isoprenogenesis could be one of the key upstream metabolic processes governing the flux of isoprene units for synthesis of the metabolic intermediate(s) of the triterpenoid pathway committed to withanolide biosynthesis (Bhat et al., 2012). Two independent pathways for isoprenoid precursor biosynthesis co-exist in the plant cell: the classical cytosolic mevalonic acid (MVA) pathway and the alternative plastidial methylerythritol phosphate (MEP) pathway (Newman and Chappell, 1999). The plastidial MEP pathway synthesizes the IPP and DMAPP required for production of photosynthesis-associated isoprenoids (carotenoids and side chains of chlorophylls, plastoquinones, and phylloquinones) and hormones (gibberellins and abscisic acid). In plants, MVA-derived isoprenoid end products comprise sterols (modulators of membrane architecture and of plant growth and developmental processes), brassinosteroids (steroid hormones), dolichol (involved in protein glycosylation), and the prenyl groups necessary for protein prenylation and cytokinin biosynthesis (Lichtenthaler, 1999; Rodriguez-Concepción et al., 2004). However, there is now growing evidence that considerable cross-talk exists between the two pathways of isoprenogenesis and that exchange of isoprene units may occur at different sub-cellular locations (Chaurasiya et al., 2012). A brief overview of the two metabolic routes for isoprene biosynthesis is given in the following sub-sections (**Figure 2**).

#### Mevalonate Pathway

The MVA pathway involves seven enzymes for the synthesis of the precursor molecules, i.e., IPP and DMAPP, for terpenoid biosynthesis. The first step involves the condensation of two molecules of acetyl-CoA into acetoacetyl (AcAc)-CoA by the enzyme AcAc-CoA thiolase (Vranová et al., 2013). In the second step, HMG-CoA synthase facilitates condensation of AcAc-CoA with one molecule of acetyl-CoA to form 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) (Nagegowda et al., 2004). Subsequently, HMG-CoA reductase (HMGR), a nicotinamide adenine dinucleotide (phosphate) [NAD(P)H]-dependent enzyme that catalyzes a double reduction reaction involving four electron transfers, produces mevalonate from HMG-CoA (Benveniste, 2002). Conversion of mevalonate to IPP encompasses two phosphorylations and a decarboxylation, catalyzed by mevalonate kinase, phosphomevalonate kinase, and mevalonate diphosphate decarboxylase, respectively. Further, IPP derived from the cytosolic MVA pathway is acted upon by isopentenyl diphosphate isomerase, a divalent metal ion-requiring enzyme, to form DMAPP (Hunter, 2007).
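The seven enzymatic steps described above can be written down as an ordered list, which makes it easy to check that each product feeds the next enzyme and to count how many acetyl-CoA molecules enter the pathway per IPP. The following Python sketch is only an illustration: the names `Step`, `MVA_PATHWAY`, `is_linear_chain`, and `acetyl_coa_consumed` are hypothetical, cofactors (ATP, NAD(P)H, CoA) are deliberately left out, and the intermediate names mevalonate 5-phosphate and mevalonate 5-diphosphate are supplied here from standard biochemistry rather than from the text.

```python
from typing import NamedTuple, Sequence, Tuple


class Step(NamedTuple):
    """One enzymatic step: enzyme name, carbon substrates, main product."""
    enzyme: str
    substrates: Tuple[str, ...]
    product: str


# Ordered MVA pathway steps as described in the text (cofactors omitted).
MVA_PATHWAY = [
    Step("acetoacetyl-CoA thiolase", ("acetyl-CoA", "acetyl-CoA"), "acetoacetyl-CoA"),
    Step("HMG-CoA synthase", ("acetoacetyl-CoA", "acetyl-CoA"), "HMG-CoA"),
    Step("HMG-CoA reductase", ("HMG-CoA",), "mevalonate"),
    Step("mevalonate kinase", ("mevalonate",), "mevalonate 5-phosphate"),
    Step("phosphomevalonate kinase", ("mevalonate 5-phosphate",), "mevalonate 5-diphosphate"),
    Step("mevalonate diphosphate decarboxylase", ("mevalonate 5-diphosphate",), "IPP"),
    Step("isopentenyl diphosphate isomerase", ("IPP",), "DMAPP"),
]


def is_linear_chain(steps: Sequence[Step]) -> bool:
    """True if every step's product is a substrate of the following step."""
    return all(a.product in b.substrates for a, b in zip(steps, steps[1:]))


def acetyl_coa_consumed(steps: Sequence[Step]) -> int:
    """Count acetyl-CoA molecules entering the pathway as substrates."""
    return sum(step.substrates.count("acetyl-CoA") for step in steps)


if __name__ == "__main__":
    for number, step in enumerate(MVA_PATHWAY, start=1):
        print(f"{number}. {' + '.join(step.substrates)} -> {step.product} [{step.enzyme}]")
    print("linear chain:", is_linear_chain(MVA_PATHWAY))
    print("acetyl-CoA per IPP:", acetyl_coa_consumed(MVA_PATHWAY))
```

Under these assumptions the list has seven entries, matching the enzyme count given in the text, and three acetyl-CoA molecules are consumed for each IPP formed.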

#### Methylerythritol Phosphate Pathway

The first step of the MEP pathway is catalyzed by 1-deoxy-D-xylulose 5-phosphate synthase (DXS), which converts the precursors pyruvate and glyceraldehyde 3-phosphate into 1-deoxy-D-xylulose 5-phosphate (DXP; Sprenger et al., 1997; Cordoba et al., 2009). DXP reductoisomerase transforms DXP into MEP, which is further converted to 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate (HMBPP) by the consecutive action of 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase, 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, and (E)-4-hydroxy-3-methylbut-2-enyl diphosphate synthase. The last step is the branching of HMBPP to IPP and DMAPP, catalyzed by a single enzyme, (E)-4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HDR). Although HDR in the MEP pathway produces both IPP and DMAPP, at an 85:15 ratio, the plastid-localized isopentenyl diphosphate isomerase is involved in substrate optimization by catalyzing IPP isomerization.

FIGURE 2 | An overview of the putative withanolide biosynthetic pathway. DXP, 1-deoxy-D-xylulose 5-phosphate; HMBDP, 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate; IPP, Isopentenyl pyrophosphate; DMPP, Dimethylallyl diphosphate; IPP isomerase, Isopentenyl pyrophosphate isomerase; FPPS, farnesyl diphosphate synthase; SQS, Squalene synthase; SQE/CPR, Squalene epoxidase/cytochrome P450 reductase; CAS, Cycloartenol synthase; SMT-1, Sterol methyl transferase/cytochrome P450 reductase; ODM/CPR, Obtusifoliol-14-demethylase/cytochrome P450 reductase. The first three highlighted (yellow) steps indicate involvement of P450 monooxygenases and CPR. Single dark arrows represent one step, two or more dark arrows represent multiple steps, and dashed arrows represent unknown steps.

The head-to-tail condensation of IPP units leads to the formation of farnesyl pyrophosphate (FPP). FPP is the main precursor for triterpenoids (Kuzuyama, 2002; Chaurasiya et al., 2012) and is synthesized by the catalytic action of farnesyl diphosphate synthase (FPPS). It serves as a substrate for the first committed reaction of several branched pathways leading to the synthesis of compounds that are essential for plant growth and development as well as of pharmaceutical interest (Newman and Chappell, 1999). The FPPS-catalyzed reaction occurs in two consecutive steps: condensation of IPP with DMAPP to form the 10-C intermediate geranyl diphosphate (GPP), and condensation of GPP with another molecule of IPP, which results in the 15-C FPP (Ohnuma et al., 1996). Squalene is believed to be a metabolic intermediate in the biosynthesis of diverse triterpenoids. It is produced by squalene synthase, which catalyzes the head-to-head condensation of two molecules of FPP in an NADPH-dependent manner. Squalene undergoes epoxidation at one of its terminal double bonds by squalene epoxidase, yielding squalene 2,3-epoxide. A ring closure reaction acting upon squalene 2,3-epoxide, catalyzed by cycloartenol synthase, leads to the biosynthesis of cycloartenol, which may be further converted into a variety of steroidal triterpenoid skeletons (Zhao et al., 2013; Dhar et al., 2014). 24-Methylenecholesterol, believed to be biosynthesized from cycloartenol, has been proposed to be a central intermediate in the metabolic route toward withanolide biosynthesis. The hydroxylation at C-22 and δ-lactonization between C-22 and C-26 of 24-methylenecholesterol are believed to be important reactions leading to withanolide biosynthesis (**Figure 2**). In addition, it has been suggested that the α,β-unsaturated ketone in ring A of common withanolides may be produced through sequential reactions. Understanding of the enzymes and genes involved in these downstream reactions and in withanolide biosynthesis remains scanty.
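The carbon bookkeeping behind the steps just described (C5 units to the 10-C GPP, the 15-C FPP, and then squalene) can be checked with a few lines of arithmetic. The Python sketch below is a minimal illustration under the assumption that no carbon is lost during the condensations (only diphosphate is released); the names `C5_UNITS`, `prenyl_chain_carbons`, and the constants are hypothetical and are not taken from the source.

```python
# Carbon atoms in the two C5 building blocks named in the text.
C5_UNITS = {"IPP": 5, "DMAPP": 5}


def prenyl_chain_carbons(n_ipp: int) -> int:
    """Carbons in a chain built from one DMAPP starter plus n_ipp IPP units.

    Assumes head-to-tail condensation releases diphosphate only, so every
    carbon of each C5 unit is retained in the growing chain.
    """
    if n_ipp < 0:
        raise ValueError("n_ipp must be non-negative")
    return C5_UNITS["DMAPP"] + n_ipp * C5_UNITS["IPP"]


# FPPS reaction in two steps: DMAPP + IPP -> GPP, then GPP + IPP -> FPP.
GPP_CARBONS = prenyl_chain_carbons(1)
FPP_CARBONS = prenyl_chain_carbons(2)

# Squalene synthase: head-to-head condensation of two FPP molecules.
SQUALENE_CARBONS = 2 * FPP_CARBONS

# Number of C5 units that end up in one squalene molecule.
C5_UNITS_PER_SQUALENE = SQUALENE_CARBONS // C5_UNITS["IPP"]

if __name__ == "__main__":
    print("GPP carbons:", GPP_CARBONS)
    print("FPP carbons:", FPP_CARBONS)
    print("squalene carbons:", SQUALENE_CARBONS)
    print("C5 units per squalene:", C5_UNITS_PER_SQUALENE)
```

The totals agree with the 10-C and 15-C intermediates named in the text and with the 30-carbon triterpenoid skeleton that squalene, squalene 2,3-epoxide, and cycloartenol share.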

#### *De novo* TISSUE-SPECIFIC WITHANOLIDE BIOSYNTHESIS

Phytochemical data generated over the years support considerable qualitative overlap of leaf and root withanolides in W. somnifera. The gradient of withanolide concentration, higher in leaves and lower in roots, together with radiotracer studies using 24-methylene cholesterol as a precursor, hints toward possible import of withanolides from leaves to roots. Nevertheless, investigations have revealed incorporation of 14C from [2–14C]-acetate and [U-14C]-glucose into WS-1 in in vitro cultured roots and native/orphan roots of W. somnifera. This study showed that these primary metabolites are integrated into WS-1, which indicates de novo synthesis of root-specific WS-1 from primary isoprenogenic precursors rather than import from leaves (Sangwan et al., 2007).

Another interesting study reveals the analogy in the qualitative and quantitative profiles of withanolide accumulation in leaf and root tissues of two morpho-chemovariants, suggesting de novo tissue-specific withanolide biosynthesis. Two genetic stocks designated WS-Y-08 (25–30 cm tall with yellow berries) and WS-R-06 (100–125 cm tall with red berries) showed appreciable variation in their competence to synthesize and accumulate different withanolides. Three major withanolides, viz. WS-1, withanone (WS-2), and WS-3, were assayed in leaf and root tissues harvested at five developmental stages. Additionally, transcript profiles of five withanolide biosynthetic pathway genes, namely squalene synthase (WsSQS; GenBank Accession Number GU474427), squalene epoxidase (WsSQE; GenBank Accession Number GU574803), cycloartenol synthase (WsOSC/CS; GenBank Accession Number HM037907), cytochrome P450 reductase 1 (WsCPR1; GenBank Accession Number HM036710), and cytochrome P450 reductase 2 (WsCPR2; GenBank Accession Number GU808569; **Table 1**), were examined in the harvested tissues. The aim was to compare gene expression with metabolite flux at different phenophases. Relative transcript abundance differed significantly between leaf and root tissues, mostly in parallel with the divergence in the withanolide pool. Leaves showed elevated gene expression in comparison to roots, consistent with their higher concentration of all three withanolides. These relative dynamics of all three withanolides at the quantitative and qualitative levels in the two withanolide-richest tissues possibly indicate de novo tissue-specific biosynthesis (Dhar et al., 2013).

#### GENE ELUCIDATION OF WITHANOLIDE BIOSYNTHETIC PATHWAY

Chemical synthesis of withanolides is complex owing to the stereo-chemical ring closure, the occurrence of chiral centers, rigid trans-lactone groups, and a high-energy epoxy ring (Neumann et al., 2008), which makes synthetic production economically unworkable because of minimal yields at high costs. Viable alternatives are therefore needed for the production of withanolides in large quantities for commercial exploitation. Genetic manipulation of genes encoding enzymes involved in withanolide biosynthesis seems to be a viable approach that may be useful in developing genotypes of W. somnifera with enhanced levels of withanolides, as well as withanolide-producing alternative microbial/yeast hosts. This approach entails expression of metabolic circuitries in heterologous host(s) and requires comprehensive knowledge of the complete genetic architecture of withanolide biosynthesis, including the enzymatic and regulatory genes of the pathway. Nevertheless, the withanolide biosynthetic pathway still remains putative at the molecular level. Hence, it is of fundamental research value to elucidate the withanolide biosynthetic pathway, including the regulatory aspects of withanolide biosynthesis. Exploration of the pathway has attracted much interest from researchers over the past few years. There have been several endeavors by many workers, including comprehensive investigations carried out by our group, related to the isolation, cloning, characterization, and regulation of several pathway genes and promoters, and to elicitation in corroboration with metabolite production and substrate pool diversion for enhanced withanolide yields. These different aspects are covered in the ensuing text.

#### 3-HYDROXY-3-METHYLGLUTARYL COENZYME A REDUCTASE

3-Hydroxy-3-methylglutaryl coenzyme A reductase (3-HMGR) is a NAD(P)H-dependent rate-limiting enzyme involved in the conversion of 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) into mevalonic acid, the chief precursor of IPP and DMAPP in the isoprenoid biosynthetic pathway. Plant HMGRs have been located in subcellular organelles such as the endoplasmic reticulum (ER), mitochondria, and plastids, with catalytic domains present in the cytosolic portion of the cell (Kim et al., 2015). Owing to its rate-limiting nature, this enzyme is a target of various cholesterol-lowering drugs such as statins (Istvan and Deisenhofer, 2001). It has been functionally characterized in both mammalian and plant systems. In plants, HMGR is encoded by a multigene family that exhibits varied temporal and spatial expression patterns. In Arabidopsis, HMGR isozymes are encoded by HMGR-1 and HMGR-2. Loss of function of HMGR-1 leads to a dwarf phenotype, sterility, and senescence, corresponding to diminished sterol biosynthesis (Kim et al., 2015). Considering its important role in isoprenoid biosynthesis, HMGR has been extensively studied and characterized in many plant species, including Catharanthus roseus, Ginkgo biloba, Taxus media, and Salvia miltiorrhiza (Akhtar et al., 2013).

#### TABLE 1 | Putative pathway genes from *Withania somnifera.*

Studies have shown a relationship between withanolide biosynthesis and HMGR-1 expression. A positive correlation between high transcript levels of HMGR and optimum accumulation of withanolides was found in the root tissue of W. somnifera (**Table 1**; Akhtar et al., 2013). This may be attributed to the enhanced biosynthesis of the substrate pool or precursors such as IPP and DMAPP for various biosynthetic pathways, including the withanolide biosynthetic pathway. In addition, reports of HMGR-1 mutants with diminished sterol content in A. thaliana, and of mevinolin-directed inhibition of HMGR leading to a significant decrease in total ginsenoside in P. ginseng adventitious roots (Kim et al., 2015), suggest a link between HMGR-1 expression and sterol biosynthesis. Positive elicitation, i.e., increased expression of WsHMGR in response to salicylic acid (SA) and methyl jasmonate (MJ), indicates the presence of cis-regulatory elements in the promoter region that may regulate the expression of WsHMGR in various biosynthetic pathways, including the withanolide biosynthetic pathway. Young leaves expressed higher levels of WsHMGR transcripts than mature leaves (Chaurasiya et al., 2007). These results correlated positively with the enhanced levels of withanolide production in young leaves relative to mature leaves of W. somnifera. HMGR has been demonstrated to be an accelerator of isoprenoid biosynthesis: tandem expression of WsHMGR and the PAC Beta gene gave a two-fold increase in the biosynthesis of β-carotene in E. coli (Akhtar et al., 2013). This suggests that HMGR provides an enhanced progenitor substrate pool for various biosynthetic pathways, including withanolide biosynthesis.

#### 1-DEOXY-D-XYLULOSE-5-PHOSPHATE REDUCTOISOMERASE AND 1-DEOXY-D-XYLULOSE-5-PHOSPHATE SYNTHASE

Plausibly, withanolides, the signature secondary metabolites of W. somnifera, are biosynthesized through metabolic deviation from the sterol pathway at the level of 24-methylene cholesterol (Sangwan et al., 2004). The isoprenoid precursors are synthesized by the MVA pathway and the MEP pathway, the latter being the plastid-derived alternative route for isoprenoid biosynthesis. The MEP pathway contributes about 30% to the biosynthesis of withanolide precursor isoprenoids (Tuli et al., 2009); its first step is the condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP), catalyzed by DXP synthase (DXS). DXP acts as a precursor for IPP and DMAPP biosynthesis (Julliard and Douce, 1991; Julliard, 1992; Himmeldirk et al., 1996). Subsequently, the conversion of DXP to MEP is catalyzed by DXP reductoisomerase (DXR). Although DXR catalyzes the first committed step of terpenoid biosynthesis through the MEP pathway, DXS, the first enzyme of this pathway, is also significant for isoprenoid biosynthesis in several organisms, including bacteria and plants (Estévez et al., 2001; Guevara-García et al., 2005). To understand the significance of the MEP pathway in isoprenoid biosynthesis in Withania, full-length cDNAs of WsDXS and WsDXR were cloned and characterized (**Table 1**). Spatial expression analysis revealed elevated levels of WsDXS and WsDXR transcripts in young leaves, which correlates with the reported higher rates of withanolide biosynthesis in young than in mature leaves of W. somnifera (Chaurasiya et al., 2007). The lower expression of WsDXS and WsDXR in roots than in leaves also hinted toward their plastid localization, and suggests that the root is a less active site for isoprenoid biosynthesis utilizing substrate from the MEP pathway. Although leaves and roots are both independently involved in withanogenesis, leaves are possibly the prime site.
Further, gene expression levels of WsDXR and WsDXS were compared with qualitative and quantitative withanolide variations in the chemotypes NMITLI-101, NMITLI-118, and NMITLI-135, which possess WS-3, WS-2, and withanolide D as the main withanolides in leaf tissue and withanolide A as the main withanolide in root tissue. WsDXS was expressed maximally in leaves of NMITLI-118 and NMITLI-135, which was concurrent with high accumulation of withanolides in these chemotypes. Conversely, expression of the WsDXR transcript was approximately equivalent in leaves of all three chemotypes, indicating that the enzymes contribute in a similar manner during isoprenoid biosynthesis in different chemotypes (Gupta et al., 2013a).

#### FARNESYL DIPHOSPHATE

The MVA and MEP pathways give rise to most of the bioactive molecules synthesized in Withania. In these biosynthetic routes, farnesyl diphosphate (FPP) acts as the substrate for the first committed reaction of numerous branched pathways and is synthesized by farnesyl diphosphate synthase (FPPS) in two successive steps. First, condensation of IPP with DMAPP forms the 10-C intermediate geranyl diphosphate (GPP). Further condensation of GPP with another molecule of IPP forms FPP. FPPS is an important enzyme of isoprenoid biosynthesis that synthesizes sesquiterpene precursors for vital metabolites including sterols, dolichols, ubiquinones, and carotenoids, in addition to substrates for farnesylation and geranylgeranylation of proteins. Overexpression of ginseng farnesyl diphosphate synthase in Centella asiatica hairy roots also enhanced phytosterol and triterpene biosynthesis (Kim et al., 2010). FPPS also plays an important role in the early steps of triterpenoid precursor production related to withanolide biosynthesis. This highlights the significance of FPPS in any pathway engineering attempt to enhance a desired isoprenoid of primary or secondary importance. FPPS has been characterized from a range of plant species such as Arabidopsis (Closa et al., 2010), Artemisia (Matsushita et al., 1996), Hevea (Takaya et al., 2003), and maize (Cervantes-Cervantes et al., 2006). As a step toward elucidating the significance of FPPS as the key entry point enzyme of withanolide biosynthesis in W. somnifera, a full-length FPPS cDNA was isolated and characterized, as it constitutes a key step en route to the progenitor(s) of withanolide biosynthesis (**Table 1**). The significance of the WsFPPS gene in the synthesis of sesqui- and higher isoprenoids, including the metabolites derived from them, was shown by the constitutive expression of WsFPPS in all parts of the plant.
The higher expression level of WsFPPS in young leaves compared with mature leaves corroborated the reported enhanced withanolide biosynthesis in young leaves of W. somnifera (Chaurasiya et al., 2007; Gupta et al., 2011).

#### SQUALENE SYNTHASE

Squalene synthase (SQS) (farnesyl diphosphate: farnesyl diphosphate farnesyl transferase, EC 2.5.1.21) catalyzes one of the initial enzymatic steps of the phytosterol biosynthetic pathway, facilitating condensation of two farnesyl pyrophosphate molecules to squalene. SQS routes carbon flux from the isoprenoid pathway toward phytosterol biosynthesis, resulting in the formation of end products such as brassinosteroids, withanolides, and triterpenoids (Abe et al., 1993). SQS has been reported to be active in the ER, to which it is anchored via its carboxy-terminal portion. The cytosolic portion is anchored via the amino terminus of the protein (Robinson et al., 1993). SQS plays a key regulatory function in phytosterol biosynthesis. Overexpression of SQS genes in P. ginseng (Lee et al., 2004) and Eleutherococcus senticosus (Seo et al., 2005) led to improved accumulation of phytosterols and triterpenes, highlighting the significant regulatory function of SQS in plants. Although ample evidence is available regarding the role of SQS in phytosterol biosynthesis, little is known about the biosynthetic pathway of withanolides, the genes involved in withanolide biosynthesis, and the regulatory elements of the promoter regions governing gene expression in W. somnifera. Thus, with a view to pathway intensification leading to enhanced withanolide accumulation in W. somnifera, Bhat et al. (2012) investigated the significance of squalene synthase in withanolide biosynthesis. Characterization of WsSQS, including tissue-specific expression analysis and regulatory studies at the promoter level, substantiated WsSQS as an important gene target involved in withanolide biosynthesis (**Table 1**). WsSQS showed increased expression in leaves, in consonance with the elevated production of withanolides in leaves of W. somnifera.
Additionally, the biosynthesis of withanolides and the mRNA abundance of WsSQS were enhanced by diverse signaling molecules, including methyl jasmonate, salicylic acid, and 2,4-D, which was consistent with the results expected from analysis of the WsSQS promoter. These findings hint toward the unraveling of a key committed step of the withanolide biosynthetic pathway (Bhat et al., 2012).

#### SQUALENE EPOXIDASE

Squalene epoxidase (SE, also abbreviated SQE) (EC 1.14.99.7) is a rate-limiting enzyme in the sterol biosynthetic pathway, catalyzing the conversion of squalene into 2,3-oxidosqualene by a stereospecific epoxidation reaction (Ryder, 1992; He et al., 2008). This enzyme requires the cytosolic (S105) fraction, molecular oxygen, NADPH-cytochrome c reductase, NADPH, and flavin adenine dinucleotide (FAD) for its activity (Abe et al., 2007). SQE, a flavoprotein with loosely bound FAD, obtains electrons from NADPH-cytochrome reductase instead of binding the nicotinamide cofactor directly, which differentiates it from other flavin mono-oxygenases. It is mainly located in the endoplasmic reticulum and lipid droplets, but only the protein located in the endoplasmic reticulum is active (Leber et al., 1998). Additionally, SQE activity can result in the formation of 6,7-oxidosqualene, 10,11-oxidosqualene, and dioxidosqualene (Bai and Prestwich, 1992). The product of SQE, 2,3-oxidosqualene, is the main precursor of all identified angiosperm cyclic triterpenoids, which comprise membrane sterols, non-steroidal triterpenoids, and brassinosteroid phytohormones. Being a rate-limiting enzyme, SQE has a cascading influence on the upregulation of downstream genes (Han et al., 2010). Thus, genetic manipulation of SQE in a host plant offers an exciting prospect for the production of desired therapeutic triterpenoid molecules (Takemura et al., 2010). This has been demonstrated in P. ginseng for heightened synthesis of triterpene saponins and phytosterols using squalene synthase (Lee et al., 2004). Against this backdrop, Razdan et al. (2013) sought to understand the regulatory function of SE in withanolide biosynthesis. Toward this goal, the WsSQE gene along with its promoter was isolated from W. somnifera, and several cis-regulatory elements of the promoter region were revealed (**Table 1**).
This paves the way to understanding the regulatory function of SQE in withanolide biosynthesis, as WsSQE also displayed maximum expression in leaf, the withanolide-richest tissue. With a view to pathway intensification, the significance of WsSQE as a robust target lies in its rate-limiting nature (Razdan et al., 2013). This can be exploited with an efficient Agrobacterium-mediated transformation system in W. somnifera for homologous modulation of withanolide biosynthesis.

#### CYCLOARTENOL SYNTHASE

Withanolides are synthesized via both the MVA and MEP pathways, which direct the flux of isoprene (C5) units toward the synthesis of triterpenoid pathway intermediates that are further committed to withanolide biosynthesis. Sterols, withanolides, and various triterpenoids are synthesized from a common 30-carbon intermediate, 2,3-oxidosqualene, in a highly regio- and stereo-specific step catalyzed by enzymes encoded by a family of genes called oxidosqualene cyclases (OSCs) (Phillips et al., 2006). Plants produce a variety of triterpenoid skeletons formed by numerous OSC enzymes, which broadly belong to two groups based on the nature of their supposed catalytic intermediates, i.e., the protosteryl and dammarenyl cations (Phillips et al., 2006). As backbones, these two cations impart discrete stereochemistry and ring configurations to the various triterpenes. The protosteryl cation, with a chair-boat-chair (C-B-C) configuration, forms the cycloartenol, lanosterol, cucurbitadienol, and parkeol tetracyclic triterpene structures. The majority of the pentacyclic triterpenes, however, are derived from the dammarenyl cation by D-ring expansion to form lupeol or further E-ring expansion to form β-amyrin (Xu et al., 2004). Similarly, partitioning of the common substrate 2,3-oxidosqualene in W. somnifera takes place between the OSC cycloartenol synthase [(S)-2,3-epoxysqualene mutase (cyclizing, cycloartenol forming), EC 5.4.99.8] (CAS) and other OSCs. CAS forms cycloartenol, a pentacyclic triterpene with nine chiral centers that functions as the precursor to phytosterols and apparently to withanolides, while other OSCs form diverse triterpenoids such as lupeol and β-amyrin (Rees et al., 1969). This partitioning constitutes a metabolic branching point leading to the division of 2,3-oxidosqualene between sterols/withanolides and a range of triterpenoids (**Figure 1**). This makes the genes covering the branches of this sub-dividing point prospective candidates for perturbation.
Such manipulations hold significant potential to alter the flux through the respective branches by redirecting the precursor reservoir toward the preferred secondary compound while concurrently reducing the flux through competing biosynthetic routes. Against this backdrop, three members of the OSC superfamily in W. somnifera, viz. β-amyrin synthase (WsOSC/BS; GenBank Accession Number JQ728553), lupeol synthase (WsOSC/LS; GenBank Accession Number JQ728552), and WsOSC/CS, covering three branches of a subdividing junction leading to withanolides, sterols, and a suite of triterpenoids, have been characterized (**Table 1**). Regulatory studies of WsOSCs involving elicitation with plant-derived methyl jasmonate and gibberellic acid and with microbe-derived yeast extract displayed differential transcriptional and translational profiles that were clearly reflected in variations in withanolide quantity. MJ elicitation considerably augmented WS-3 accumulation over a period of 48 h, in consonance with studies in which MJ-induced up-regulation of WsSQS, WsSQE, and WsCPR2 mRNA also led to enhanced withanolide accumulation. This may be attributed to increased synthesis of 2,3-oxidosqualene by the induced upstream genes. As a consequence, WsOSC/CS is able to utilize an increased precursor pool for withanolide biosynthesis. Although the OSC mRNA expression pattern under gibberellic acid (GA3) treatment coincided with that under MJ treatment, total withanolide accumulation showed a steady drop over the time course. This may be attributed mainly to the decrease in WsOSC/CS protein concentration, as evident from the Western blot study. Nevertheless, transcript abundance of WsOSC/BS showed a rise that hinted toward a decrease in the total substrate availability for WsOSC/CS, but at the protein level WsOSC/BS expression declined with increasing time intervals, thus possibly substantiating that the drop in WS-3 concentration was caused by decreased WsOSC/CS protein availability.
Interestingly, the microbe-derived exogenous yeast extract (YE) elicitor acted as a negative regulator of the two OSCs that compete with WsOSC/CS (WsOSC/BS and WsOSC/LS) at both the protein and mRNA levels, whereas WsOSC/CS showed no change in its transcript or protein expression in response to YE. However, there was a significant increase in withanolide concentration with YE in comparison with MJ treatment. The down-regulation of WsOSC/BS and WsOSC/LS is possibly indicative of differential channeling of the common substrate among the three branch OSCs. Plausibly, this leads to a rearrangement of metabolic fluxes wherein the bulk of the 2,3-oxidosqualene substrate pool shifts toward WsOSC/CS, leading to much improved withanolide yields. The characterization and validation of WsOSCs therefore seem important for devising strategies for enhanced production of withanolides (Dhar et al., 2014).
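The reasoning above, that down-regulating the two competing OSCs can shift the shared 2,3-oxidosqualene pool toward WsOSC/CS even though WsOSC/CS itself is unchanged, can be illustrated with a toy calculation. The Python sketch below assumes, purely for illustration, that the substrate pool is split in proportion to a relative capacity assigned to each enzyme. This proportional rule, the function name `partition_flux`, and all numerical values are hypothetical and are not measurements or a model from the cited studies.

```python
from typing import Dict


def partition_flux(capacities: Dict[str, float]) -> Dict[str, float]:
    """Split a shared substrate pool in proportion to relative enzyme capacity.

    Toy assumption: each branch receives capacity_i / sum(capacities).
    Real partitioning also depends on kinetics and compartmentation.
    """
    total = sum(capacities.values())
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return {name: value / total for name, value in capacities.items()}


# Arbitrary illustrative capacities (NOT experimental data).
hypothetical_baseline = {"WsOSC/CS": 1.0, "WsOSC/BS": 1.0, "WsOSC/LS": 1.0}

# Same WsOSC/CS capacity, with both competitors halved to mimic the
# qualitative down-regulation described for yeast extract elicitation.
hypothetical_after_ye = {"WsOSC/CS": 1.0, "WsOSC/BS": 0.5, "WsOSC/LS": 0.5}

baseline_share = partition_flux(hypothetical_baseline)["WsOSC/CS"]
after_ye_share = partition_flux(hypothetical_after_ye)["WsOSC/CS"]

if __name__ == "__main__":
    print(f"WsOSC/CS share at baseline: {baseline_share:.2f}")
    print(f"WsOSC/CS share after YE:    {after_ye_share:.2f}")
```

With these placeholder numbers the share of substrate reaching the cycloartenol branch rises from one third to one half, although the capacity assigned to WsOSC/CS never changes. The sketch only makes the qualitative argument concrete; it says nothing about the actual magnitude of the flux shift in W. somnifera.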

#### CYTOCHROME P450 REDUCTASE AND MONOOXYGENASES

Cytochrome P450 enzymes are members of one of the most functionally diverse protein super-families and are essential in a variety of metabolic circuitries. P450s are heme-thiolate proteins that catalyze enormously varied reactions such as hydroxylations, dealkylations, sulfoxidations, epoxidations, reductive dehalogenations, peroxidations, and different types of isomerization in the synthesis of a number of primary and secondary metabolites indispensable for plant growth and development (Guengerich, 2001a; Hrycay and Bandiera, 2012). P450 monooxygenases comprise a substrate-specific class of enzymes that are highly regio- and stereo-specific. Gene annotation has shown that about 1% of all genes in a plant genome encode cytochrome P450s. The Arabidopsis genome contains 244 genes and 28 pseudogenes representing cytochrome P450s (Nelson et al., 2004). P450s are ER-localized, and their catalytic function depends on a supply of electrons through NADPH cytochrome P450 reductase (CPR, a diflavoenzyme).

CPRs (EC 1.6.2.4) are membrane-bound proteins localized to the ER that possess an N-terminal flavin mononucleotide (FMN)-binding domain linked to the NADPH-binding domain via a flavin adenine dinucleotide (FAD)-binding domain. They shuttle electrons obtained from NADPH through the FAD and FMN domains to the heme iron center of the various P450 enzymes. CPR genes have been isolated from numerous species of yeast, animals and insects; in all of these, only one form has been identified that interacts with several P450s (Simmons et al., 1985). By contrast, the number of CPR paralogs varies among vascular plant species. Ro et al. (2002) categorized CPRs into class I and class II on the basis of N-terminal anchoring sequences. CPR1, belonging to class I, is expressed constitutively, whereas CPR2, in class II, is expressed under stress or upon wounding. That plants encode several CPRs reflects the range of P450s (Werck-Reichhart et al., 2000; Feldmann, 2001) and the role of CPRs primarily in meeting the elevated requirement for electrons during various stresses, or their varied expression at different plant developmental stages (Mizutani and Ohta, 2010).

Additionally, P450 monooxygenases constitute a highly regio- and stereo-specific class of substrate-specific enzymes that play a decisive role in secondary metabolism and mostly aid in functionalizing the core structures of molecules such as withanolides. Owing to their regio- and stereo-specific catalytic versatility, they are potential targets for industrial biocatalysis. P450s have been useful in industry for the examination of new medicines, drugs or xenobiotics (Guengerich, 2002, 2011; Miners, 2002). P450s are considered the most versatile biological catalysts in nature owing to the notable diversity of chemical reactions catalyzed and the vast range of substrates attacked (Sono et al., 1996; Guengerich, 2001b; Coon, 2005), which accentuates the importance of their identification and characterization for biosynthetic pathway elucidation. Owing to the polyphyletic nature of plant P450s, they are commonly categorized into two main clades, A-type and non-A-type. Plant-specific metabolism and the biosynthesis of diverse natural products involve A-type P450s (Bak et al., 2011). The molecular and biochemical characteristics of cytochrome P450 reductases and monooxygenases in relation to pathway engineering emphasize the significance of this family of genes as robust targets in both primary and secondary metabolite biosynthesis.

Molecular and biochemical studies have started to reveal that the biosynthetic routes for several withanolides in W. somnifera involve this important family of genes. Two A-type P450s, WsCYP98A (GenBank Accession Number HM585369) and WsCYP76A (GenBank Accession Number KC008573), and two paralogs of cytochrome P450 reductase from W. somnifera have been isolated, sequenced and heterologously expressed in E. coli (Rana et al., 2013, 2014) (**Table 1**). All four genes are spatially regulated at the transcript level, displaying variance in tissue specificity. Expression of WsCPR2 coincides with the elevated withanolide content in W. somnifera leaves. This probably indicates that WsCPR2 is involved in meeting the increased reductive demand of the diverse P450 monooxygenases that carry out withanolide biosynthesis. Elicitation studies showed that exogenous elicitors act as both positive and negative regulators of the mRNA transcripts of Ws monooxygenases and reductases. MJ and SA resulted in abundant WsCYP98A and WsCYP76A expression. The increased mRNA levels agreed with the high, elicitation-driven accumulation of withanolides, with appreciable enhancement of WS-1 and WS-3 in response to elicitation. In MJ-treated samples, there was a 2.5-fold increase in WS-1 and a significant 4.2-fold enhancement of WS-3. SA-treated samples showed marked increases of 1.7- and 3.2-fold in WS-1 and WS-3, respectively. Conversely, GA<sub>3</sub> decreased the expression of WsCYP98A as well as WsCYP76A, accompanied by a gradual decline in WS-3, whereas WS-1 increased up to 24 h and then decreased after 48 h (Rana et al., 2014). MJ and SA elicitation induced only the WsCPR2 reductase, whereas WsCPR1 expression showed no change, along with a significant increase in WS-1 and WS-3 (Rana et al., 2013).

Additionally, four more CYP genes from W. somnifera, named WSCYP93Id, WSCYP93Sm, WSCYP734B, and WSCYP734 and belonging to the CYP83B1 and CYP734A1 families and the CYP71 and CYP72 clans, also displayed variance in expression in different tissues and in response to different treatments. Interestingly, all four CYPs showed maximal expression in leaf tissue, coinciding with the tissue-specific secondary metabolite profile of W. somnifera. Furthermore, the similar expression profiles of WSCYP93Id, WSCYP93Sm, WSCYP734B and WSCYP734 in different W. somnifera chemotypes hinted at a specialized role in the biosynthesis of chemotype-specific metabolites. Light and auxin enhanced the expression of all four CYPs. However, WSCYP734B displayed predominant responsiveness to light and auxin, suggesting its association with withasteroid/brassinosteroid regulation in planta. Under MJ and SA elicitation, the mRNA abundance of the CYPs increased with increasing elicitor concentration. Functional validation of WSCYP93Id in E. coli using withanolides as substrates revealed conversion of withaferin A to a hydroxylated product. A relationship chart drawn for the WSCYP93Id, WSCYP93Sm, WSCYP734B, and WSCYP734 sequences suggested that withanolides such as withanolide A, withanolide D, withaferin A, and withanone are possibly the products of metabolic conversions involving downstream biosynthetic genes, mediated by WSCYP enzymes (Srivastava et al., 2015).

Taken together, these results give a better understanding of the regulatory role of CPRs for increasing the production of withanolides by means of an Agrobacterium-mediated transformation system. This can lead to homologous intensification of overall metabolite flux, with higher transcript levels of key regulatory genes correspondingly up-regulating downstream genes.

# GLUCOSYLTRANSFERASES

Glycosylation is a common modification reaction in plant metabolism and is consistently associated with secondary metabolism. Enzymes leading to glucoside formation are called uridine diphosphate glycosyltransferases (UGTs). They are members of family 1 of the glycosyltransferase superfamily, which contains over 80 families of enzymes (Campbell et al., 1997; Coutinho et al., 2003), and they function by transferring a uridine diphosphate (UDP)-activated glucose to a corresponding acceptor molecule. UGTs use UDP-activated sugars as donors and transfer the sugar moiety to many acceptors. Plant family 1 UGTs catalyse the glycosylation of a wide range of bioactive natural compounds. This is frequently the concluding step in the biosynthesis of various natural products (Jones and Vogt, 2001), improving their stability and solubility and facilitating their storage and accumulation in plant cells. Over the years, many UGT gene sub-families have evolved for molecular glycosylation (Vogt and Jones, 2000; Jones and Vogt, 2001). UGTs functioning in secondary metabolism carry a conserved motif of 44 amino acid residues (60–80% identity) called the plant secondary product glucosyltransferase (PSPG) box, which has been shown to include the UDP-sugar binding moiety (Hughes and Hughes, 1994; Offen et al., 2006). Nevertheless, UGTs show comparatively low levels of sequence identity, particularly in the regions associated with acceptor binding. This might be important for the recognition of many acceptors and the synthesis of a huge number of products.

Many of the pharmacological properties of W. somnifera are attributed to its distinctive steroidal compounds, called glycowithanolides (Matsuda et al., 2001; Singh et al., 2001; Misra et al., 2005; Lal et al., 2006). However, information on the metabolic step(s) leading to their glyco-transformations is scarce owing to the non-availability of the relevant enzymes and genes.

The first study on sterol glucosyltransferases (SGTs) from W. somnifera reported three different SGTs, SGTL1, SGTL2, and SGTL3, comprising conserved SGT family domains (**Table 1**). Among these, SGTL1 was cloned in full length (DQ356887) and was found to be ubiquitously expressed in different parts of the plant. The deduced amino acid sequence of SGTL1 indicated the presence of transmembrane domains and a preference for membrane sterol glucosylation. Moreover, partially purified recombinant SGT displayed specificity for sterols with a hydroxyl group at the C-3 position. Functional recruitment of SGTL1 in response to environmental stress has also been reported (Sharma et al., 2007). Further characterization of new sterol glucosyltransferases in W. somnifera can help to reveal the functions of various glycosterols in plants.

#### DIMINUTO/DWARF1 (DIM/DWF1)

DWF1 is an important gene of sterol biosynthesis and regulates metabolic flux by carrying out isomerization, reduction, and epoxidation of its immediate metabolic precursors. DWF1 codes for a key enzyme involved in the isomerization and reduction of 24-methylene cholesterol to campesterol and of isofucosterol to sitosterol (Klahre et al., 1998). GFP-DIM-DWF1 transient expression studies have shown that DWF1 is an integral membrane protein that is localized to the ER and is not present in the nucleus. Sequence homology with the flavin adenine dinucleotide-binding domain (conserved in oxidoreductases) indicates that DWF1 possibly has a catalytic rather than a regulatory function. Mutant studies have established the functional role of the DIM protein in plant phytosterol biosynthesis. Homologs of DWF1 have been characterized from Arabidopsis thaliana, Pisum sativum, Zea mays, etc., and their function has been elucidated by mutant studies.

In one report, the Arabidopsis DIMINUTO/DWARF1 gene was found to encode a protein involved in phytosterol biosynthesis. Analysis of the A. thaliana dim mutant revealed that the dwarf phenotype with decreased fertility can be rescued by the addition of exogenous brassinolide. 24-Methylene cholesterol was found to accumulate in dim mutants. All of these mutants were deficient in campesterol, indicating that mutation in DIM/DWF1 severely affects phytosterol biosynthesis in plants (Klahre et al., 1998).

Studies of site-directed mutants have revealed complete loss of DWF1 function in plants owing to loss of calmodulin binding; similarly, complementation studies revealed that partial loss of calmodulin binding led to a partial dwarf phenotype (Du et al., 2009).

DWF1 gene homologs have been isolated from pea and maize, and their function has been elucidated by mutant studies. The mutants accumulate 24-methylene cholesterol and are severely dwarfed with reduced fertility. Biochemical and mutant studies have revealed the function of the DIM homolog in pea, known as LKB. Mutation in this gene correlated directly with an altered phenotype comprising decreased enzyme activity, truncated internodal length, epinastic leaves, thickened stems, and accumulation of 24-methylene cholesterol. Upon exogenous application of brassinolide, the phenotype reverted to normal. Northern analysis showed ubiquitous expression of the LKB gene in the plant. The role of DWF1 in the brassinolide biosynthetic pathway is well established. Analysis of dim mutants has revealed impaired biosynthesis of campesterol and the consequent expression of a dwarf phenotype. 24-Methylene cholesterol is a shared precursor of both the withanolide and brassinolide biosynthetic pathways (Choe et al., 1999).

DWF1 is a multifunctional enzyme that may carry out isomerization and reduction of intermediates downstream of 24-methylene cholesterol on the route to withanolides, similar to its role in brassinolide biosynthesis. The increase in DWF1 transcript levels in response to MJ, 2,4-dichlorophenoxyacetic acid (2,4-D) and SA, and the corresponding increase in withanolide content, point to the involvement of DWF1 in withanolide biosynthesis along with other biosynthetic pathway genes. The increase in withanolide content may be due to an increase in the 24-methylene cholesterol substrate pool, which may in turn be a consequence of increased expression of upstream genes of the MVA pathway such as WsSQS, WsSQE, WsOSC/CS, WsOSC/LS, WsOSC/BS, WsCPR1, and WsCPR2 upon elicitation with abiotic factors such as MJ, 2,4-D, and SA. The presence of inducible cis-regulatory elements in the promoter regions of genes involved in the withanolide biosynthetic pathway may underlie the effect of the elicitors.

The highest DWF1 transcript levels were found in leaves compared with root and stalk tissues, which corroborates the high levels of withanolide accumulation in leaves reported in earlier studies. It also supports the de novo synthesis of withanolides, i.e., synthesis in various parts of the plant via the complete metabolic pathway rather than transfer from other plant tissues. The molecular cloning and characterization of DIM (GenBank Accession Number KP318739) from W. somnifera, together with the transcript profiling data, is one of the recent studies carried out by the authors.

## TISSUE-SPECIFIC TRANSCRIPTOME ANALYSIS

More recently, for the identification of putative biosynthetic pathway genes, there has been a marked shift toward "omics" approaches that take advantage of sequencing technologies to obtain genomic and transcriptomic sequence resources. For species for which genomic data are unavailable, transcriptome sequencing combined with differential expression studies is considered a principal way of discovering novel genes in non-model organisms. Against this backdrop, to aid the basic understanding of withanolide biogenesis, transcriptome sequencing of Withania leaf (101L) and root (101R), which synthesize WS-3 and WS-1, respectively, has been reported (Gupta et al., 2013b). Pyrosequencing yielded 834,068 and 721,755 reads, assembled into 89,548 and 114,814 unique sequences, from 101L and 101R, respectively. These are presumed to be involved in the synthesis of tissue-specific withanolides. Annotation revealed all the genes involved in triterpenoid backbone biosynthesis, incorporating the MVA and MEP pathways up to 24-methylene cholesterol, the apparent precursor for withanogenesis. Biosynthesis of 24-methylene cholesterol is followed by various secondary conversions, including transfer of diverse moieties or oxidation/reduction reactions, that shape the tissue-specific withanolides (Chaurasiya et al., 2012). Using Gene Ontology and KEGG analyses, members of the cytochrome P450, glycosyltransferase, and methyltransferase gene families with restricted presence or differential expression in leaf and root have been reported. Quantification of reads for specific contigs showed 305 contigs encoding CYP450s, comprising 12 and 36 unique CYPs for leaf and root tissues, that may be responsible for the tissue-specific differences in activities, including withanolide biosynthesis. The unigene resource generated in this study may also be of immense value for interpreting the withanolide biosynthetic pathway and for investigating the tissue-specific molecular mechanisms that underlie the formation of particular withanolides (Gupta et al., 2013b).
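As a rough way to compare the two libraries, the read and unique-sequence counts quoted above can be turned into an average number of reads per unique sequence. The sketch below is only illustrative: it assumes the quoted totals are directly comparable, treats every unique sequence alike (contigs and any unassembled single reads), and the function and variable names are hypothetical rather than taken from the original study.

```python
# Read and unique-sequence counts as quoted in the text (Gupta et al., 2013b).
LIBRARIES = {
    "101L (leaf)": {"reads": 834_068, "unique_sequences": 89_548},
    "101R (root)": {"reads": 721_755, "unique_sequences": 114_814},
}


def reads_per_sequence(reads: int, unique_sequences: int) -> float:
    """Average number of reads behind each assembled unique sequence."""
    return reads / unique_sequences


for name, counts in LIBRARIES.items():
    ratio = reads_per_sequence(counts["reads"], counts["unique_sequences"])
    print(f"{name}: {ratio:.2f} reads per unique sequence")
```

A lower ratio for the root library would be consistent with a more fragmented assembly or a more diverse transcript population, but this crude average cannot distinguish between the two.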

#### COMPARATIVE PROTEOME ANALYSIS

A proteomic approach based on two-dimensional electrophoresis (2-DE) and mass spectrometry (MS) offers a means of identifying known and unknown genes owing to its ability to investigate hundreds of proteins simultaneously (Singh et al., 2015). Such proteomic analysis could contribute considerably to understanding the complex metabolic networks of withanolide biosynthesis in W. somnifera and be a significant addition to the genomic knowledge resource. As a step forward, comprehensive 2-DE and MS analysis of in vitro-grown adventitious roots and in vivo root samples of W. somnifera was conducted. The study showed high similarity in the protein spots of in vitro and in vivo root samples, suggesting that in vitro roots may follow a developmental route analogous to that of in vivo roots even though they develop independently of shoot organs (Senthil et al., 2013). Proteome examination of leaf and seed tissues of W. somnifera differentiated the proteomes on the basis of differential expression, count, and function of the identified and characterized tissue-specific proteins. Comparison of the two tissues further indicated that several proteins belonged to common housekeeping pathways, while a few were tissue specific and associated with a definite metabolic complement (Dhar et al., 2012). Further studies on low-abundance and poorly soluble proteins would help in characterizing unknown pathway genes responsible for the production of withanolides.

Characterization of genes, high-throughput metabolic profiling, sequence resources and proteome information not only provide insight into the withanolide biosynthetic pathway but also offer molecular tools for the biotechnological improvement of W. somnifera. However, in-depth knowledge of withanogenesis remains elusive, and these results taken together could be useful for revealing the underlying signal transduction pathways and for identifying specific transcription factors in addition to uncharacterized downstream pathway genes. Such biosynthetic genes, along with the transcription factors, can become prospective targets for pathway engineering.

# TISSUE CULTURE APPROACHES FOR WITHANOLIDE PRODUCTION

The immense therapeutic value of W. somnifera invites exploration of all possible approaches, covering both recombinant DNA techniques and in vitro methods, for obtaining chemotypes with the desired enhanced chemoprofiles. Recombinant DNA or cell fusion techniques are viable alternatives but are hampered by the lack of genetic and biochemical knowledge regarding the biosynthesis of secondary metabolites (Evans and Sharp, 1986), which calls for immediate biotechnological advances to enhance yield in less time. In the search for alternatives, in vitro techniques present a feasible option for the production of these therapeutically valuable compounds. Tissue culture techniques, used for the large-scale culture of plant cells from which secondary metabolites are extracted, deliver a continuous, consistent, and renewable source of valuable plant pharmaceuticals. Substantial work has been reported using different types of in vitro strategies for W. somnifera, with the main emphasis on manipulation of plant growth regulator adjuvants and culture conditions for withanolide accumulation. The prospects for applying in vitro methods to produce cell/organ/root cultures for enhanced withanolide production are reviewed below.

# CELL SUSPENSION CULTURE

Several withanolides account for only a very minor percentage of the total withanolide content of the native plant. Investigating cultures with adjusted conditions for efficient in vitro production of such pharmacologically promising withanolides is therefore important. Production of withanolide D, WS-1, WS-3, and WS-2 has been described in organogenic cultures (Roja et al., 1991; Banerjee et al., 1994; Ray et al., 1996; Vitali et al., 1996; Ray and Jha, 1999; Furmanowa et al., 2001; Sangwan et al., 2007; Murthy et al., 2008). There are also several reports of accumulation of withanolide D and WS-3 in transformed root/shooty teratoma cultures; however, WS-1 was reported to be absent in these cultures (Banerjee et al., 1994; Ray and Jha, 1999). Cell suspension cultures of W. somnifera have been successfully established for the biogenesis of WS-1 and optimized for its enhanced accumulation.

The highest WS-1 content (1.27 mg g<sup>−1</sup> DWB) was observed in suspension cultures supplemented with 2.0 mg L<sup>−1</sup> 2,4-D, and 2.0 mg L<sup>−1</sup> 2,4-D + 0.5 mg L<sup>−1</sup> kinetin (KN), revealing that the combination of 2,4-D with kinetin is the most suitable for enhanced WS-1 production. It has also been reported that a combination of 1.0 ppm benzylaminopurine plus 0.5 ppm kinetin gave the highest accumulation of WS-1 (14.3 mg per 100 g fresh weight and 238 mg per 100 g DWB, i.e., 0.24%) in shoot cultures of W. somnifera (**Table 2**; Sangwan et al., 2007). A growth kinetics study of W. somnifera cell suspension cultures revealed maximum accumulation of biomass (11.02 g L<sup>−1</sup> DWB) and withanolide A (2.03 mg g<sup>−1</sup> DWB) at the end of the fourth week, making it clear that biomass growth closely parallels WS-1 accumulation. An inoculum density of 10 g L<sup>−1</sup> was found to be the most suitable for maximum biomass (10.88 g L<sup>−1</sup> DWB) and the highest production of withanolide A (2.42 mg g<sup>−1</sup> DWB). Among the different media, such as Murashige and Skoog (MS), B5, NN, and N6, the highest accumulation of WS-1 (2.39 mg g<sup>−1</sup> DWB) was observed in full-strength MS medium suspension culture supplemented with 3% sucrose at an initial medium pH of 6.0 (**Table 2**; Nagella and Murthy, 2010).
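Because biomass is reported per litre of culture and withanolide A content per gram of dry biomass, the two can be multiplied to estimate a volumetric yield. The sketch below does this for the two conditions for which both figures are quoted above. The volumetric yields are derived here for illustration and are not reported in the cited study; the calculation assumes that biomass and content were measured on the same cultures, and the names used are hypothetical.

```python
# Values quoted in the text (Nagella and Murthy, 2010); all on a dry weight basis.
CONDITIONS = {
    "growth kinetics, end of week 4": {"biomass_g_per_l": 11.02, "content_mg_per_g": 2.03},
    "inoculum density 10 g/L": {"biomass_g_per_l": 10.88, "content_mg_per_g": 2.42},
}


def volumetric_yield(biomass_g_per_l: float, content_mg_per_g: float) -> float:
    """Withanolide A per litre of culture (mg/L) = biomass (g/L) x content (mg/g)."""
    return biomass_g_per_l * content_mg_per_g


for label, c in CONDITIONS.items():
    y = volumetric_yield(c["biomass_g_per_l"], c["content_mg_per_g"])
    print(f"{label}: {y:.1f} mg withanolide A per litre")
```

Expressing yield per litre makes it easier to see that a condition with slightly less biomass can still give more product when the content per gram is higher.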

Investigations have also been carried out on the biotransformation capacity of cell suspension cultures generated from W. somnifera leaf, using WS-1, WS-3, and WS-2 as precursor substrates. Interestingly, there was a noticeable inter-conversion of WS-1 to WS-2, and vice versa, involving substitution of the 20-OH group to 17-OH in WS-1 (Sabir et al., 2011). This demonstrates the potential of suspension cultures of W. somnifera for the production of withanolides with multifactorial modulations.

# *IN VITRO* SHOOT CULTURE

Root-specific production of WS-1 makes root cultures, particularly hairy roots, and their bioreactor-based upscaling the foremost route for its in vitro production. However, WS-1 has not been detected in Agrobacterium rhizogenes-transformed hairy roots of W. somnifera (Banerjee et al., 1994; Ray and Jha, 1999). The withanolides detected in these hairy root cultures have been reported to be those predominantly produced by the aerial parts. Therefore, WS-1 biogenesis was explored in W. somnifera shoot cultures, the tissue culture counterparts of the aerial parts of the native plant. Shoot cultures were initiated using explants from two experimental lines of W. somnifera (Ashwagandha), RS Selection-1 (RS-Sel-1) and RS Selection-2 (RS-Sel-2), on MS medium with different plant growth regulators. RS-Sel-1 shoot culture supplemented with benzylaminopurine (BAP) 1.00 ppm and KN 0.50 ppm showed the highest concentration of WS-1 (14.3 mg per 100 g fresh weight and 238 mg per 100 g dry weight, i.e., 0.24%) in the green shoots (**Table 2**). The WS-1 content of green shoot cultures (0.24% DWB) was higher than the isolation yields from dried roots of field-grown plants. The solid mass/shooty teratoma of RS-Sel-1 shoot culture also displayed its highest WS-1 concentration (3.7 mg per 100 g fresh weight; 46.2 mg per 100 g dry weight) with BAP 1.00 ppm and kinetin 0.50 ppm. Comparatively, RS-Sel-1 showed superior biogenesis/accumulation of WS-1 to RS-Sel-2. Shoot cultures fed with radioactivity likewise led to the isolation of almost pure radiolabeled WS-1 and pointed toward de novo biosynthesis of WS-1 in the in vitro shoot cultures (Sangwan et al., 2007).
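The unit conversions behind the figures above can be checked with a short calculation: 238 mg per 100 g dry weight corresponds to roughly 0.24% by weight, and the ratio of the fresh weight to the dry weight value gives the dry matter fraction implied for the tissue. The sketch below is a minimal illustration under the assumption that the fresh and dry weight figures describe the same material; the implied dry matter fractions are derived here and are not stated in the cited study, and the function names are hypothetical.

```python
def mg_per_100g_to_percent(mg_per_100g: float) -> float:
    """Convert mg per 100 g to percent by weight (1% = 1000 mg per 100 g)."""
    return mg_per_100g / 1000.0


def implied_dry_matter_fraction(fresh_basis: float, dry_basis: float) -> float:
    """Dry matter fraction implied by one content given on fresh and dry weight bases."""
    return fresh_basis / dry_basis


# WS-1 content in mg per 100 g, as quoted in the text (Sangwan et al., 2007).
GREEN_SHOOTS = {"fresh": 14.3, "dry": 238.0}
TERATOMA = {"fresh": 3.7, "dry": 46.2}

for name, v in (("green shoots", GREEN_SHOOTS), ("shooty teratoma", TERATOMA)):
    pct = mg_per_100g_to_percent(v["dry"])
    dm = implied_dry_matter_fraction(v["fresh"], v["dry"])
    print(f"{name}: {pct:.3f}% of dry weight, implied dry matter {dm:.1%}")
```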

The effect of hormones, culture conditions and elicitation on the production of withanolides in multiple shoot cultures of W. somnifera has also been reported. Elicitation with salicylic acid at 100 µM in combination with 0.6 mg L<sup>−1</sup> 6-benzyladenine and 20 mg L<sup>−1</sup> spermidine for 4 h at the fourth week in 20 ml liquid medium gave 1.14- to 1.18-fold higher withanolide production than elicitation with MJ at 100 µM after 5 weeks of culture (**Table 2**; Sivanandhan et al., 2013). This confirms in vitro shoot culture as a biosystem amenable to fine-tuning, and an alternative to field-grown plants, for harvesting therapeutically valuable WS-1. Shoot cultures also represent a suitable system for functional genomic studies of withanolides.

# ROOT CULTURE

Hairy (transformed) roots induced by A. rhizogenes hold immense potential for studies on secondary metabolite biosynthesis, as rapid growth and extensive branching with genetic stability are their characteristic features. They are also capable of synthesizing root-specific secondary metabolites (Giri and Narasu, 2000; Hu and Du, 2006). Consequently, hairy roots have been induced in numerous aromatic and medicinal plants for the production of significant secondary compounds (Le Flem-Bonhomme et al., 2004; Zhao et al., 2004; Santos et al., 2005). Murthy et al. (2008) reported transformation of W. somnifera with A. rhizogenes strain R1601 and obtained transformed hairy roots from cotyledon and leaf explants. Four clones of hairy roots differing in morphology were established. MS-based liquid medium supplemented with 40 g/L sucrose proved to be optimal for biomass accumulation. WS-1 content was found to be 2.7-fold higher in transformed roots (line 3) than in non-transformed roots (**Table 2**; Murthy et al., 2008).

Leaf-derived callus of W. somnifera has also been used to develop adventitious root cultures in half-strength MS medium supplemented with 0.5 mg L<sup>−1</sup> indole-3-butyric acid (IBA) and 0.1 mg L<sup>−1</sup> indole-3-acetic acid (IAA) with 2% sucrose. These adventitious root cultures were further elicited with MJ and SA separately to investigate the improvement in withanolide productivity. Thirty-day-old adventitious root cultures treated with 150 µM SA for 4 h gave a root biomass of 11.70 g FWB and produced 64.65 mg g<sup>−1</sup> DWB WS-1 (48-fold), 33.74 mg g<sup>−1</sup> DWB withanolide B (29-fold), 17.47 mg g<sup>−1</sup> DWB WS-3 (20-fold), 42.88 mg g<sup>−1</sup> DWB WS-2 (37-fold), 5.34 mg g<sup>−1</sup> DWB 12-deoxy withastramonolide (nine-fold), 7.23 mg g<sup>−1</sup> DWB withanoside V (seven-fold), and 9.45 mg g<sup>−1</sup> DWB withanoside IV (nine-fold) after 10 days of elicitation (40th day of culture) in comparison with untreated cultures (**Table 2**). Withanolide production was found to depend on biomass, the culture age of the adventitious roots, the elicitor concentration and the time period involved (Sivanandhan et al., 2012). These findings highlight the considerable potential of transformed roots and adventitious roots of W. somnifera, with further scale-up in bioreactors.
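The fold increases quoted above imply a content for the untreated cultures, which can be recovered by dividing each elicited value by its fold change. The sketch below does this as a consistency check. The baselines it prints are back-calculated and only approximate, since the fold values in the text are rounded; they are not figures reported by the cited study, and the names used are hypothetical.

```python
# (content after SA elicitation in mg/g DWB, reported fold increase), from the text.
ELICITED = {
    "WS-1": (64.65, 48),
    "withanolide B": (33.74, 29),
    "WS-3": (17.47, 20),
    "WS-2": (42.88, 37),
    "12-deoxy withastramonolide": (5.34, 9),
    "withanoside V": (7.23, 7),
    "withanoside IV": (9.45, 9),
}


def implied_baseline(elicited: float, fold: float) -> float:
    """Untreated content implied by an elicited content and its fold increase."""
    return elicited / fold


for compound, (content, fold) in ELICITED.items():
    baseline = implied_baseline(content, fold)
    print(f"{compound}: ~{baseline:.2f} mg/g DWB untreated (back-calculated)")
```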

#### SOMACLONAL VARIANTS

Somaclonal variation can be either genetic or epigenetic in origin (Larkin and Scowcroft, 1981; Lee and Phillips, 1988). The occurrence of somaclonal variation is often associated with activation of transposable elements (Skirvin et al., 1994), point mutation, chromosomal rearrangement, recombination, DNA methylation, and altered sequence copy number. Somaclonal variation is influenced by explant type, culture medium, genotype, and the age of the donor plant, among other factors (Skirvin et al., 1994). It has been suggested that the frequency of somaclonal variation from cell culture is much higher than from field-grown plants because of a higher rate of mutagenesis (Ahloowalia, 1986). Many plants regenerated via indirect organogenesis have shown somaclonal variation for a wide array of characteristics, and this variation has been used to develop new varieties in some species, such as tomato, sorghum, sugarcane, and chrysanthemum (Compton and Veilleux, 1991; Duncan et al., 1995; Jalaja et al., 2006; Miñano et al., 2009). Somaclonal variation could unleash the natural variability for withanolide production and accumulation, and could be exploited by breeders to develop W. somnifera varieties of commercial interest.

#### TABLE 2 | *In vitro* studies in *Withania somnifera* in relation to withanolides production.

Rana et al. (2012) investigated and validated the applicability of an in vitro strategy to induce somaclonal variation in W. somnifera, which manifested in the form of enhanced levels of 12-deoxywithastramonolide (WS-12D). Variations were examined in 54 regenerated plants obtained through indirect organogenesis from leaf explants. The WS-R-1 somaclone displayed considerably elevated levels of WS-12D (0.516% DWB) in comparison with the explant donor mother plant (0.002% DWB). Somaclonal variation was investigated at the cytological level by examining meiosis and mitosis with respect to chromosome number and structural organization. Chromosome phenotypes, somatic chromosome count, and meiotic behavior showed no alterations. Further, several genetic polymorphisms between the explant donor mother plant and the WS-12D over-producing somaclone were examined by random amplification of polymorphic DNA (RAPD) analysis. The WS-R-1 somaclone was evaluated for 2 years to confirm genetic and chemical stability. This study supports the feasibility of an in vitro strategy for inducing chemotypic variability to develop high-yielding clones, considering the molecular instability displayed by W. somnifera. It also widens the genetic resource base for manipulative hybridization for quantitative chemotypic novelty in W. somnifera (Rana et al., 2012).

#### FUTURE PROSPECTS

W. somnifera has enjoyed a long and important history in traditional systems of medicine, in which significant remedial properties are attributed to withanolides. Nevertheless, the understanding of withanolide biosynthesis is still in its infancy, which greatly hampers the exploitation of its full biotechnological potential. Although investigations at the molecular and in vitro levels have begun, we are still a long way from understanding how diverse withanolides are synthesized and regulated in W. somnifera. However, the gene elucidation data, omics resources and in vitro study inferences generated so far offer significant promise for a large increase in correct annotation, for the functional characterization of enzymes, and for comprehending the assorted interactions among the sophisticated biosynthetic and regulatory mechanisms crucial for the successful implementation of withanolide metabolic engineering strategies. Furthermore, the advancement of metabolic engineering technologies for transgenics, precursor feeding, gene overexpression and inhibition, and mutant selection in W. somnifera still awaits investigation. There is much to be learned about the chemical ecology of withanolides to answer an important question about their evolution into sophisticated and diverse structures and types. Although it has been anticipated that withanolides act as growth regulators, owing to the partial overlap of their biosynthetic route with that of brassinosteroids, this demands in-depth examination to build a framework for elaborate pathway modulation strategies.

## REFERENCES


#### AUTHOR CONTRIBUTIONS

SL, RV conceived and designed the review. ND and SL wrote the manuscript. ND, S Razdan, S Rana, WB, SL, and RV have contributed in the original studies published earlier vis-à-vis W. somnifera. ND, S Razdan, S Rana, and WB have also collated the up-to-date literature. All authors read and approved the final manuscript.

#### ACKNOWLEDGMENTS

We gratefully acknowledge the financial grant from Council of Scientific and Industrial Research, Government of India, New Delhi under Network Projects NWP-0008 and BSC-0108. This manuscript represents institutional communication number IIIM/1793/2015.


from the roots of Withania somnifera Dun. Phytother. Res. 15, 311–318. doi: 10.1002/ptr.858


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Dhar, Razdan, Rana, Bhat, Vishwakarma and Lattoo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Modulation of Chloride Channel Functions by the Plant Lignan Compounds Kobusin and Eudesmin

*Yu Jiang<sup>1†</sup>, Bo Yu<sup>1†</sup>, Fang Fang<sup>1</sup>, Huanhuan Cao<sup>1</sup>, Tonghui Ma<sup>2</sup> and Hong Yang<sup>1</sup>\**

*<sup>1</sup> School of Life Sciences, Liaoning Provincial Key Laboratory of Biotechnology and Drug Discovery, Liaoning Normal University, Dalian, China, <sup>2</sup> College of Basic Medical Sciences, Dalian Medical University, Dalian, China*

Plant lignans are diphenolic compounds widely present in vegetables, fruits, and grains. These compounds have been demonstrated to have protective effects against cancer, hypertension, and diabetes. In the present study, we showed that two lignan compounds, kobusin and eudesmin, isolated from *Magnoliae Flos*, could modulate intestinal chloride transport mediated by cystic fibrosis transmembrane conductance regulator (CFTR) and calcium-activated chloride channels (CaCCs). The compounds activated CFTR channel function in both FRT cells and HT-29 cells. The modulating effects of kobusin and eudesmin on the activity of CaCCgie (CaCC expressed in gastrointestinal epithelial cells) were also investigated, and the results showed that both compounds could stimulate CaCCgie-mediated short-circuit currents and that the stimulation was synergistic with ATP. In *ex vivo* studies, both compounds activated CFTR and CaCCgie chloride channel activities in mouse colonic epithelia. Remarkably, the compounds showed inhibitory effects toward ANO1/CaCC-mediated short-circuit currents in ANO1/CaCC-expressing FRT cells, with IC50 values of 100 µM for kobusin and 200 µM for eudesmin. In a charcoal transit study, both compounds mildly reduced gastrointestinal motility in mice. Taken together, these results reveal a new kind of activity displayed by the lignan compounds, namely the modulation of chloride channel function.

Keywords: CFTR, CaCCs, ANO1/CaCC, kobusin, eudesmin, short-circuit current

#### *Edited by:*

*Domenico De Martinis, ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Loretta Ferrera, U.O.C. Genetica Medica- I. G. Gaslini, Italy Anna Boccaccio, National Research Council, Italy*

*\*Correspondence:*

*Hong Yang hyanglnnu@126.com*

*†These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 09 September 2015 Accepted: 09 November 2015 Published: 25 November 2015*

#### *Citation:*

*Jiang Y, Yu B, Fang F, Cao H, Ma T and Yang H (2015) Modulation of Chloride Channel Functions by the Plant Lignan Compounds Kobusin and Eudesmin. Front. Plant Sci. 6:1041. doi: 10.3389/fpls.2015.01041*

# INTRODUCTION

Plant lignans are widely distributed in vegetables, fruits, and grains, especially in rye, flax, and sesame seeds. It has been reported that lignans have various biological activities, including anti-cancer, anti-diabetic, antimicrobial, antiparasitic, and antihypertensive activities (Dar and Arumugam, 2013; Chun et al., 2014; Zhang et al., 2014a). However, little information is available on the effects of lignan compounds on chloride channels.

Active Cl<sup>−</sup> secretion mediated by chloride channels in the apical membrane provides a driving force for transepithelial fluid secretion in the intestines. It is well established that cystic fibrosis (CF) transmembrane conductance regulator (CFTR) and calcium-activated chloride channels (CaCCs) are the main chloride channels present in the luminal membrane of enterocytes (Riordan et al., 1989; Hartzell et al., 2005). CFTR is a cAMP-dependent chloride channel predominantly expressed in the crypt cells of the intestines, and is permeable to Cl<sup>−</sup> and HCO<sub>3</sub><sup>−</sup> (Zhang et al., 2012). ANO1/CaCC (TMEM16A) is the first molecular identity of CaCCs; it is abundantly expressed in the intestinal pacemaker Cajal cells, where it generates smooth muscle contraction (Huang et al., 2009; Hwang et al., 2009; Ferrera et al., 2010). CaCCgie (the CaCC located in gastrointestinal epithelial cells), a CaCC distinct from ANO1, is predominantly localized in the gastrointestinal epithelial cells and is involved in fluid secretion, though its molecular identity remains unclear.

Hyperactivation of CFTR and CaCC proteins may account for diseases such as secretory diarrhea (Morris et al., 1999) and autosomal dominant polycystic kidney disease (Li et al., 2004), while dysfunction of these proteins may lead to CF (Kerem et al., 1989; Riordan et al., 1989), chronic pancreatitis (Cohn, 2005), and constipation (Morris et al., 1999). Chronic constipation (CC) is a common symptom characterized by infrequent stools and/or difficult stool passage (Lembo and Camilleri, 2003). The etiology of constipation is complex and may include diet, impaired colonic motility, and behavioral and psychological factors (Lu et al., 2015). Current treatments are mainly based on dietary management and the use of laxatives, which usually give disappointing results. There is therefore an urgent need for new strategies for CC therapy. During the last decade, emphasis has been placed on increasing intestinal fluid secretion and gastrointestinal motility as a new therapeutic option for the treatment of CC.

We previously established a high-throughput screening strategy for identifying natural compounds active against chloride channels (Zhang et al., 2014b; Chen et al., 2015). Using this strategy, we found a large number of compounds, including two lignan compounds, kobusin and eudesmin, that had CFTR and CaCC Cl− channel modulation activities. The aim of the present study was to systematically investigate the modulating effects of kobusin and eudesmin on CaCC and CFTR chloride channel activities. We demonstrated for the first time that plant lignan compounds could modulate intestinal chloride transport mediated by CFTR and CaCC chloride channels.

# MATERIALS AND METHODS

#### Cell Lines, Animals, and Compounds

Cell lines used in this study were FRT (Fischer rat thyroid epithelial) cells stably co-transfected with the YFP-H148Q fluorescence protein and human wild-type CFTR cDNA (Clarke et al., 2001; Harmon et al., 2010) or ANO1 cDNA (Hao et al., 2011), and HT-29 cells. FRT cells were cultured in Nutrient F12 Coon's medium (Sigma Chemical Co. St. Louis, MO, USA). HT-29 cells were cultured in 1640 medium (Sigma Chemical Co. St. Louis, MO, USA). Both media were supplemented with 10% fetal bovine serum (HyClone company, USA), 100 U/ml penicillin, 100 µg/ml streptomycin, and 2 mM L-glutamine. The cells were incubated in a 5% CO2 incubator maintained at 37°C and 95% humidity before they were used for the iodide influx fluorescence assay and short-circuit current measurements.

Male ICR mice (8–10 weeks) were fed a standard chow diet and kept under specific pathogen-free conditions at Dalian Medical University (Permit Number: SCXK liao 2008-0002). All animal experiments were conducted in accordance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health and were approved by the Liaoning Normal University Committee on Animal Research.

CFTRinh-172 was synthesized as described previously (Garcia et al., 2009). Forskolin (FSK), genistein (Gen), indomethacin, amiloride and tannic acid were all purchased from Sigma (Sigma Chemical Co, St. Louis, MO, USA). Amphotericin B was purchased from Solarbio (Beijing Solarbio Science & Technology Co, Ltd). CaCCinh-A01 and Eact were obtained from Chembest Research Laboratory Limited (Shanghai). ATP and NaI were purchased from Sangon Biotech (Shanghai) Co, Ltd. Kobusin and eudesmin were isolated and purified in our own laboratory and their chemical structures are shown in **Figure 1**.

#### Iodide Influx Fluorescent Assay

FRT cells transfected with CFTR were plated in black 96-well plates with clear bottoms (Costar, Corning, NY, USA) at a density of 2 × 10<sup>4</sup> cells/well and incubated until confluent. The cells were washed three times with PBS, followed by the addition of FSK (100 nM per well) and a further incubation of 5 min. After that, kobusin or eudesmin was added to each well at different concentrations and the cells were incubated for another 15 min. YFP fluorescence data were recorded using a FLUOstar Galaxy microplate reader (BMG Lab Technologies, Inc.) equipped with HQ500/20X (500 ± 10 nm) excitation and HQ535/30M (535 ± 15 nm) emission filters (Chroma Technology Corp.) and syringe pumps. Iodide influx rates (d[I<sup>−</sup>]/dt) were computed as described by Kristidis et al. (1992).

#### Short-circuit Current

Snapwell inserts containing ANO1-expressing FRT cells and HT-29 cells were mounted in Ussing chambers (Physiological Instruments, San Diego, CA, USA). For FRT cells, the hemichambers were filled with 5 ml of half-Cl− solution (apical) and HCO<sub>3</sub><sup>−</sup> buffered solution (basolateral). The half-Cl− solution contained 65 mM NaCl, 65 mM Na gluconate, 2.7 mM KCl, 1.5 mM KH2PO4, 0.5 mM MgCl2, 2 mM CaCl2, 10 mM Hepes, 10 mM glucose, and 25 mM NaHCO3 at pH 7.4. The HCO<sub>3</sub><sup>−</sup> buffered solution contained 120 mM NaCl, 5 mM KCl, 1 mM MgCl2, 1 mM CaCl2, 10 mM glucose, 5 mM Hepes, and 25 mM NaHCO3 at pH 7.4. Snapwell inserts were mounted in Ussing chamber systems, with the resistance kept above 1500 Ω. The basolateral membrane was permeabilized with amphotericin B (250 µg/ml). For HT-29 cells, the Krebs' buffered solution contained 130 mM NaCl, 2.7 mM KCl, 1.5 mM KH2PO4, 0.5 mM MgCl2, 2 mM CaCl2, 10 mM Hepes, and 10 mM glucose at pH 7.4. Symmetrical HCO<sub>3</sub><sup>−</sup> buffered solutions contained 119 mM NaCl, 0.6 mM KH2PO4, 2.4 mM K2HPO4, 1.2 mM MgCl2, 1.2 mM CaCl2, 21 mM NaHCO3, and 10 mM glucose at pH 7.4. The cells were bathed in the buffered solution for 15 min at 37°C in the presence of 95% O2/5% CO2.
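As a rough check on the "half-Cl−" label, the total chloride in the two FRT bathing solutions can be tallied from the salt concentrations listed above. The sketch below is illustrative arithmetic only: it assumes complete dissociation of every salt, and the function and variable names are hypothetical rather than part of the original protocol.

```python
# Chloride ions released per formula unit of each chloride salt.
CL_PER_SALT = {"NaCl": 1, "KCl": 1, "MgCl2": 2, "CaCl2": 2}


def total_chloride_mM(recipe_mM):
    """Sum the Cl- contributed by the chloride salts in a recipe (mM).

    Assumes complete dissociation; salts without chloride contribute zero.
    """
    return sum(conc * CL_PER_SALT.get(salt, 0) for salt, conc in recipe_mM.items())


# Concentrations (mM) as listed in the text for the FRT cell experiments.
half_cl_apical = {"NaCl": 65, "Na gluconate": 65, "KCl": 2.7, "KH2PO4": 1.5,
                  "MgCl2": 0.5, "CaCl2": 2, "NaHCO3": 25}
hco3_basolateral = {"NaCl": 120, "KCl": 5, "MgCl2": 1, "CaCl2": 1, "NaHCO3": 25}

apical_cl = total_chloride_mM(half_cl_apical)          # 65 + 2.7 + 1 + 4
basolateral_cl = total_chloride_mM(hco3_basolateral)   # 120 + 5 + 2 + 2
print(f"apical Cl-: {apical_cl:.1f} mM, basolateral Cl-: {basolateral_cl:.1f} mM")
print(f"apical/basolateral ratio: {apical_cl / basolateral_cl:.2f}")
```

Under these assumptions the apical solution carries about 73 mM Cl− against 129 mM on the basolateral side, a little over half, which is consistent with the name given to the apical solution.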

Male ICR mice were sacrificed by an overdose of intraperitoneal sodium pentobarbital. The colon was removed as quickly as possible and washed with ice-cold modified Krebs-bicarbonate solution containing 120 mM NaCl, 5 mM KCl, 1 mM MgCl2, 1 mM CaCl2, 10 mM D-glucose, 5 mM Hepes, and 25 mM NaHCO3 at pH 7.4. After stripping of the muscularis, the tissue was mounted in an Ussing chamber system (Physiological Instruments) connected to a VCC MC 6 multi-channel voltage-current clamp via silver/AgCl electrodes and 3 M KCl agar bridges. The hemi-chambers were separately filled with 5 ml of modified Krebs-bicarbonate solution bubbled with 95% O2/5% CO2 at 37°C. The buffer solution contained 10 µM indomethacin to prevent the influence of prostaglandins, and the mucosal side of the tissue was exposed to 10 µM amiloride to inhibit the epithelial Na+ current. Short-circuit current was recorded using Acquire and Analyze 2.3 software, with the transepithelial potential clamped at 0 mV throughout the experiment.

#### Intestinal Motility Measurement

ICR mice were starved for 24 h and then orally administered PBS, 400 µM kobusin, or 400 µM eudesmin. Fifteen minutes later, the animals were administered 200 µl of 10% activated charcoal suspended in 5% gum arabic. Thirty minutes after administration of the activated charcoal, the animals were sacrificed and the small intestines were removed. The peristaltic index was calculated as the ratio of the distance traveled by the activated charcoal to the total length of the small intestine.
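The peristaltic index defined above is a simple ratio, expressed here as a percentage to match the way it is reported in the Results. The following is a minimal sketch; the function name, the units, and the example measurements are hypothetical and are not data from this study.

```python
def peristaltic_index(charcoal_distance_cm, intestine_length_cm):
    """Return the peristaltic index as a percentage of small-intestine length."""
    if intestine_length_cm <= 0:
        raise ValueError("intestine length must be positive")
    if not 0 <= charcoal_distance_cm <= intestine_length_cm:
        raise ValueError("charcoal distance must lie within the intestine length")
    return 100.0 * charcoal_distance_cm / intestine_length_cm


# Hypothetical measurements for one animal (not data from this study).
print(f"{peristaltic_index(35.0, 50.0):.1f}%")
```

Because both lengths are measured on the same animal, the index is unitless and can be compared across animals of different sizes.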

#### Statistical Analysis

All data are expressed as mean ± SE or as representative traces. Student's *t*-test was used to compare test and control values, and differences were considered statistically significant at *P* < 0.05.
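The summary statistics described above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the authors' analysis script: it assumes an unpaired, equal-variance (pooled) Student's *t*-test, uses made-up readings, and compares the statistic with the tabulated two-tailed critical value for 4 degrees of freedom instead of computing an exact *P* value.

```python
import math
import statistics


def mean_se(values):
    """Return (mean, standard error of the mean) for a sample."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t(test, control):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(test), len(control)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(test)
                  + (n2 - 1) * statistics.variance(control)) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(test) - statistics.mean(control)) / se_diff, df


# Hypothetical short-circuit current readings (not data from this study).
control = [10.0, 12.0, 11.0]
treated = [15.0, 17.0, 16.0]

t_stat, df = student_t(treated, control)
T_CRIT_DF4 = 2.776  # two-tailed critical t for P = 0.05 with 4 degrees of freedom
print(mean_se(treated), mean_se(control))
print(f"t = {t_stat:.2f}, df = {df}, significant: {abs(t_stat) > T_CRIT_DF4}")
```

With three replicates per group, as in the figure legends, the test has 4 degrees of freedom, so the critical value above applies; other sample sizes need the corresponding table entry or a statistics package.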

#### Ethics Statement

This study was carried out in accordance with the recommendations of the "Guide for the Care and Use of Laboratory Animals of the National Institutes of Health" and was approved by the Liaoning Normal University Committee on Animal Research. All surgery was performed under sodium pentobarbital anesthesia, and all possible efforts were made to minimize suffering.

# RESULTS

#### Activation of CFTR Cl<sup>−</sup> Channel Activity by Kobusin and Eudesmin

The activating effects of kobusin and eudesmin on CFTR chloride channel activity were tested with a cell-based fluorescence assay in FRT cells transfected with human CFTR cDNA (Ma et al., 2002). The known CFTR activator Gen (Hwang et al., 1997) was used as a positive control. FSK (100 nM) was added to the cells to establish a basal level of cAMP (**Figures 2A,B**). Kobusin and eudesmin activated CFTR chloride channel activity in a dose-dependent manner, with EC50 values of 30 and 50 µM, respectively (**Figure 2C**). Further experiments showed that the activation by these compounds could be inhibited by graded concentrations of the known CFTR inhibitor CFTRinh-172 (**Figure 2D**). CFTR activation can be achieved by direct interaction with the CFTR protein or by activation of the upstream cAMP-dependent PKA signaling pathway (Hwang and Sheppard, 1999; Schultz et al., 1999; Sheppard and Welsh, 1999). To investigate the mechanisms involved in the activation, we measured the activities of kobusin and eudesmin under different FSK concentrations. Kobusin induced CFTR-mediated iodide influx in the absence of FSK, although less potently than in the presence of FSK (**Figure 2E**). In contrast, activation of CFTR by eudesmin depended more strongly on the cAMP level and the phosphorylation level of CFTR: eudesmin showed a stronger activation effect at high concentrations of FSK (**Figure 2F**). These results suggest that the efficacy of eudesmin depends more on the phosphorylation level of CFTR than that of kobusin does.
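To make the reported EC50 values concrete, a Hill-type dose-response curve can be evaluated at a few concentrations. The text does not state which model was fitted, so the sketch below is only an assumption-laden illustration: the Hill coefficient of 1, the normalised maximal effect, and all names are hypothetical, while the EC50 values are those reported above.

```python
def hill_activation(conc_uM, ec50_uM, hill_coeff=1.0, max_effect=1.0):
    """Fractional response of a Hill-type activation curve at a given concentration."""
    if conc_uM < 0 or ec50_uM <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    if conc_uM == 0:
        return 0.0
    return max_effect / (1.0 + (ec50_uM / conc_uM) ** hill_coeff)


# EC50 values reported in the text (µM); the Hill coefficient of 1 is assumed.
EC50_UM = {"kobusin": 30.0, "eudesmin": 50.0}

for compound, ec50 in EC50_UM.items():
    for conc in (10.0, ec50, 100.0):
        frac = hill_activation(conc, ec50)
        print(f"{compound}: {conc:5.1f} µM -> {frac:.2f} of maximal response")
```

By construction the response is half-maximal at the EC50, so under this model kobusin reaches a given fractional activation at a lower concentration than eudesmin.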

#### Activation of CFTR Chloride Channel Activities by Kobusin and Eudesmin in HT-29 Cells

CFTR and CaCCs are endogenously expressed in HT-29 cells (Morris and Frizzell, 1993); therefore, short-circuit current experiments were performed to investigate the activation of CFTR chloride channel activity by kobusin and eudesmin in HT-29 cells. All tests were done in the presence of 30 µM of the CaCC-specific inhibitor CaCCinh-A01 (De La Fuente et al., 2008) to eliminate the influence of endogenous CaCC current. **Figure 3A** shows that both kobusin and eudesmin alone could increase the CFTR-mediated short-circuit currents in a dose-dependent manner. Kobusin and eudesmin both elicited larger CFTR-mediated short-circuit currents in the presence of 100 nM FSK (**Figure 3B**). Summarized data are shown in **Figures 3A,B** (lower panels).

#### Potentiation of CaCC Chloride Channel Activity by Kobusin and Eudesmin in HT-29 Cells

As CaCCgie is endogenously expressed in HT-29 cells (Morris and Frizzell, 1993), we wanted to know what effect kobusin and eudesmin would exert on this kind of chloride channel. **Figure 4A** indicates that both kobusin and eudesmin could activate short-circuit current, and the activation effect could be abolished by the known non-specific CFTR and CaCC inhibitor tannic acid (100 µM). After pretreatment with CFTRinh-172 (20 µM), kobusin and eudesmin further activated the short-circuit current, and the activation effect was inhibited by the specific CaCC inhibitor CaCCinh-A01 (**Figure 4B**), suggesting that both compounds were able to activate CaCC-mediated Cl− current.

To evaluate whether the two lignan compounds activated CaCCgie in the same way as ATP, we measured the activation effects of the lignan compounds added before and after the addition of ATP. Kobusin and eudesmin induced a higher short-circuit current than DMSO (which served as a control) regardless of whether the compound was added before or after ATP (**Figure 5**). Kobusin (50 µM) achieved a 2.6-fold increase in short-circuit current when added before ATP (**Figure 5A**) and a 3.7-fold increase when added after ATP (**Figure 5B**). Similarly, eudesmin (50 µM) achieved a 2.4-fold increase in short-circuit current when added before ATP (**Figure 5C**) and a 2.15-fold increase when added after ATP (**Figure 5D**). The results thus indicated a synergistic effect between kobusin or eudesmin and ATP.

#### Inhibition of ANO1 Chloride Channel Activity by Kobusin and Eudesmin in ANO1-expressing FRT Cells

ANO1 is the first identified molecular component of CaCCs, and thus we investigated the effects of kobusin and eudesmin on ANO1 chloride channel activity. Eact (Namkung et al., 2011b) was used to produce an ANO1-mediated short-circuit current, followed by the addition of kobusin or eudesmin at the indicated concentrations. The remaining currents were abolished by T16Ainh-A01 (Namkung et al., 2011a). The results showed that apical application of kobusin and eudesmin inhibited Eact-induced ANO1-mediated short-circuit currents in transfected FRT cells in a dose-dependent manner, with IC50 values of 100 µM for kobusin (**Figure 6A**) and 200 µM for eudesmin (**Figure 6B**). Statistical analysis is shown in **Figures 6A,B** (lower panels). The Eact-induced ANO1-mediated short-circuit current was completely abolished by the specific ANO1 inhibitor T16Ainh-A01 (**Figure 6C**).

#### Activation of CFTR and CaCC Chloride Channel Activities by Kobusin and Eudesmin in Mouse Colonic Epithelia

As CFTR and CaCCs are the major pathways for apical Cl− exit in the intestine, the efficacies of kobusin and eudesmin were tested *ex vivo* in isolated mouse colonic mucosa by short-circuit current analysis. The experiments were performed in the presence of 10 µM indomethacin and 10 µM amiloride to eliminate the influence of prostaglandin generation and Na+ transport. Kobusin and eudesmin increased the short-circuit currents in a dose-dependent manner in mouse colonic epithelia (**Figures 7A,B**). As expected, the activation effect was completely abolished by 100 µM CFTRinh-172 plus 100 µM CaCCinh-A01, but was only partially inhibited by CFTRinh-172 or CaCCinh-A01 alone.

#### Inhibition of Intestinal Motility by Kobusin and Eudesmin

Since ANO1 is expressed in the pacemaker cells that generate smooth muscle contraction in the gastrointestinal tract (Huang et al., 2009; Hwang et al., 2009; Ferrera et al., 2010), further experiments were performed *in vivo* to evaluate the inhibitory effects of kobusin and eudesmin on gastrointestinal motility. Oral administration of either kobusin or eudesmin inhibited intestinal peristalsis and delayed charcoal movement in mice, with peristaltic indexes of 70.6 ± 5.9% in the case of kobusin and 68.2 ± 5.9% in the case of eudesmin, compared to that of PBS (82.3 ± 2.9%; **Figures 8A,B**).

(A) Representative current traces showing kobusin (left panel) and eudesmin (right panel) -stimulated Cl− current. The activation effect was abolished by tannic acid (100 µM). (B) Representative traces showing kobusin (left panel) and eudesmin (right panel) -induced CaCC-mediated Cl− current. The activation effect was abolished by CaCCinh-A01 (30 µM). Histograms showing summary of kobusin and eudesmin-induced short-circuit currents. 16A: T16Ainh-A01. A01: CaCCinh-A01. Data are the means ± SEs of three independent tests.
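The size of the motility effect can be read off the reported mean peristaltic indexes by expressing each as a reduction relative to the PBS group. This is a back-of-the-envelope sketch that uses only the reported means and ignores their standard errors; the function name is hypothetical.

```python
def relative_reduction_pct(treated, control):
    """Percentage decrease of a treated value relative to its control."""
    return 100.0 * (control - treated) / control


# Mean peristaltic indexes (%) reported in the text.
PBS = 82.3
for compound, index in {"kobusin": 70.6, "eudesmin": 68.2}.items():
    print(f"{compound}: {relative_reduction_pct(index, PBS):.1f}% lower than PBS")
```

On these means, charcoal transit was roughly 14% (kobusin) and 17% (eudesmin) shorter than in the PBS group, which matches the description of a mild reduction in motility.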

# DISCUSSION

Lignans, described as a group of diphenolic compounds in which the C6-C3 units are linked at the central C8 carbon, are widely distributed in more than 70 families of vascular plants. They have been isolated from different parts of plants, including roots, stems, rhizomes, leaves, seeds, and fruits, as well as exudates and resins (Gang et al., 1997; Pan et al., 2009). Although they are non-nutrient and non-caloric, lignans have attracted considerable attention because of their various biological activities. Numerous studies have shown that lignans and their intestinal metabolites, the enterolignans, possess antitumor, antiviral, and antioxidant activities (Chen et al., 1997; Ashakumary et al., 1999; Saarinen et al., 2002). Furthermore, they have also been shown to have osteoporosis prevention and liver-protection activities as well as antagonistic activity toward platelet-activating factor (PAF; Han et al., 1992; Habauzit and Horcajada, 2008). In the present study, we demonstrated that two lignan compounds, kobusin and eudesmin, could function as activators of CFTR and CaCCgie chloride channels and as inhibitors of the ANO1/CaCC channel. Notably, we found that kobusin and eudesmin could inhibit the activity of the ANO1/CaCC chloride channel in ANO1/CaCC-expressing FRT cells and reduce gastrointestinal motility in mice, thereby uncovering new molecular pharmacological targets of lignan compounds.

The identification of lignan compounds as CFTR and CaCCgie activators highlights their potential use as lead compounds for the treatment of constipation. Intestinal fluid secretion provides a proper environment for digestion and facilitates stool passage through the intestinal tract (Barrett and Keely, 2000), and this process is driven by chloride channel-mediated Cl− transport in the enterocyte (Barrett and Keely, 2000; Kiela and Ghishan, 2009). So far, three chloride channels (namely CFTR, CaCC, and ClC-2) have been identified as mediating Cl− secretion into the intestinal lumen, and among these, CFTR and CaCC play pivotal roles (Murek et al., 2010). In the intestine, CFTR, a cAMP-activated chloride channel, is mainly expressed in the crypt (Zhang et al., 2012). Mutation of the CFTR protein (e.g., ΔF508-CFTR) may result in the lethal hereditary disease CF (Lubamba et al., 2012). Habitual constipation remains a common symptom among CF patients and is regarded as a consequence of impaired intestinal fluid secretion (Grubb and Gabriel, 1997). Since the cloning of the CFTR gene in 1989 (Riordan et al., 1989), CFTR has been advocated as a potential therapeutic molecular target for the treatment of several diseases, including CF (Kerem et al., 1989; Riordan et al., 1989), chronic pancreatitis (Cohn, 2005), habitual constipation (Morris et al., 1999), secretory diarrhea, and autosomal dominant polycystic kidney disease (Li et al., 2004). Though the molecular identity of CaCCgie remains elusive, its existence in enterocytes has been confirmed (De La Fuente et al., 2008). CaCCgie is responsible for rotaviral enterotoxin-stimulated diarrhea (Ko et al., 2014). In the present study, we demonstrated that kobusin and eudesmin could potentiate the two major pathways of Cl− secretion, suggesting that mild activation of CFTR and CaCCgie may result in a significant activation of fluid secretion in the intestine.

However, kobusin and eudesmin also inhibited the activity of the ANO1/CaCC chloride channel (**Figure 6**). Kobusin and eudesmin inhibited ANO1/CaCC-mediated short-circuit current in transfected FRT cells and reduced gastrointestinal motility in mice. ANO1/CaCC is highly expressed in the pacemaker Cajal cells of the gastrointestinal tract (Huang et al., 2009; Hwang et al., 2009; Ferrera et al., 2010). Inhibition of ANO1/CaCC may slow intestinal movement and thus increase the time available for fluid absorption (Hwang et al., 2009). The counteracting effect of these compounds on net fluid secretion therefore needs to be fully considered.

Randomized controlled trials of lignan supplementation have indicated that lignans cause a mild but significant reduction in diastolic and/or systolic blood pressure in patients with hypertension (Ursoniu et al., 2015). Raimundo et al. (2009) reported that eudesmin can induce endothelium-dependent relaxation in rat aorta. The molecular mechanism of this effect remains elusive. Accumulating evidence suggests that ANO1/CaCC plays important roles in the pathogenesis of spontaneous hypertension. Moreover, inhibition of ANO1/CaCC channel activity can reduce blood pressure in rodents, so that spontaneous hypertension can be inhibited (Wang et al., 2015). The fact that kobusin and eudesmin could inhibit the activity of the ANO1/CaCC channel both *in vitro* and *in vivo* suggests that inhibition of ANO1/CaCC may in part account for the antihypertensive activities of the lignans.

Both epidemiological data and experimental evidence have indicated that lignans or lignan-rich foods possess anticarcinogenic activities against many types of cancer, including breast (Saarinen et al., 2007), prostate (McCann et al., 2005), and colon (Webb and McCullough, 2005) cancers. The mechanisms involved in the cancer prevention effects are related to the anti-estrogenic, anti-angiogenic, pro-apoptotic, and antioxidant activities of these compounds (Webb and McCullough, 2005). Recently, it has been confirmed that ANO1/CaCC is overexpressed in several tumors (Qu et al., 2014; Wanitchakool et al., 2014), and inhibition of ANO1/CaCC channel activity may suppress the proliferation and migration of cancer cells (Sui et al., 2014; Seo et al., 2015). The inhibitory effect of kobusin and eudesmin on ANO1/CaCC channel activity observed in this study may provide new insight into the molecular mechanism associated with the anticancer effect of lignans.

The botanical properties of lignans have not been fully elucidated. Although prevalent in plants, lignans are virtually absent from animals (Peterson et al., 2010). The biosynthesis pathways of lignans are thought to have evolved in plants during their adaptation to the land (Davin and Lewis, 2000). Accumulating evidence shows that lignans are produced by phenoxy-radical coupling and polymerization, in which the dirigent proteins play key roles in determining the regiospecificity and stereoselectivity of the compounds (Gang et al., 1999), although the precise molecular mechanism is still unclear. In general, the lignan content in food is low except for flax seed, rye bran, and sesame seeds (Hallmans et al., 2003; Peterson et al., 2010). The demand for lignans has been increasing rapidly. Because plant lignan production is inefficient and unstable, there is an urgent need for new technologies to produce lignans. Recent studies have shed light on the production of lignans using transgenic plants and cells (Satake et al., 2013).

A previous study reported that ATP can activate CaCC chloride channel activity through both PLC and intracellular Ca<sup>2+</sup> pathways (Rajagopal et al., 2011). We detected a synergistic effect between kobusin or eudesmin and ATP, which suggests that these compounds activate CaCCgie Cl<sup>−</sup> channel function in a way that may differ from that of ATP. How these two lignan compounds inhibit ANO1 chloride channel function also remains to be determined. Since CaCCs are Ca<sup>2+</sup>-activated chloride channels, the inhibition of Ca<sup>2+</sup> release may impair the function of ANO1/CaCC or CaCCgie channels. Zhang et al. (2013) reported that the lignan compound magnolol can inhibit colonic motility by down-regulating voltage-sensitive L-type Ca<sup>2+</sup> channel activity in rat colonic smooth muscle cells. Although both FRT and HT-29 cells express L-type calcium channels (Montiel et al., 2001; Perego et al., 2012), the opposite effects of kobusin and eudesmin on CaCCgie and ANO1 chloride channel activities observed in our study do not support the L-type Ca<sup>2+</sup> channel inhibition pathway. The detailed mechanisms will need further investigation.

# CONCLUSION

The present study discovered the modulation of chloride channel function as a new activity of the lignan compounds kobusin and eudesmin, thereby uncovering new insights into the mechanism relating to the antihypertensive and cancer prevention activities of lignans in general.

## REFERENCES


#### ACKNOWLEDGMENT

This work was supported by National Natural Science Fund (No. 31471099; 81173109; 81473265), Special Fund for Doctorate Disciplines Construction in Universities (20112136110002) and the Youth Foundation of Liaoning Normal University (No. LS2014L010).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Jiang, Yu, Fang, Cao, Ma and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Bioconversion to Raspberry Ketone is Achieved by Several Non-related Plant Cell Cultures

*Suvi T. Häkkinen, Tuulikki Seppänen-Laakso, Kirsi-Marja Oksman-Caldentey and Heiko Rischer\**

*VTT Technical Research Centre of Finland Ltd., Espoo, Finland*

Bioconversion, i.e., the use of biological systems to perform chemical changes in synthetic or natural compounds under mild conditions, is an attractive tool for the production of novel active or high-value compounds. Plant cells exhibit a vast biochemical potential, being able to transform a range of substances, including pharmaceutical ingredients and industrial by-products, via enzymatic processes. The use of plant cell cultures offers possibilities for contained and optimized production processes which can be applied at industrial scale. Raspberry ketone [4-(4-hydroxyphenyl)butan-2-one] is among the most interesting natural flavor compounds, due to its high demand and significant market value. The biosynthesis of this industrially relevant flavor compound is relatively well characterized, involving the condensation of 4-coumaroyl-CoA and malonyl-CoA by Type III polyketide synthase to form a diketide, and the subsequent reduction catalyzed by an NADPH-dependent reductase. Raspberry ketone has been successfully produced by bioconversion using different hosts and precursors to establish more efficient and economical processes. In this work, we studied the effect of overexpressed *Ri*ZS1 in tobacco on precursor bioconversion to raspberry ketone. In addition, various wild type plant cell cultures were studied for their capacity to carry out the bioconversion to raspberry ketone using either 4-hydroxybenzalacetone or betuligenol as a substrate. Apparently plant cells possess rather widely distributed reductase activity capable of performing the bioconversion to raspberry ketone using cheap and readily available precursors.

#### *Edited by:*

*Rosella Franconi, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Biswapriya Biswavas Misra, University of Florida, USA Doriana Triggiani, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

> *\*Correspondence: Heiko Rischer heiko.rischer@vtt.fi*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 16 September 2015 Accepted: 06 November 2015 Published: 24 November 2015*

#### *Citation:*

*Häkkinen ST, Seppänen-Laakso T, Oksman-Caldentey K-M and Rischer H (2015) Bioconversion to Raspberry Ketone is Achieved by Several Non-related Plant Cell Cultures. Front. Plant Sci. 6:1035. doi: 10.3389/fpls.2015.01035*

Keywords: betuligenol, bioconversion, 4-hydroxybenzalacetone, plant cell culture, raspberry ketone

# INTRODUCTION

The characteristic aroma component in raspberry (*Rubus idaeus*) fruits is 4-(4-hydroxyphenyl)butan-2-one, also called raspberry ketone or frambinone (Feron et al., 2007). The amount of raspberry ketone in raspberry fruits is only around 1–4 mg/kg of fruit. Raspberry ketone is one of the most expensive flavor compounds. Natural raspberry ketone flavor ranks second behind natural vanillin, with a total potential market value between 6 and 10 million euros (Feron and Wache, 2005), although currently the commercial demand cannot be satisfied. The EC Flavor Directive (88/388/EEC) defines natural flavors as 'flavoring substances or preparations which are obtained by appropriate physical processes or enzymatic or microbiological processes from material of vegetal or animal origin.' Natural flavors include products obtained through microbial or enzymatic processes as long as the precursor/raw material is natural and obtained via physical or bio-processes and the precursor and product can be found in nature or as components of traditional foods. Products that occur in nature but are produced via a chemical (non-biological) process are called 'nature-identical'; this mode of production is no longer accepted as consumer-friendly (Vandamme and Soetaert, 2007). In addition to traditional flavoring applications, raspberry ketone has attracted wide interest in the so-called cosmeceutical industry, for its skin-lightening and weight-loss properties (Morimoto et al., 2005; Park, 2010; Lin et al., 2011).

The biosynthesis of this industrially sought-after flavor compound is a comparatively well characterized diketide pathway, involving the condensation of 4-coumaroyl-CoA and malonyl-CoA. First, 4-coumaroyl-CoA and malonyl-CoA form *p*-hydroxybenzalacetone (4-OHBA) in a decarboxylative condensation catalyzed by a Type III polyketide synthase called benzalacetone synthase (BAS) (Abe et al., 2001). Several candidate polyketide synthases from *Rubus* have been identified (*Ri*PKS1-5), and among those *Ri*PKS4 exhibits a specific C-terminal sequence which is different from the usually conserved region in chalcone synthases (CHSs) (Zheng et al., 2001; Zheng and Hrazdina, 2008). *Ri*PKS4 has both BAS and CHS activity *in vitro*, and therefore selective blocking of the CHS activity of *Ri*PKS4 would be difficult to achieve. Instead it seems very likely that the reaction toward benzalacetone (and further to raspberry ketone) is determined by precursor availability and especially the interaction with a specific reductase. Koeduka et al. (2011) identified an NADPH-dependent reductase from raspberry, called raspberry ketone/zingerone synthase 1 (*Ri*ZS1), which was suggested to be responsible for the last step in raspberry ketone biosynthesis. However, this gene has not hitherto been functionally tested *in planta*.

Plant cell cultures have been studied as useful agents for various biotransformation reactions of organic compounds, including oxidation, reduction, hydroxylation, esterification, methylation, isomerization, hydrolysis and glycosylation (Giri et al., 2001; Ishihara et al., 2003). Raspberry ketone has been produced via bioconversion with different hosts including bacteria, yeast and plant cells, using e.g., *p*-coumaric acid, benzoic acid, benzaldehyde, 4-OHBA (1) or betuligenol (or rhododendrol) (2) as a precursor (**Figure 1**) (Fischer et al., 2001; Beekwilder et al., 2007).

The berry-derived precursor **1** is not abundant in nature, but it can be produced by bacterial cultivation via condensation of hydroxybenzaldehyde and acetone (Feron et al., 2007). Compound **2** is a secondary alcohol isolated originally from *Taxus wallichiana* (Chattopadhyay et al., 2001). It is also found in, e.g., birch bark, rhododendron, alder, maple and fir, mainly in its glycosylated form called betuloside. Betuloside can be converted into betuligenol by, e.g., microbial β-glucosidase (Dumont et al., 1996). Bioconversion of **2** into raspberry ketone (**3**) has been successfully achieved by various microbial cells (Dumont et al., 1996; Kosjek et al., 2003) and *Atropa belladonna* hairy roots (Srivastava et al., 2013). In a study by Kosjek et al. (2003), the oxidation reaction from **2** to **3** was performed in the actinomycete *Rhodococcus* by using acetone as a hydrogen acceptor. However, when *A. belladonna* hairy roots were fed with **2**, both **3** and betuloside were formed without the requirement of an additional co-substrate. On the other hand, Fujita et al. (1998) showed that callus cultures of *Acer nikoense* (Nikko maple) converted fed **3** to **2** and their glycosides. Interestingly, **3** was only found in the culture medium, whereas glycosides were present in the intracellular space. The authors suggested that a certain specific alcohol dehydrogenase (ADH) and a glycosyltransferase participate in these reactions. Bioconversion of **1** was accomplished with various microbial cells, yielding raspberry ketone with varying conversion efficiencies (Fuganti and Zucchi, 1998). These authors observed that with longer incubation times, the conversion of **1** to **3** continued on to the formation of **2**. It is known that the reaction from **1** to **3** is catalyzed by an NADPH-dependent enzyme, which was characterized from *Rubus idaeus* by Koeduka et al. (2011). On the other hand, the conversion step from **2** to **3** has not yet been characterized. Indeed, Beekwilder et al. (2007) showed that *Escherichia coli* possesses an endogenous reductase activity to convert **1** into **3**, after *p*-coumaric acid feeding to the cells expressing BAS. It is thus likely that the reductase activity required for this conversion is rather widely distributed in nature. In this study we present bioconversion studies related to raspberry ketone, performed with various plant cell cultures derived from plant species unrelated to raspberry.

#### MATERIALS AND METHODS

#### Plant Material and Precursors

Hairy roots of *Nicotiana tabacum* SR1 'Petite Havana' (tobacco, VTT Culture Collection no. VTTCC P-120068, Supplementary Figure S3) and *Catharanthus roseus* (Madagascar periwinkle, VTTCC P-120070) as well as *N. tabacum* cell suspension cultures SR1 (VTTCC P-120003) and BY-2 (VTTCC P-120001) were initiated and maintained as described by Häkkinen et al. (2012). Hairy roots of *Hyoscyamus muticus* (Egyptian henbane, VTTCC P-120039, Supplementary Figure S4) were initiated and maintained as described in Häkkinen et al. (2005). *Hordeum vulgare* 'Pokko' cell suspension culture (barley, VTTCC P-120080) was established and maintained as described in Ritala et al. (1993). *Rubus idaeus* (raspberry, VTTCC P-120090), *Rubus chamaemorus* (cloudberry, VTTCC P-120083, Supplementary Figure S2) and *Rubus arcticus* (arctic bramble, VTTCC P120089) were maintained as described in Nohynek et al. (2014) with slight modifications (Supplementary Figure S5). *Vaccinium myrtillus* (bilberry, VTTCC P-120045) cell suspension was maintained in modified McCown Woody Plant medium (McCown and Lloyd, 1981). *Plumbago auriculata* (leadwort, VTTCC P-120006) cell suspension culture was maintained in modified Murashige and Skoog's medium (Murashige and Skoog, 1962). *N. benthamiana* was cultivated according to Joensuu et al. (2010). Precursors *p*-hydroxybenzalacetone (4-OHBA) and betuligenol (rhododendrol) were purchased from Sigma–Aldrich (S350656, USA) and TCI America (R0121, USA), respectively. Raspberry ketone was obtained from Sigma–Aldrich (W258806, USA).

# *Ri*ZS1 Plant Vector Construct

*Rubus idaeus* ketone/zingerone synthase 1 (*Ri*ZS1) with NCBI Accession no. JN166691.1 was ordered from GenScript USA Inc. (USA) and cloned into the Gateway® plant-compatible vector pK2GW7 (Life Technologies™) according to the manufacturer's instructions. The resulting vector carrying 35*S*-*Ri*ZS1 was transformed into *Agrobacterium rhizogenes* LBA9402 and *A. tumefaciens* LBA4404 by electroporation. The presence of the transgene was confirmed by PCR using Gateway® primers 5′-GGGGACAAGTTTGTACAAAAAAGCAGGC-3′ and 3′-GGGGACCACTTTGTACAAGAAAGCTGGG-5′ for amplification of the vector region between ATTB1 and ATTB2 sites.
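Basic properties of primer sequences such as the two quoted above (length, GC content, reverse complement) can be checked with a few lines of code. The following is a minimal Python sketch; the helper names and the labels `primer_1` and `primer_2` are illustrative assumptions and not part of the published protocol, and the sequences are simply copied as printed, without any claim about strand orientation.

```python
# Illustrative helpers (hypothetical names); not part of the published protocol.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences copied as printed in the text; the labels are arbitrary.
primers = {
    "primer_1": "GGGGACAAGTTTGTACAAAAAAGCAGGC",
    "primer_2": "GGGGACCACTTTGTACAAGAAAGCTGGG",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}, "
          f"reverse complement = {reverse_complement(seq)}")
```

Such a check only guards against transcription errors; it says nothing about primer specificity or annealing behavior.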

#### Transient Expression in *N. benthamiana*

The *Agrobacterium* suspensions were infiltrated into the leaves of 6-week-old *N. benthamiana* plants as described previously (Joensuu et al., 2010). Briefly, *N. benthamiana* plants were cultivated in a greenhouse for 6 weeks before infiltration. One day prior to infiltration a bacterial suspension was inoculated and incubated at +28 °C overnight. The cultures were diluted to OD600 0.35 with 10 mM MgSO4, 10 mM MES buffer. Buffer-infiltrated leaves were used as control. LBA4404 carrying the p19 silencing suppressor was tested for enhanced *Ri*ZS1 expression. Bacterial suspensions carrying p19 and *Ri*ZS1 were applied together, mixed either in equal amounts (1:1 v:v) or with p19 making up one fifth of the mixture (1:4 v:v). Each infiltration culture was applied to five different leaves at 1 ml total volume per leaf. After infiltration, the leaves were blotted dry with paper tissue and the plants were transferred back to the greenhouse. Sampling was performed after incubation for 6 days. Altogether 20 leaf disks (15 mm ∅) per sample were collected and placed on a petri dish. Substrate (1 mM 4-OHBA) was diluted in 20 mM phosphate-citrate buffer (pH 7.4), and petri dishes were subjected to vacuum conditions twice. After 2 days of incubation, the samples were frozen in liquid nitrogen and stored at −80 °C until analyses. Each sample was taken in triplicate.
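The dilution to OD600 0.35 and the two p19:*Ri*ZS1 mixing ratios can be worked out as in the following minimal Python sketch. It assumes that optical density scales linearly with dilution; the overnight culture density (OD600 1.4), the 10 ml working volume, and the function names are hypothetical values chosen for illustration, while the 5 ml mixture volume follows from five leaves at 1 ml per leaf.

```python
def dilution_volumes(od_stock, od_target, final_volume_ml):
    """Return (culture_ml, buffer_ml) needed to reach od_target.

    Assumes OD scales linearly with dilution (C1 * V1 = C2 * V2).
    """
    if od_stock < od_target:
        raise ValueError("stock culture is already below the target OD")
    culture_ml = final_volume_ml * od_target / od_stock
    return culture_ml, final_volume_ml - culture_ml


def mix_by_ratio(total_ml, parts_a, parts_b):
    """Split total_ml into two volumes in the ratio parts_a:parts_b (v:v)."""
    unit = total_ml / (parts_a + parts_b)
    return parts_a * unit, parts_b * unit


# Hypothetical overnight culture at OD600 1.4, diluted to OD600 0.35 in 10 ml.
culture_ml, buffer_ml = dilution_volumes(od_stock=1.4, od_target=0.35,
                                         final_volume_ml=10.0)
print(f"culture: {culture_ml:.2f} ml, buffer: {buffer_ml:.2f} ml")

# 5 ml of infiltration mixture (five leaves at 1 ml each).
for label, (a, b) in {"1:1": (1, 1), "1:4": (1, 4)}.items():
    p19_ml, rizs1_ml = mix_by_ratio(5.0, a, b)
    print(f"{label} v:v -> p19 {p19_ml:.2f} ml, RiZS1 {rizs1_ml:.2f} ml")
```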

#### Biotransformation

Cell suspension cultures were pre-cultivated for 5 days and hairy roots for 9 days prior to feeding, as described in Häkkinen et al. (2012). Precursor was diluted in sterile water and was fed at a final concentration of 100, 300, or 500 μM. Samples were collected after 1, 2, or 5 days and medium was separated from cells by filtering. Cell and medium samples were frozen and lyophilized before extraction.
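The amount of precursor corresponding to each feeding concentration can be estimated as in the sketch below. The molar masses are calculated from the molecular formulas of the two precursors; the culture volume of 20 ml and the function name are assumptions made for illustration only and are not stated in the text.

```python
# Molar masses (g/mol) calculated from the molecular formulas.
MOLAR_MASS = {
    "4-OHBA": 162.19,       # C10H10O2
    "betuligenol": 166.22,  # C10H14O2
}


def precursor_mass_mg(compound, final_um, culture_ml):
    """Mass (mg) of precursor giving final_um (µmol/l) in culture_ml of culture."""
    micromoles = final_um * culture_ml / 1000.0         # µmol/l * l
    return micromoles * MOLAR_MASS[compound] / 1000.0   # µg -> mg


CULTURE_ML = 20.0  # hypothetical culture volume

for compound in MOLAR_MASS:
    for conc in (100, 300, 500):
        mass = precursor_mass_mg(compound, conc, CULTURE_ML)
        print(f"{compound}, {conc} µM in {CULTURE_ML:.0f} ml: {mass:.3f} mg")
```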

#### Sample Extraction and GC-MS Analyses

Plant material was ground with a Retsch mill (MM301, GWB, Germany) into a fine powder. Samples were weighed (50 mg lyophilized material) and 2 ml ultra-pure water was added. Alternatively, 3 ml medium samples were taken for the extraction and spiked with internal standard (20 μg heptadecanoic acid). Raspberry ketone was extracted twice with 5 ml ethyl acetate in an ultrasonic bath (10 min, +25 °C). The supernatants were separated and combined after centrifugation (3000 rpm, 10 min) and evaporated to dryness under nitrogen flow. The residues were dissolved in dichloromethane (50 μl; DCM) and trimethylsilylated with MSTFA (25 μl; *N*-Methyl-*N*-(trimethylsilyl) trifluoroacetamide; Pierce, Rockford, IL, USA) at 80 °C for 20 min. Deglycosylation was performed on plant material in citrate-phosphate buffer (pH 5.4). After sonication (15 min), 500 μl Viscozyme L (V2010, Sigma–Aldrich, USA) was added and a heptane layer together with internal standard was added on top of the solution. Samples were incubated at +37 °C overnight and the aqueous phase was extracted with ethyl acetate as described above. Medium samples were extracted accordingly.

The samples were analyzed with an Agilent 7890A GC combined with a 5975C mass selective detector. The GC was equipped with an Agilent DB-5MS fused silica capillary column (30 m, 0.25 mm ID, phase thickness of 0.25 μm) and the temperature program was from 70 °C (1 min) to 280 °C (18 min) at 10 °C min<sup>−1</sup>. Aliquots of 1 μl were injected and the split ratio was 25:1. The data were collected over a mass range of 40–600 m/z. Identification of the compounds was based on retention times and library comparison (NIST '08, Scientific Instrument Services, Inc., Ringoes, NJ, USA). Calibration curves of reference substances were used for quantification.
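The oven program and the calibration-based quantification can be illustrated with a short Python sketch. It assumes that the times in parentheses are hold times at the initial and final temperatures, and that quantification uses a linear calibration of the analyte-to-internal-standard peak area ratio against concentration. The calibration points, peak areas and function names are hypothetical and serve only to show the arithmetic.

```python
def oven_run_time(start_c, end_c, rate_c_per_min, initial_hold_min, final_hold_min):
    """Total GC run time (min) for a single-ramp oven program."""
    ramp_min = (end_c - start_c) / rate_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(area_analyte, area_internal_std, slope, intercept):
    """Concentration from the area ratio via the calibration line."""
    ratio = area_analyte / area_internal_std
    return (ratio - intercept) / slope


# Oven program as read from the text: 70 °C (1 min) to 280 °C (18 min) at 10 °C/min.
print(f"run time: {oven_run_time(70, 280, 10, 1, 18):.0f} min")

# Hypothetical calibration: concentration (mg/l) vs. analyte/internal-standard ratio.
cal_conc = [1.0, 2.0, 5.0, 10.0]
cal_ratio = [0.1, 0.2, 0.5, 1.0]
slope, intercept = fit_line(cal_conc, cal_ratio)

# Hypothetical peak areas for one sample.
print(f"sample: {quantify(30000, 100000, slope, intercept):.2f} mg/l")
```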

# RESULTS

# Transient Expression of *Ri*ZS1 in *N. benthamiana*

Transient expression in *N. benthamiana* was performed using *A. tumefaciens* LBA4404 carrying either the p19 silencing suppressor (Silhavy et al., 2002) or 35*S*-*Ri*ZS1, alone or in combination. Two different dilutions of silencing suppressor p19 were tested against 35*S*:*Ri*ZS1 since the concentration required for optimal suppression activity was not known. Bioconversion of **1** was assayed after incubating leaf samples for 2 days. Product **3** accumulation was observed in all samples which received substrate **1** (**Figures 2A,B**). Only those samples without added substrate were completely devoid of **3**. However, to our surprise, control samples, i.e., samples in which only p19 was infiltrated, also contained raspberry ketone in similar amounts to those observed in *Ri*ZS1 infiltrated samples. Furthermore, p19 did not appear to have any effect on the raspberry ketone accumulation levels.

# Raspberry Ketone Production in Tobacco Hairy Roots

The known 4-OHBA reductase *Ri*ZS1 was cloned in the Gateway® plant-compatible overexpression vector and introduced to *N. tabacum* by *A. rhizogenes*. Altogether ten *N. tabacum* hairy root clones were generated carrying *Ri*ZS1 and they were subsequently screened for raspberry ketone production capacity. Hairy roots were first pre-cultivated for 9 days as described in Häkkinen et al. (2012) in order to increase the biomass as well as to reach the exponential growth phase for active intracellular metabolism. Raspberry ketone **3** was produced in three clones after feeding **1** and the best-producing clone was selected for further experiments. When hairy roots were fed with the substrate **1**, the majority of the subsequently produced **3** was secreted to the extracellular space. The accumulation of **3** was monitored for 6 days and it showed that the accumulation peak of **3** occurred already at day 1. Increased concentration of substrate resulted in an average (*N* = 3) of up to 3.0 mg/l raspberry ketone production from 150 μM **1** (**Figure 3**). A yield of up to 5.5 mg/l was obtained after feeding 200 μM, but the variation between the biological replicates was too high for reliable interpretation of the result.
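To relate these titres to the amount of substrate fed, the product concentration can be expressed on a molar basis. The sketch below assumes 1:1 stoichiometry between **1** and **3** and uses the molar mass of raspberry ketone calculated from its formula (C10H12O2); the function name is illustrative. Under these assumptions, 3.0 mg/l from 150 μM corresponds to roughly 12% molar conversion, which is close to the conversion figure quoted for hairy roots in the Discussion.

```python
RK_MOLAR_MASS = 164.20  # g/mol, raspberry ketone (C10H12O2)


def molar_conversion_percent(product_mg_per_l, substrate_um):
    """Percent molar conversion, assuming 1 mol product per mol substrate."""
    product_um = product_mg_per_l / RK_MOLAR_MASS * 1000.0  # mg/l -> µmol/l
    return 100.0 * product_um / substrate_um


# Values taken from the text: titre (mg/l) and fed substrate concentration (µM).
for titre, fed in ((3.0, 150), (5.5, 200)):
    pct = molar_conversion_percent(titre, fed)
    print(f"{titre} mg/l from {fed} µM: {pct:.1f}% molar conversion")
```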

In order to confirm the hypothesis drawn from the earlier *N. benthamiana* experiment, wild type (WT) hairy roots were also tested for their bioconversion capacity of **1**. We did indeed find that WT roots converted **1** to **3** at similar conversion rates as roots carrying *Ri*ZS1 (**Figure 3**). However, only three out of ten roots carrying *Ri*ZS1 produced **3** after feeding. The majority of produced **3** was secreted to the culture medium; only up to 0.2% of the whole raspberry ketone pool was found in the intracellular fraction. Fed **1** did not show spontaneous conversion into **3** during the experimental time period as tested by cell-free incubation of substrate in the medium. Furthermore, the majority of produced raspberry ketone in tobacco hairy roots was present as aglycone both in the intra- and the extracellular samples.

Hairy roots were also pre-cultivated for 21 days before feeding with **1**, in order to confirm the optimal growth phase for starting the feeding. However, raspberry ketone was not produced at all in these hairy roots, which had reached the stationary growth phase, thus confirming the necessity for rapidly dividing cells and high overall enzymatic activity during feeding.

In order to study whether elicitation could result in increased bioconversion yields in tobacco hairy roots, methyl jasmonate (MeJA) was applied simultaneously with the substrate. However, it was observed that elicitation with MeJA did not increase the amount of raspberry ketone produced (Supplementary Table S1).

#### Bioconversion of Betuligenol and 4-OHBA by Various Plant Cell Suspensions and Hairy Root Cultures

To test the hypothesis of widely distributed plant reductase activity which is able to accept and convert raspberry ketone precursors, we screened several plant cell cultures for their bioconversion capacity. Altogether seven undifferentiated and three differentiated cell cultures from five different plant families were tested (**Table 1**). It was clearly observed that the majority of cultures were able to convert either one or both of the tested substrates into raspberry ketone. Since most of the produced **3** was found in the culture medium, the best converters were cloudberry suspension and *C. roseus* and *N. tabacum* hairy roots (**Table 1**). In this list, *C. roseus* cell suspension was the only culture which did not show conversion of either substrate.

#### TABLE 1 | Amount of raspberry ketone produced by selected cell suspensions and hairy root cultures 1 day after feeding with 100 μM 4-OHBA or betuligenol, respectively.

*ND, not detected; tr, trace, <0.05 ppm.* ∗*Mean calculated from six individual clones (2.3–12.0 μg/g DW and 0.1–1.8 mg/l).* ∗∗*Freshly prepared suspensions from calli.*

#### TABLE 2 | Bioconversion of 500 μM 4-OHBA and betuligenol by plant cell suspension cultures 1 day after feeding, unless otherwise indicated.

∗*Sample taken after 5 days; ND, not detected; tr, trace, <0.05 ppm.*

After the initial screening, more detailed bioconversion studies were conducted with selected suspension cultures. Altogether six cell suspension cultures, four included in the first screening (barley, cloudberry, arctic bramble, raspberry) plus bilberry and tobacco BY-2, were subjected to further studies, including the testing of different substrate concentrations and sampling points.

Among all six cell suspensions only bilberry did not convert either **1** or **2**. Increasing levels of substrate resulted in close to dose-dependent accumulation of **3** in all other cultures tested, except for barley and arctic bramble (**Table 2**).

The three tested *Rubus* species showed rather different patterns of accumulation of **3**. Cloudberry converted both **1** and **2** at almost equal rates; raspberry showed a preference for **1** and arctic bramble was able to convert both substrates; however, the product accumulated only in the intracellular space (**Table 2**). Based on the earlier reports of a hydrogen acceptor requirement during betuligenol bioconversion, acetone was added to the incubation mixture of hairy roots used in this study. However, this did not result in increased production of **3** (Supplementary Table S2). Among the cultures tested in this study, *C. roseus* and *V. myrtillus* cell suspension cultures did not produce **3** (**Tables 1** and **2**).

By far the best conversion of both **1** and **2** to **3** was achieved with *N. tabacum* BY-2, raspberry and cloudberry cells (**Table 2**), with tobacco BY-2 showing up to 12% total conversion rate. All the cultures tested, except arctic bramble, accumulated more than 75% of the product in the culture medium (**Table 2**).

The temporal accumulation pattern observed for *N. tabacum* BY-2 (**Figure 4**) shows that the highest amount of **3** accumulated both in the intra- and extracellular space 1 day after substrate feeding. After 1 day the level of **3** declined, possibly due to degradation or further metabolism. In the case of barley and arctic bramble the highest accumulation of **3** occurred only after 5 days. Tyrosol, a suggested metabolite resulting from degradation of **3**, was not detected in cell culture samples tested in this study, based on the mass fragmentation patterns reported by Angerosa et al. (1995).

#### DISCUSSION

#### Tobacco as a Production Platform for Raspberry Ketone

As expected, *N. benthamiana* did not accumulate **3** without added substrate, since tobacco is not known to possess the whole pathway leading to raspberry ketone. However, to our surprise, **3** also accumulated in p19 infiltrated control samples in amounts similar to those in *Ri*ZS1 infiltrated samples. Similarly, in *N. tabacum* hairy root cultures, extracellular concentration of converted **3** was similar in WT hairy roots (2.8 mg/l) and hairy roots carrying *Ri*ZS1 (3.0 mg/l) (**Figure 3**; **Table 1**). *Ri*ZS1 belongs to the medium chain reductase/dehydrogenase (MDR)/zinc-dependent ADH–like family of proteins, a diverse group of proteins related to class I mammalian ADH. MDR proteins constitute a large enzyme superfamily with close to 1000 members (Nordling et al., 2002), and display a wide variety of activities including ADH, quinone reductase, cinnamyl reductase and numerous others. It is interesting to note that the protein sequence of *Ri*ZS1 exhibits 76% similarity and 74% nucleotide sequence similarity with *N. tabacum* allyl ADH (DDBJ/EMBL/GenBank accession nr. AB036735) (Supplementary Figure S1).

This tobacco gene was later renamed as *Nt*DBR (*N. tabacum* Double Bond Reductase) by Mansell et al. (2013), making it a prime candidate responsible for the observed activities in WT *N. benthamiana* and *N. tabacum* hairy roots. *Nt*DBR has 70% homology to the NADPH-dependent oxidoreductases belonging to a plant ζ-crystallin family [leukotriene B4 dehydrogenase family (LTD), a subfamily of the MDR superfamily proteins] and it catalyzes a reversible dehydrogenation of allylic alcohols or ketones (Hirata et al., 2000). Both *Ri*ZS1 and *Nt*DBR belong to the LTD family and both have an NADPH co-factor binding site AXXGXXG.
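A motif written as AXXGXXG, where X stands for any residue, can be located in a protein sequence with a simple pattern search. The following minimal Python sketch uses a regular expression with a lookahead so that overlapping occurrences are also reported; the example sequence and the function name are hypothetical and do not represent the *Ri*ZS1 or *Nt*DBR sequences.

```python
import re

# AXXGXXG with X = any residue; the lookahead allows overlapping matches.
MOTIF = re.compile(r"(?=(A..G..G))")


def find_motif(protein_seq):
    """Return a list of (1-based position, matched residues) for AXXGXXG."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein_seq.upper())]


# Hypothetical toy sequence for illustration only.
toy_sequence = "MSTAEEGKKGLV"
for position, residues in find_motif(toy_sequence):
    print(f"motif at residue {position}: {residues}")
```

A pattern hit of this kind only flags a candidate site; assigning it a cofactor-binding role requires alignment against characterized family members.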

The accumulation peak of **3** in tobacco hairy roots and cell suspensions, as well as in most other cultures tested, was observed already at day 1 (**Figure 4**). This is in accordance with the results obtained with *A. belladonna* by Srivastava et al. (2013), i.e., bioconversion taking place already at day 1 although the maximum conversion was obtained at day 5. Since only three out of ten roots carrying *Ri*ZS1 actually produced **3** after feeding, it is suspected that co-suppression (Vaucheret and Fagard, 2001) occurred in the remaining transgenic clones, resulting most probably from a high sequence similarity between *Ri*ZS1 and corresponding endogenous reductases (Supplementary Figure S1). Production levels up to 5.5 mg/l of **3** were recorded for hairy roots after feeding 200 μM of **1**. These levels correspond to those reported by Beekwilder et al. (2007) with 5 mg/l production rates obtained by *E. coli* expressing chalcone synthase from *Rubus idaeus*. In fresh raspberry fruits, levels of 0.01–0.17 μg/g have been reported (Hrazdina and Zheng, 2006). Accumulation levels up to 20 μg/g have been obtained in elicited raspberry cell cultures (Hrazdina and Zheng, 2006).

Glycosylation is a very important detoxification mechanism in xenobiotic metabolism (Schmidt et al., 2006; Häkkinen et al., 2012). In addition, glycosylation facilitates the conversion of water-insoluble substances into water-soluble compounds and may also be needed in aiding the transport of the particular compound to e.g., vacuole or apoplastic space. Particularly, glycosylation by cultured plant cells has been the subject of increasing attention, since one-step enzymatic glycosylations by plant cells are more convenient than chemical glycosylations, which require tedious steps of protection and deprotection of the sugar hydroxyl groups. For this reason possible glycosylated conjugates of raspberry ketone were screened following enzyme-assisted deglycosylation. However, the main part of **3** was present as aglycone in tobacco hairy roots. Glycosylation of fed raspberry ketone was earlier demonstrated by Shimoda et al. (2007) with cultured cells of *Phytolacca americana*, which converted fed raspberry ketone into β-glycosides after hydroxylation during a 3 day incubation period.

Altogether, the conversion efficiency of hairy roots was much higher than that of infiltrated leaves, 12% versus 2%, respectively. The difference is even more striking if the incubation time is taken into account, since hairy roots were sampled after 1 day and infiltrated leaves after 2 days. It is noteworthy that the bioconversion yield obtained in this study with *N. tabacum* hairy roots is higher than the bioconversion of betuligenol reported by Srivastava et al. (2013), who obtained an overall 0.5% bioconversion rate at day 1 and a maximum of 7% bioconversion after 5 days with *A. belladonna* hairy roots (calculated from the presented data). The microbial bioconversion capacity of *E. coli* converting **1** to **3** was reported as 40% after 1 day by Beekwilder et al. (2007), whereas bioconversion of **2** to **3** using *Rhodococcus* cells in buffer with 10% (v/v) acetone was as high as 89% after 2 days (Kosjek et al., 2003). In conclusion, although unable to compete with microbial bioconversion systems in terms of yield, *N. tabacum* hairy roots offer an efficient plant-based bioconversion platform for production of natural raspberry ketone.

## Betuligenol and 4-OHBA are Converted to Raspberry Ketone by Various Plant Species

**Hairy Roots**

Bioconversion capacity differed even between closely related plant species, e.g., the two Solanaceae. Whereas *N. tabacum* hairy roots were able to convert betuligenol, the related *H. muticus* did not show bioconversion. Earlier, Srivastava et al. (2013) reported that hairy roots of another Solanaceae, *A. belladonna*, were able to convert **2** to raspberry ketone. Since Kosjek et al. (2003) had reported from work with bacterial cultures that **2** would require acetone as a hydrogen acceptor, hairy roots were fed with betuligenol together with acetone at a concentration of 1% (v/v). However, the efficiency of bioconversion was not improved by acetone addition. This is in accordance with reported betuligenol bioconversion in *A. belladonna*, which performed the bioconversion as such (Srivastava et al., 2013). This difference between prokaryotic and plant platforms may be explained by the differences in oxidative enzymes. Plants possess a huge variety of cytochrome P450 enzymes together with an abundant hydrogen acceptor pool, carrying out diverse oxidative reactions, whereas the number of P450 enzymes in prokaryotes is generally much lower, and *E. coli* does not possess any at all (Werck-Reichhart and Feyereisen, 2000).

**Cell Suspensions**

Tobacco BY-2 performed best in this study, with a 12% total conversion rate, which is higher than the earlier reported plant-based conversion by *A. belladonna* hairy roots (7%, Srivastava et al., 2013). *Rubus* spp., namely raspberry and cloudberry, constitute further potent plant platforms (**Table 2**). The cell density of highly multiplying BY-2 culture is high during the exponential growth phase. However, the total amount of biomass at the time of sampling in tobacco BY-2 was approximately twofold compared to other cultures tested. Thus, the number of cells performing the bioconversion cannot be the only reason for the high bioconversion rates in tobacco BY-2. It should also be noted that different cell cultures might possess different optimal growth stages for specific bioconversions, and thus accurate comparisons of the bioconversion potential of different cultures cannot be made without more detailed studies.

Accumulation in culture medium, as seen in all the cultures tested except for arctic bramble, is a highly appreciated phenomenon compared to the typical intracellular location of secondary metabolites. Although product degradation may occur as a function of time (**Figure 4**), by timing the product recovery correctly the yields of produced **3** remain rather high. Earlier it had been suggested that raspberry ketone is degraded into tyrosol in fungi (Fuganti et al., 1996), but tyrosol was not detected in our cell culture samples.

The demand for "natural" raspberry ketone is growing considerably, partly due to recent findings on its favorable properties in weight regulation and skin-lightening (Morimoto et al., 2005; Park, 2010; Lin et al., 2011). Bioconversion is an efficient and 'green' technology to convert various substrates into more valuable or less toxic compounds. In this work we have shown that a wide variety of plant cell cultures can be utilized for bioconversion purposes, allowing production in a contained environment, independent of environmental conditions and free of pesticides and contaminants. In the case of raspberry ketone, a very high-value natural flavor substance can be produced in plant cell cultures by applying 4-hydroxybenzalacetone or betuligenol, both of which are rather cheap and readily available precursors. Accumulation in the extracellular space, as shown in this study, is beneficial for compound recovery. Downstream processing may account for as much as 80% of overall production costs and for this reason less complex compound isolation and purification from the culture medium has a major impact on the total costs of a biotechnological process.

#### AUTHOR CONTRIBUTIONS

SH, TS-L, K-MO-C, HR designed the research; SH and TS-L performed the research; SH, TS-L, HR analyzed data; SH, HR wrote the paper.

#### ACKNOWLEDGMENTS

We thank Airi Hyrkäs, Jaana Rikkinen, and Anna-Liisa Ruskeepää for excellent technical assistance. This work was supported by the Academy of Finland (grant 138808 to HR) and VTT. The authors acknowledge the support of COST Action FA1006 PlantEngine.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01035

# REFERENCES


the alpha, beta-hydrogenation of phenylbutenones in raspberry fruits. *Biochem. Biophys. Res. Commun.* 412, 104–108. doi: 10.1016/j.bbrc.2011.07.052


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Häkkinen, Seppänen-Laakso, Oksman-Caldentey and Rischer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Gene-to-metabolite network for biosynthesis of lignans in MeJA-elicited *Isatis indigotica* hairy root cultures

Ruibing Chen<sup>1†</sup>, Qing Li<sup>2†</sup>, Hexin Tan<sup>1</sup>, Junfeng Chen<sup>2</sup>, Ying Xiao<sup>2</sup>, Ruifang Ma<sup>3</sup>, Shouhong Gao<sup>2</sup>, Philipp Zerbe<sup>4</sup>, Wansheng Chen<sup>2</sup> and Lei Zhang<sup>1</sup>\*

*<sup>1</sup> Department of Pharmaceutical Botany, School of Pharmacy, Second Military Medical University, Shanghai, China, <sup>2</sup> Department of Pharmacy, Shanghai Changzheng Hospital, Second Military Medical University, Shanghai, China, <sup>3</sup> School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Shenyang, China, <sup>4</sup> Department of Plant Biology, University of California, Davis, Davis, CA, USA*

#### *Edited by:*

*Edward Rybicki, University of Cape Town, South Africa*

#### *Reviewed by:*

*Kirsi-Marja Oksman-Caldentey, VTT Technical Research Centre of Finland, Finland Sumit G. Gandhi, CSIR-Indian Institute of Integrative Medicine, India*

#### *\*Correspondence:*

*Lei Zhang leizhang100@163.com*

*† These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 17 May 2015 Accepted: 19 October 2015 Published: 03 November 2015*

#### *Citation:*

*Chen R, Li Q, Tan H, Chen J, Xiao Y, Ma R, Gao S, Zerbe P, Chen W and Zhang L (2015) Gene-to-metabolite network for biosynthesis of lignans in MeJA-elicited Isatis indigotica hairy root cultures. Front. Plant Sci. 6:952. doi: 10.3389/fpls.2015.00952*

Root and leaf tissues of *Isatis indigotica* show notable anti-viral efficacy and are widely used as "Banlangen" and "Daqingye" in traditional Chinese medicine. The plant's pharmacological activity is attributed to phenylpropanoids, especially a group of lignan metabolites. However, the biosynthesis of lignans in *I. indigotica* remains opaque. This study describes the discovery and analysis of biosynthetic genes and AP2/ERF-type transcription factors involved in lignan biosynthesis in *I. indigotica*. MeJA treatment revealed differential expression of three genes involved in phenylpropanoid backbone biosynthesis (*IiPAL*, *IiC4H*, *Ii4CL*), five genes involved in lignan biosynthesis (*IiCAD*, *IiC3H*, *IiCCR*, *IiDIR*, and *IiPLR*), and 112 putative AP2/ERF transcription factors. In addition, four intermediates of lariciresinol biosynthesis were found to be induced. Based on these results, a canonical correlation analysis using Pearson's correlation coefficient was performed to construct gene-to-metabolite networks and identify putative key genes and rate-limiting reactions in lignan biosynthesis. Over-expression of *IiC3H*, identified as a key pathway gene, was used for metabolic engineering of *I. indigotica* hairy roots, and resulted in an increase in lariciresinol production. These findings illustrate the utility of canonical correlation analysis for the discovery and metabolic engineering of key metabolic genes in plants.
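The idea of linking genes to metabolites through Pearson's correlation coefficient can be sketched as follows. This is a simplified pairwise illustration and not the canonical correlation analysis performed by the authors: each gene expression profile is correlated with each metabolite profile across time points, and pairs whose absolute correlation exceeds a threshold become network edges. All names, values and the threshold of 0.8 are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def correlation_edges(genes, metabolites, threshold=0.8):
    """Return (gene, metabolite, r) for pairs with |r| >= threshold."""
    edges = []
    for gene, expression in genes.items():
        for metabolite, level in metabolites.items():
            r = pearson(expression, level)
            if abs(r) >= threshold:
                edges.append((gene, metabolite, round(r, 3)))
    return edges


# Hypothetical profiles over five time points after elicitation.
genes = {
    "gene_A": [1.0, 2.0, 4.0, 8.0, 6.0],
    "gene_B": [5.0, 4.0, 3.0, 3.5, 5.5],
}
metabolites = {"metabolite_X": [0.5, 1.1, 2.0, 4.1, 3.0]}

for gene, metabolite, r in correlation_edges(genes, metabolites):
    print(f"{gene} -- {metabolite}: r = {r}")
```

With so few time points, a high coefficient is only suggestive, which is one reason candidate genes from such a network still need functional testing.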

Keywords: *Isatis indigotica,* AP2/ERF, biosynthesis of lignans, gene-metabolic network, metabolic engineering

#### INTRODUCTION

Isatis indigotica Fortune has been used in traditional Chinese medicine for more than two millennia and is listed in the Chinese Pharmacopoeia (National Pharmacopoeia Committee, 2010). The root and leaves of I. indigotica demonstrate notable anti-viral (Chang et al., 2012), anti-inflammatory (Tang et al., 2014), anti-tumor (Chung et al., 2011), and antianaphylaxis (Recio et al., 2006) activity, and are used in clinical applications as "Banlangen" and "Daqingye," respectively. In previous research, lignans including lariciresinol and larch lignan glycosides were considered the material basis of those activities (Yang et al., 2013). However, the biosynthesis of lignans in I. indigotica is largely unresolved. Transcriptome analysis of I. indigotica (Chen et al., 2013) and the availability of the complete genomes of other lignan-forming plant species (A. thaliana and Chinese cabbage) offer the opportunity to employ bioinformatics tools for better understanding and ultimately modulating lignan metabolism in I. indigotica. In addition, key genes responsible for the biosynthesis of backbone structures of phenylpropanoids, flavonoids, lignans and lignins have been established, including phenylalanine-ammonia lyase (PAL), cinnamate-4-hydroxylase (C4H) and coumaroyl-CoA ligase (4CL) of phenylpropanoid metabolism, chalcone synthase (CHS), flavonol synthase (FNS) and chalcone isomerase (CHI) of flavonoid biosynthesis, and cinnamyl alcohol dehydrogenase (CAD) and cinnamoyl-CoA reductase (CCR) in lignan formation (**Figure 1A**).

Transcription factors (TFs) are essential for the coordination of metabolic pathways involved in plant development and environmental stress responses to, for example, drought, salt stress, high temperature, and other abiotic perturbations (Li et al., 2014a,b; Tavakol et al., 2014). Containing at least one AP2 DNA-binding domain, AP2/ERF transcription factors form an important TF superfamily with roles in biotic and abiotic stress responses (Filiz and Tombuloglu, 2014; Lee et al., 2014). AP2/ERF TFs are divided into four families, ERF, AP2, RAV, and Soloist (Thamilarasan et al., 2014). The ERF family further comprises two subfamilies, ERF and DREB. Members of the AP2 and RAV families each contain two domains: AP2 family members carry two AP2 domains, while RAV family members carry one AP2 domain and one B3 domain (Song et al., 2013; Sun et al., 2014). Despite a high sequence identity, the members of the AP2/ERF family show a large diversity regarding their DNA-binding motifs (Qin et al., 2007; Hong et al., 2009; Wang et al., 2011) and functions (Hong and Kim, 2005; Ito et al., 2006; Fujita et al., 2011; **Table 1**).

Essential roles for AP2/ERF TFs (AP2/ERFs) in the response to abiotic (drought, salt, and temperature) and biotic stress factors have been demonstrated for numerous plant species, such as rice, tobacco, and tomato (Pan et al., 2012; Zhang et al., 2013, 2014; Wu et al., 2014). In addition, some AP2/ERFs, such as ORA47 (Pauwels et al., 2008) in Arabidopsis thaliana and RAV1 (Himi et al., 2011) in wheat, were reported to potentially interact with genes involved in the biosynthesis of lignins and flavonoids. However, a role of AP2/ERFs in lignan biosynthesis has so far not been investigated. Conversely, the role of phytohormones, including methyl jasmonate (MeJA) (Yan et al., 2014), salicylic acid (SA) (D'Maris et al., 2011), and abscisic acid (ABA) (Finkelstein, 2013), in the regulation of phenylpropanoid biosynthetic pathways has been established (Agrawal et al., 2014; Liu et al., 2014). We therefore assume that a "bridge" consisting of AP2/ERFs, phytohormones and biosynthetic genes connects environmental stresses and phenylpropanoid metabolism (**Figure 2**) (Pré et al., 2008; Zhang et al., 2012).

In this study, 112 putative AP2/ERFs were identified in I. indigotica and analyzed using a bioinformatics approach. This included the analysis of physicochemical properties of individual AP2/ERFs and phylogenetic studies comparing AP2/ERF orthologs of I. indigotica, A. thaliana, and B. rapa. Transcript profiling revealed differential expression patterns of select AP2/ERF candidates. In addition, key genes (biosynthetic genes and AP2/ERFs) observed to significantly impact lignan biosynthesis were identified by correlating transcript and metabolite analyses of MeJA-treated tissues. These results enabled the selection of high-probability genes and the downstream metabolic engineering of lignan biosynthesis in I. indigotica hairy roots. Here, over-expression of IiC3H increased lariciresinol production by 4.5-fold.

#### MATERIALS AND METHODS

#### Plant Material

Plants of I. indigotica were grown at university greenhouses (Second Military Medical University, Shanghai, China). Species verification was performed by Professor Hanming Zhang of the School of Pharmacy (Second Military Medical University).

Sterile I. indigotica plants were grown and kept in our greenhouse. To induce hairy roots, sterile leaf sections were submerged in the bacterial suspension for 30 min and then placed on MS medium supplemented with 30% sucrose and 0.8% agar (pH 5.8) at 25 °C in the dark. Cultures were then washed three times with 60 mL sterilized water, blot-dried on sterile filter paper, and transferred after 3 days to 1/2 MS medium (as above) supplemented with 500 mg·L<sup>−1</sup> cefotaxime. After 3 weeks, hairy roots were isolated from leaves and cultivated for 3–4 weeks (25 °C, darkness) on solid 1/2 MS medium (as above), with successive subcultures grown on decreasing cefotaxime concentrations (250, 100, 0 mg·L<sup>−1</sup>). Rapidly growing root cultures lacking bacterial contamination were further used to establish hairy root lines. Approximately 200 mg of normally growing hairy roots were inoculated in 200 mL 1/2 MS liquid medium and grown in 250 mL shaking flasks at 100 rpm and 25 °C in darkness. Clonal hairy root cultures were routinely subcultured every 30 days, treated with MeJA, and harvested after 60 days.

Treatments were designated as follows: (1) 0.5 µM MeJA (Sigma, USA) dissolved in ethanol was added to 200 mL of 1/2 MS liquid medium; (2) the same volume of ethanol was added to the control group. After treatment, the cultures were harvested at 0, 1, 3, 6, 12, 24, and 36 h. Three independent biological replicates were used for each group.

#### Identification of AP2/ERFs

For the identification of candidate AP2/ERF genes, a previously established I. indigotica transcriptome inventory was used (Chen et al., 2013). The assembled transcriptome was queried against 159 known A. thaliana AP2/ERF proteins (AtAP2/ERF) retrieved from the Database of Arabidopsis Transcription Factors (DATF, http://datf.cbi.pku.edu.cn/) and 321 Chinese cabbage AP2/ERF proteins (BraAP2/ERF) obtained from the Brassica Database (BRAD, http://brassicadb.org/brad/) to select AP2/ERF gene candidates (TBLASTN with an E-value cut-off of 10<sup>−5</sup>). After removing sequences with bit scores less than 100 or alignment lengths less than 100 bp, the remaining sequences were screened against the Pfam database (Pfam, http://pfam.janelia.org/) with default parameters to identify AP2/ERF proteins. Finally, as a quality check, the domain composition of the candidates was verified using the Simple Modular Architecture Research Tool (SMART, http://smart.embl-heidelberg.de/).
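The threshold-based filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `Hit` record, the function name `filter_hits`, and the example hits are all hypothetical, and the thresholds are simply the values quoted in the text (E-value of 10<sup>−5</sup>, bit score of 100, alignment length of 100).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One tabular search hit (hypothetical record layout)."""
    query: str        # known AP2/ERF protein used as the query
    subject: str      # transcript ID in the assembled transcriptome
    evalue: float
    bit_score: float
    align_len: int


def filter_hits(hits, max_evalue=1e-5, min_bits=100.0, min_len=100):
    """Return subjects with at least one hit passing all three thresholds."""
    kept = set()
    for h in hits:
        if (h.evalue <= max_evalue
                and h.bit_score >= min_bits
                and h.align_len >= min_len):
            kept.add(h.subject)
    return sorted(kept)


# Made-up hits, for illustration only.
hits = [
    Hit("AtERF1", "unigene_A", 1e-40, 180.0, 210),
    Hit("AtERF1", "unigene_B", 1e-3, 45.0, 60),      # fails all thresholds
    Hit("BraDREB2", "unigene_C", 1e-12, 95.0, 150),  # bit score below 100
    Hit("BraDREB2", "unigene_A", 1e-30, 150.0, 190),
]
candidates = filter_hits(hits)
print(candidates)
```

Collecting subjects in a set means a transcript matched by several query proteins is reported only once, which is the behavior one would want before passing candidates on to domain screening.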

#### Sequence Analysis

The full-length ORF sequences of the 112 putative AP2/ERFs were obtained and converted into amino acid sequences by Vector NTI Advance (TM) 11.5 and MEGA 5.05. Physicochemical properties were calculated using the ProtParam tool (http://web.expasy.org/protparam). Secondary structures of AP2/ERFs were predicted using the Secondary Structure Prediction Method (SOPMA, http://npsa-pbil.ibcp.fr/cgi-bin/npsa\_automat.pl?page=/NPSA/npsa\_sopma.html).

ClustalX 2 was used to accurately identify AP2/ERF domains. Conserved amino acid motifs were identified using Multiple EM for Motif Elicitation (MEME, http://meme.nbcr.net/meme/cgi-bin/meme.cgi) with default settings. IiC3H and other C3Hs obtained from GenBank were aligned and a Neighbor-Joining (NJ) tree was constructed by MEGA 5.05 (http://www.megasoftware.net/).

#### Transcript Abundance of AP2/ERFs in *I. indigotica* Hairy Roots Treated with MeJA

To gain insight into the transcript abundance of AP2/ERFs induced by MeJA in I. indigotica, the Illumina RNA-Seq data from previous research were utilized (Chen et al., 2013). The RNA-Seq expression profile data were generated using the Illumina HiSeq™ 2000 platform and included the hairy roots of I. indigotica treated with MeJA at 0, 1, 3, 6, 12, and 24 h. The zero-hour sample was used as the control to normalize expression level data in MultiExperiment Viewer (Saeed et al., 2003).
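As a rough sketch of this normalization, the snippet below expresses each time point as a natural-log fold change relative to the 0 h control, which is the scale stated in the caption of Figure 4. The function name `ln_fold_change` and the expression values are hypothetical, and treating non-positive values as undefined is an assumption made here, not a documented step of the original analysis.

```python
import math


def ln_fold_change(series, control_time=0):
    """Return ln(value / control) per time point; None where undefined."""
    control = series[control_time]
    result = {}
    for t, value in series.items():
        if t == control_time:
            continue
        if control > 0 and value > 0:
            result[t] = math.log(value / control)
        else:
            # Undetectable signal: no ratio can be formed (assumption).
            result[t] = None
    return result


# Hypothetical expression values keyed by hours after MeJA treatment.
example = {0: 10.0, 1: 20.0, 3: 40.0, 6: 10.0, 12: 5.0, 24: 0.0}
lfc = ln_fold_change(example)
print(lfc)
```

Positive values indicate induction relative to 0 h, negative values indicate repression, and `None` marks time points that would have to be displayed separately in a heat map.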

#### Phylogenetic Analysis of AP2/ERFs

Amino acid sequences of AP2/ERF proteins were aligned with Clustal W, and phylogenetic trees were constructed in MEGA 5.05 using the NJ method with the pairwise deletion option. Reliability of the tree was estimated using a bootstrap analysis with 1000 replicates. Based on the original dataset, bootstrap values above 50% were added to the tree branches. The AP2/ERFs were searched for duplication events (E-value < 1e−10, identity > 90%) in I. indigotica.
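The identity criterion used in the duplication search can be illustrated with a short sketch. It assumes that the two sequences are already aligned and that identity is computed over columns without gaps; the text does not state which definition was used, so this choice is an assumption. The sequences and the function name `percent_identity` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap (assumption)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned peptide fragments.
pid = percent_identity("MKRWLGTF-E", "MKRWLGSFAE")
is_candidate_duplicate = pid > 90.0  # identity threshold quoted in the text
print(round(pid, 2), is_candidate_duplicate)
```

In practice the E-value condition would be checked alongside the identity threshold; only the identity part is shown here.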

#### Quantitative Real-time PCR

High quality total RNA (1 µg) was used to prepare first-strand cDNA using the TransScript First-Strand cDNA Synthesis SuperMix kit (TransGen Biotech, Beijing, China) following the manufacturer's protocol.

Quantitative real-time PCR (qRT-PCR) was performed according to the manufacturer's instructions using a TP8000 Real-time PCR detection system and the SYBR premix Ex Taq kit (TAKARA, Japan) with the following PCR program: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s, 53 °C for 10 s, and 72 °C for 20 s. All PCR reactions consisted of three technical replicates. Transcript abundance of each gene was normalized to ubiquitin with the comparative C<sub>t</sub> method (Livak and Schmittgen, 2001; Udvardi et al., 2008). Oligonucleotides used in this study are given in **Table S1**. Three independent biological replicates for each sample and three technical replicates for each biological replicate were analyzed.
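For orientation, the comparative C<sub>t</sub> method of Livak and Schmittgen (2001) computes relative expression as 2<sup>−ΔΔCt</sup>. The sketch below shows the arithmetic with hypothetical C<sub>t</sub> values; the function and argument names are illustrative, and averaging over technical and biological replicates is omitted for brevity.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    ct_target / ct_ref: Ct of the gene of interest and of the reference
    gene (ubiquitin in the text) in the treated sample.
    ct_target_ctrl / ct_ref_ctrl: the same two Ct values in the control.
    """
    delta_sample = ct_target - ct_ref
    delta_control = ct_target_ctrl - ct_ref_ctrl
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target amplifies two cycles earlier after
# treatment while the reference gene is unchanged.
fold = relative_expression(ct_target=22.0, ct_ref=18.0,
                           ct_target_ctrl=24.0, ct_ref_ctrl=18.0)
print(fold)
```

The formula assumes amplification efficiencies close to 100% for both target and reference genes, which is the standard premise of this method.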

#### Metabolite Analysis

Dried hairy roots (50 mg) were ground into a fine powder and extracted twice with 25 mL of 80% methanol under sonication for 30 min. After centrifugation, the supernatant was diluted with 80% methanol to a total volume of 50 mL and filtered through a 0.22 µm organic membrane filter prior to HPLC analysis. HPLC analysis was conducted on an Agilent 1200 series instrument with an Agilent 6410 triple-quadrupole mass spectrometer and an electrospray ionization source (Agilent Corporation, MA, USA). Metabolite separation was achieved on an Agilent ZORBAX SB-C18 column (3.5 µm, 2.1 × 150 mm) and an Agilent C18 guard column (5 µm, 4.0 × 2.0 mm). The mobile phase was acetonitrile: 5 mM ammonium acetate solution (the concentration of acetonitrile was from 5 to 95% in 1.0 min, v/v) with a flow rate of 0.3 mL·min<sup>−1</sup> and a total run time of 5 min. Metabolite identification and quantification were achieved in multiple reaction monitoring (MRM) mode. Characteristic m/z ions are listed in **Table S2**. The same samples were used for qRT-PCR and metabolite analysis.

#### Integration of Transcript and Metabolite Analyses

Correlation analysis integrating transcript and metabolite data of control and MeJA-induced hairy root cultures was performed by canonical correlation analysis using Pearson's correlation coefficient (Xiao et al., 2009). Gene-to-metabolite, TF-to-gene and TF-to-metabolite networks were visualized to identify probable key genes in lignan biosynthesis.
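The building block of this analysis, Pearson's correlation coefficient between a transcript time series and a metabolite time series, can be sketched as follows. This shows only the pairwise coefficient, not the full canonical correlation analysis, and the two series are hypothetical values chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical fold changes of one transcript and one metabolite
# measured at the same four time points.
transcript = [1.0, 2.0, 3.0, 4.0]
metabolite = [2.0, 4.0, 6.0, 8.0]
r = pearson(transcript, metabolite)
print(r)
```

A canonical correlation analysis extends this idea by finding linear combinations of all gene variables and all metabolite variables that are maximally correlated with each other.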

#### Plant Transformation and Growth of Hairy Root Culture

The full-length IiC3H was inserted into vector pCAMBIA1304 to obtain pCAMBIA1304-IiC3H. Sterile I. indigotica plants were grown and kept in our greenhouse. The disarmed A. tumefaciens strain C58C1 harboring both the A. rhizogenes Ri plasmid pRiA4 (Kai et al., 2009) and plasmid constructed above was used for plant genetic transformation.

Transgenic hairy root cultures were grown following the procedure described under Plant Material, except that hygromycin (10 mg·L<sup>−1</sup>) was added together with cefotaxime. Rapidly growing root cultures showing hygromycin resistance and lacking bacterial contamination were further used to establish hairy root lines. Approximately 200 mg of normally growing hairy roots were inoculated in 200 mL 1/2 MS liquid medium and grown in 250 mL shaking flasks at 100 rpm and 25 °C in darkness. Clonal hairy root cultures were routinely subcultured every 30 days and harvested after 60 days.

#### PCR Analysis of Hairy Root Culture

Genomic DNA was isolated from hairy root samples using the cetyltrimethylammonium bromide (CTAB) method (Doyle and Doyle, 1990). The DNA was then used in PCR analysis to detect the presence of the specific genes in transgenic lines. Primer sequences for amplifying these genes (designed to span the gene sequence and the vector sequence so that exogenous gene transformations could be detected) are listed in **Table S1**. The selectable marker hygromycin resistance gene hph was used to check for pCAMBIA1304 vector transformants, whereas the Agrobacterium genes rolB and rolC were used to check for transformation with pRiA4 (Chilton et al., 1982). The PCR program was as follows: 94 °C for 3 min, followed by 35 cycles of amplification (94 °C for 10 s, 58 °C for 30 s, 72 °C for 1 min), with a final extension at 72 °C for 5 min.

#### Statistical Analysis

Statistical analysis was performed with SPSS 13.0 software. Analysis of variance (ANOVA) was followed by Tukey's pairwise comparison tests, at a level of p < 0.01, to determine significant differences between means.

#### RESULTS

#### Analysis of AP2/ERFs in *I. indigotica*

#### Identification of AP2/ERFs in the I. indigotica Transcriptome

A total of 112 putative AP2/ERFs, designated Ii001 to Ii112, were obtained through query of a previously established I. indigotica transcriptome inventory (Chen et al., 2013) against public AP2/ERF and AP2/ERF-like protein sequences of A. thaliana and B. rapa by TBLASTN (Basic Local Alignment Search Tool 2.2.26) (**Table S3**). The best-hit homologs of these sequences in A. thaliana and B. rapa are summarized in **Table S3**, and the AP2/ERF proteins were subsequently categorized by domain type (http://pfam.sanger.ac.uk/). A total of 42 ERF, 45 DREB, 20 AP2, 3 RAV, and 2 Soloist gene candidates were identified, all of which contained characteristic domain features (SMART, http://smart.embl-heidelberg.de/) (**Table S3**).

#### Sequence Analysis

Sequence analysis of the 112 identified AP2/ERFs revealed ORF lengths ranging from 92 aa (Ii015) to 565 aa (Ii037), and molecular masses varied from ∼10.29 (Ii015) to 625.11 kDa (Ii037) (**Table S4**). These differences result, in part, from incomplete sequencing. The predicted pI values ranged from 4.42 (Ii036) to 11.58 (Ii069), and instability indices varied between 23.98 (Ii055) and 81.28 (Ii015) with an average value of 54.66. Aliphatic indices ranged from 44.33 (Ii111) to 82.48 (Ii069), averaging 62.54, and hydrophobicity values of all the AP2/ERF proteins were below zero, ranging from −0.079 (Ii112) to −1.17 (Ii097). Secondary structure prediction indicated predominantly random coils (53.28%), followed by α-helices (28.02%), extended strands (13.68%), and β-turns (5.01%) (**Table S5**).

Computational prediction of the subcellular localization (WoLF PSORT; http://www.genscript.com/psort/wolf\_psort.html) placed the majority of the identified AP2/ERFs in the nucleus, with a few gene candidates showing a possible localization in mitochondria, Golgi apparatus, cytoplasm and chloroplasts (**Table S6**). With the exception of Ii095 (for which a 19 aa N-terminal transit peptide was predicted), no signal peptides were observed in the identified AP2/ERF candidates using the NetNGlyc 1.0 server (http://www.cbs.dtu.dk/services/NetNGlyc/).
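Two of the indices reported above have simple closed forms and can be reproduced without any external tool. The sketch below computes the grand average of hydropathy (GRAVY) from the Kyte-Doolittle scale and the aliphatic index as defined by Ikai (Ala + 2.9 × Val + 3.9 × (Ile + Leu), in mole percent). It is assumed here that the reported hydrophobicity values correspond to GRAVY; the peptide is a made-up example, not one of the Ii sequences.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical hydrophilic peptide, for illustration only.
peptide = "MKRSEEDKRA"
g = gravy(peptide)
ai = aliphatic_index(peptide)
print(round(g, 2), round(ai, 2))
```

A negative GRAVY value, as obtained for all 112 proteins in the text, indicates an overall hydrophilic sequence.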

#### Phylogenetic Analysis of the I. indigotica AP2/ERF Superfamily

To gain a detailed understanding of evolutionary interrelations and the topological structure of the I. indigotica AP2/ERF protein family, a neighbor-joining phylogenetic tree was constructed (**Figure 3**), which contained the DREB and ERF subfamilies, and the AP2, RAV and Soloist families that were further divided into 14 clades (without DREB-A3). Groups I–VI represented the ERF subfamily, groups VII–XI the DREB subfamily, and groups XII, XIII, and XIV comprised the AP2, RAV, and Soloist families, respectively. The DREB subfamily comprised the largest number of members, followed by the ERF, AP2, RAV, and Soloist families.

Duplication events have previously been reported in grape and Chinese cabbage, where 17 and 15 proteins with highly similar sequences (>95% sequence similarity) were found, respectively (Song et al., 2013). Similarly, this study identified 19 presumably duplicated genes in I. indigotica sharing >95% sequence similarity. Among these genes, 11 were classified as DREB subfamily genes, while the remaining eight genes were annotated as AP2 proteins (**Table 2**).

To comprehensively analyze the evolutionary diversification of the I. indigotica AP2/ERF superfamily, an additional phylogenetic tree was generated that compared all 112 identified AP2/ERF proteins of I. indigotica, 289 proteins of B. rapa, and 148 proteins of A. thaliana, inclusive of the DREB and ERF subfamilies, as well as AP2, RAV, and Soloist families that were further divided into 15 subgroups (**Figure S1**). The generated tree illustrated the ERF family (ERF and DREB subfamilies) and the Soloist family as the largest and smallest clusters, respectively. Notably, the ERF subfamily comprised two separate subgroups, which, in turn, were divided into six (B1–B6) and two (B1 and B6) clusters, respectively. This result may indicate a more expansive evolutionary divergence of the B1–B6 groups.

To further clarify the relationships among AP2/ERF proteins in I. indigotica, multiple alignment analyses of characteristic AP2/ERF domains were performed for every subfamily. Overall, all proteins showed high sequence similarity and distinct family-specific domain features. All members of the DREB subfamily and most ERF proteins contained a WLG element. In addition, the majority of DREB proteins harbored an EIR element. Most AP2 proteins contained two AP2 domains, with the exception of 10 proteins that lacked the second AP2 domain. The latter proteins likely represent partial genes obtained through the transcriptome analysis. Similarly, all members of the RAV subfamily contained one AP2 domain and a B3 domain, except for three proteins that lacked the B3 domain and likely represent partial sequences. In addition, a subset of AP2 proteins contained YRG and YLG motifs.

#### MeJA-induced Changes in Lignan Biosynthesis

MeJA treatment of I. indigotica hairy roots cultures was employed to investigate changes in the biosynthesis of lignans (**Figure 1A**). The obtained results illustrated clear MeJA-inducibility of lignan biosynthesis both at the gene expression and metabolite accumulation level.

#### Transcript Profiling Demonstrates MeJA-induced Changes of AP2/ERFs

Potential functions of the 112 putative AP2/ERFs were analyzed using Illumina RNAseq-based gene expression profiling in MeJA-treated I. indigotica hairy roots harvested 0, 1, 3, 6, 12, and 24 h post treatment and compared to non-treated samples. Changes in gene expression levels of AP2/ERFs inducible by MeJA are illustrated as a heat map (**Figure 4**). Of the 112 genes, 27 TFs were excluded from the study. Among the remaining genes, 13 TFs were up-regulated at 1, 3, 6, 12, and 24 h compared with 0 h, while 30 TFs were down-regulated. The remaining 42 TFs were up- or down-regulated at only individual time points. Notably, Ii04 and Ii078 were most highly up-regulated with 8.2- and 7.5-fold, respectively. Conversely, Ii014 and Ii068 were most highly down-regulated with 5.7- and 6.5-fold, respectively.
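The three-way grouping described above (up-regulated at all time points, down-regulated at all time points, or changed only at individual time points) can be sketched as a simple sign test on log fold changes. The exact criteria used in the study are not stated, so the rule below is an assumption, and the gene names and profiles are hypothetical.

```python
from collections import Counter


def classify_trend(ln_fc):
    """Label a gene by the sign of its ln fold change at every time point."""
    values = list(ln_fc.values())
    if all(v > 0 for v in values):
        return "up"
    if all(v < 0 for v in values):
        return "down"
    return "mixed"


# Hypothetical ln fold changes (vs. 0 h) keyed by hours after treatment.
profiles = {
    "TF_a": {1: 0.4, 3: 1.1, 6: 2.0, 12: 1.2, 24: 0.3},
    "TF_b": {1: -0.2, 3: -0.9, 6: -1.5, 12: -1.0, 24: -0.4},
    "TF_c": {1: 0.5, 3: -0.1, 6: 0.0, 12: 0.8, 24: -0.3},
}
labels = {gene: classify_trend(p) for gene, p in profiles.items()}
summary = Counter(labels.values())
print(labels, dict(summary))
```

Applied to real data, a tally of this kind would yield the three group sizes reported in the text, provided the same thresholds were used.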

#### Verification of AP2/ERFs by qRT-PCR

To confirm the gene expression results obtained via RNAseq, eight AP2/ERFs were randomly chosen for additional qRT-PCR analysis. These genes comprised six up-regulated and two down-regulated genes upon MeJA treatment. As depicted in **Figure S2**, gene expression levels were comparable between RNA-seq- and qRT-PCR-derived results, supporting the reliability of gene expression levels obtained by Illumina transcriptome sequencing.

#### MeJA-induced Changes of Biosynthetic Genes in the Hairy Root Transcriptome

MeJA treatment of I. indigotica hairy root tissue significantly increased expression levels of genes with proposed functions in lignan biosynthesis. Transcript abundance of IiPAL, Ii4CL, IiC4H, IiC3H, IiCAD, IiCCR, IiPLR, and IiDIR was observed to be gradually induced; the corresponding sequences are listed in **Table S7**.



Interestingly, Ii4CL and IiPLR were most abundant at 12 h post treatment, while other transcripts showed the highest abundance at 6 h. Compared to the 0 h time point, up-regulation ranged from 2.7-fold (IiPAL, IiPLR) to 11.5-fold (IiC3H), with 3.0-fold for Ii4CL, 3.9-fold for IiC4H, 4.9-fold for IiCAD, 6.0-fold for IiCCR, and 6.1-fold for IiDIR (**Figure 1B**).

#### MeJA-induced Changes in the I. indigotica Hairy Root Metabolite Profile

Accumulation of four compounds (coniferin, lariciresinol, secoisolariciresinol, and pinoresinol), key metabolites in the biosynthesis of lignans, was enhanced by MeJA treatment, but to different degrees. Coniferin reached its highest accumulation, a 2.1-fold increase, after 24 h. The remaining metabolites reached their highest abundance after 12 h, with 3.5-, 3.0- and 4.1-fold increases, respectively (**Figure 1C**).

#### Integration of Transcript and Metabolite Abundance Analyses

A canonical correlation analysis using Pearson's correlation coefficient was performed to identify possible correlations between the transcript profiles of the 112 IiAP2/ERFs and eight biosynthetic genes, and the four investigated metabolites.

As illustrated in **Figure 5A**, the first pair of canonical correlation variables (U and V) revealed a clear correlation between gene transcripts and target metabolites with a canonical correlation coefficient of 0.968. Detailed results of the complete correlation coefficients between raw variables (gene or metabolite) and canonical correlation variables (U or V) are listed in **Tables S8**, **S9**. To further investigate the gene-to-metabolite correlation structure, variable correlation coefficient cut-off values of 0.5 were applied. For example, the variable correlation coefficients showing the significance of correlations between Ii4CL transcript levels and accumulation of four metabolites (coniferin, lariciresinol, secoisolariciresinol and pinoresinol) were −0.23, 0.75, 0.41, and 0.60, respectively. These findings indicated that Ii4CL as a gene involved in the upstream biosynthetic pathway is correlated with lariciresinol and pinoresinol, but not or minimally correlated with coniferin and secoisolariciresinol.
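The Ii4CL example can be reproduced by applying the 0.5 cut-off to the four coefficients quoted above. The sketch compares the absolute value of each coefficient with the cut-off; using the absolute value is an assumption made here so that strong negative correlations would also be retained, and the function name is illustrative.

```python
def correlated_metabolites(coefficients, cutoff=0.5):
    """Return metabolites whose |coefficient| exceeds the cut-off."""
    return [name for name, r in coefficients.items() if abs(r) > cutoff]


# Variable correlation coefficients for Ii4CL as reported in the text.
ii4cl = {
    "coniferin": -0.23,
    "lariciresinol": 0.75,
    "secoisolariciresinol": 0.41,
    "pinoresinol": 0.60,
}
selected = correlated_metabolites(ii4cl)
print(selected)
```

Repeating this selection for every gene yields the edges of a gene-to-metabolite network of the kind shown in Figure 5A.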

Additional correlation analyses among TFs, biosynthetic genes and pathway intermediates that demonstrated a high average variable correlation coefficient were established in the same manner (**Tables S8**, **S9**). In summary, the performed study resulted in the below observations:


#### Metabolic Engineering with *IiC3H* Overexpression in *I. indigotica* Hairy Root Cultures

Based on its proposed function in lignan biosynthesis, IiC3H was chosen for metabolic pathway engineering toward increased lariciresinol production in hairy root cultures. IiC3H (JF826963) represents a 1527 bp ORF encoding a predicted 509 amino acid protein. IiC3H contains the characteristic P450 domains, and BLAST analysis showed highest similarity to known coumarate 3-hydroxylases from other plant species, including A. thaliana AtC3H (NP\_850337), Populus alba PaC3H (ABY85195), Eucalyptus globulus EgC3H (ADG08112), Populus trichocarpa PtC3H (XP\_002308860), and Ricinus communis RcC3H (XP\_002526203) (**Figure 6A**).

FIGURE 4 | Cluster analysis of the differentially expressed AP2/ERF genes identified in *I. indigotica*. Hairy roots of *I. indigotica* were treated with MeJA for 0, 1, 3, 6, 12, and 24 h and transcript abundance was measured via Illumina RNAseq analysis. The 0 h time point was used as control. Fold-change differences in transcript abundance are illustrated as a heat map on a natural log scale (treatment/control). Samples with non-detectable signals are depicted in gray.

hairy roots. Genes are depicted as squares on the left, and metabolites on the right. The canonical correlation coefficient between two canonical correlation variables (U and V) was 0.968. The correlation coefficient between raw variables (genes and metabolites) and canonical correlation variables (U and V) is illustrated as corresponding dotted lines. Numbers associated with lines represent the variable correlation coefficient, and the gene color illustrates the level of gene-to-metabolite correlation: edges depict variable correlation coefficients of >0.50 and blue represents a higher correlation to lariciresinol. (B) List of 19 AP2/ERFs with possible roles in regulating the accumulation of lignans. The left oval shows the result of canonical correlation analysis between AP2/ERFs and pathway metabolites, and the right oval shows the result of the analysis between AP2/ERFs and biosynthetic genes. Different families are color coded. In particular, four common AP2/ERFs (*Ii*080, 007, 049, and 050) show high probability for functions regulating the biosynthesis of lignans.

A neighbor-joining phylogenetic tree showed close relatedness of IiC3H and AtF3H, forming a separate cluster from other known plant C3H enzymes (**Figure 6B**). This suggests a possible functional relatedness of both proteins and highlights an expansive evolutionary diversification of the C3H family from a common P450 ancestor.

Tissue-specific gene expression analysis of IiC3H in roots, stems, leaves, and flowers of I. indigotica using qRT-PCR revealed that IiC3H was expressed predominantly in roots and stems (**Figure 7A**), which is consistent with previous studies demonstrating roots as the main organ for the synthesis and accumulation of lariciresinol (Chen et al., 2013).

To increase lariciresinol biosynthesis, transgenic I. indigotica hairy root lines over-expressing IiC3H were established. Here, the full-length ORF of IiC3H was inserted into the NcoI and SpeI sites of the pCAMBIA1304 expression vector (**Figure 8A**). Cultures of I. indigotica hairy roots were cultivated from seeds and transformed using Agrobacterium tumefaciens C58C1 (**Figures 8B–G**). Presence of pCAMBIA1304-IiC3H in transformed hairy roots was verified via PCR analysis (**Figure 8H**). In six hairy root lines (C1-C6), expression of IiC3H was significantly up-regulated by 4.14-, 1.02-, 1.19-, 1.22-, 1.46-, and 2.21-fold compared to the control (CK), respectively (**Figure 7B**). At the same time, lariciresinol formation was increased by 4.45-, 0.72-, 1.25-, 3.5-, 4.1-, and 3.9-fold compared to the control in lines C1-C6, respectively. Using this approach, lariciresinol yields were increased from 23.8 to 96.4 mg·g<sup>−1</sup> (**Figure 7C**), highlighting the important role of IiC3H in the biosynthesis of lariciresinol and its utility for metabolic pathway engineering.

#### DISCUSSION

Through advances in whole-genome sequencing of model plants and high-throughput functional gene annotation, systems biology and gene and/or metabolite network analyses have become increasingly powerful tools to elucidate the biosynthesis and regulation of plant secondary metabolism.

ERF proteins are known to play significant roles in signaling pathways in environmental interactions and the response to biotic and abiotic stress, as demonstrated through in vivo transgenic approaches in A. thaliana and many crop plants, such as rice (Giuntoli et al., 2014), tobacco (Zhu et al., 2014), and tomato (Klay et al., 2014). Yang and coworkers reported that AtERF073 (AT1G72360) modulated ethylene responses during hypoxia in A. thaliana (Yang et al., 2011). Ii054 showed high homology to AtERF073, suggesting a similar role in the response to hypoxia in I. indigotica. Furthermore, high homology of Ii109 with AtERF53 (AT2G20880), CaMV35S-controlled over-expression of which resulted in an unstable drought-tolerant phenotype in transgenic plants, may support a related functionality in drought tolerance (Cheng et al., 2012). The DREB family represented the largest AP2/ERF subfamily in I. indigotica. DREB proteins have frequently been used as viable candidates for enhancing crop abiotic stress tolerance (Gupta et al., 2014). Within this group, Ii028 was closely related to AtDREB1A (AT4G25480) of A. thaliana, which is involved in the response to heat stress (Hong et al., 2009). Similarly, AtDREB19 (At2g38340) and Ii086 are phylogenetically related and may have a similar functionality in enhancing tolerance to high salinity and drought stress (Krishnaswamy et al., 2011).

Members of the AP2 family have been associated with the shape and development of plant organs. For example, three A. thaliana mutants (ap2-5, ap2-6, and ap2-7) exhibited morphological changes of perianth organs (Kunst et al., 1989). Another member of the AP2 family, CRL5, impacted sepal abscission (Yan et al., 2012), plant height (NsAP2) (Luo et al., 2012), and leaf shape (Jiang et al., 2012) in Brassica napus, water lily, and maize.

With respect to the RAV family, recent research on overexpressing A. thaliana RAV1 suggested a role closely associated with leaf maturation and senescence (Woo et al., 2010). Similar roles related to plant senescence can be hypothesized for the members of the RAV family in I. indigotica, such as Ii051 and Ii052.

Although functional knowledge of the Soloist family is presently limited, the A. thaliana Soloist protein At4g13040 was shown to be a positive regulator of SA accumulation and basal defense against bacterial pathogens (Giri et al., 2014). Two homologous genes (Ii049 and Ii050) with possibly related activities were identified in I. indigotica.

As illustrated in **Figure 9**, studies in the model plant A. thaliana showed that MeJA-mediated stress responses (typically entailing modulation of different secondary metabolic pathways) proceed via two different but closely connected waves (Pauwels et al., 2008). In the first wave, MeJA induces the expression of select JA-biosynthetic genes. In the second, MeJA induces phenylpropanoid metabolism and other secondary metabolic pathways. Eight different groups of TFs, comprising members of the JAZ/TIFY, AP2/ERF, WRKY, bHLH, MYB, NAC, and C2H2 Zn finger families, were found to be enhanced after MeJA treatment in the first wave. AP2/ERF TFs, as one of the major groups of TFs together with MYB and bHLH proteins, have important functions in biological processes such as stress response and control of secondary metabolism (Dietz et al., 2010; Pires and Dolan, 2010; Rushton et al., 2010). Therefore, it appeared plausible that TFs belonging to these groups would play key roles in stress-induced lignan biosynthesis in I. indigotica.

Recent studies showed that expression of PLOX3: fLUC (a key enzyme in JA biosynthesis) was increased more than threefold when the transcriptional activators ORA47 (an AP2/ERF protein) and MYC2 (a bHLH protein) were over-produced. This over-expression was also accompanied by an induction of phenylpropanoid metabolism in the second wave. In contrast, expression of genes involved in transcriptional regulation was induced in the early wave. Both ORA47 and MYC2 functioned as positive activators in JA formation, but the underlying mechanism has not been resolved. In wheat and rice, the RAV1 (an AP2/ERF TF) binding site was found in the promoter region of F3H (involved in flavonoid biosynthesis) (Himi et al., 2011). Therefore, AP2/ERFs are capable of coordinating phenylpropanoid metabolism directly through controlling gene expression of biosynthetic genes such as F3H, or indirectly through interaction with other signaling pathways, such as JA biosynthesis.

transcriptional regulation (solid lines), and incompletely characterized metabolic or signaling pathways (dashed lines) are highlighted.

Based on these previous findings, we set out to investigate whether putative AP2/ERFs in I. indigotica, such as ORA47 or RAV1, could play key roles in regulating gene expression and metabolite formation in the biosynthesis of lignans. To address this question, we performed a canonical correlation analysis of AP2/ERFs, lignan biosynthetic genes and pathway metabolites identified to be differentially regulated in I. indigotica hairy roots following MeJA treatment.

For this purpose, transcriptome and metabolite analyses were combined to discover key genes involved in lariciresinol biosynthesis in I. indigotica, an important medicinal plant. This study identified eight putative genes, and IiC3H was chosen as an example. Over-expression of IiC3H was successfully employed to increase lariciresinol biosynthesis in transgenic hairy root cultures. In addition, four putative AP2/ERFs (Ii080, 007, 049, 050) were identified that are likely to be involved in the regulation of lignan biosynthesis through interaction with pathway genes (similar to RAV1 in wheat) or via interaction with other signaling pathways (similar to ORA47 in A. thaliana).

# AUTHOR CONTRIBUTIONS

The study was conceived by RC, WC, and LZ. RC and QL collected the public dataset of A. thaliana and B. rapa. RC, JC, and RM contributed to data analysis, bioinformatics analysis, and manuscript preparation. SG, YX, and QL analyzed the accumulation of compounds through HPLC-MS/MS. RC, QL, and HT participated in planning of analyses and revising the manuscript. All authors have read and approved the final version of the manuscript.

#### ACKNOWLEDGMENTS

The authors greatly appreciated Dr. Han-Ming Zhang for his helpful discussions and reviews of this paper. This work was financially supported by the National Natural Science Foundation of China (Grant No. 31160059, 81303160, 31300159, 81325024, and U1405215); "Pujiang Talent" program (13PJ1411000), Shanghai Science and Technology Development Funds (14QB1402700), and program 15391900500 from Science and Technology Commission of Shanghai Municipality.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00952

Figure S1 | AP2/ERF transcription factor comparisons across different species.

Figure S2 | Quantitative real-time PCR analysis of eight select AP2/ERF transcripts. Validation of the gene expression levels obtained via RNAseq was achieved by quantitative real-time PCR (qRT-PCR) analysis of eight randomly chosen AP2/ERF genes observed to be inducible by MeJA treatment, *n* = 3.

Table S1 | Oligonucleotides used in this study.

Table S2 | Optimized MRM parameters for coniferin, lariciresinol, secoisolariciresinol, and pinoresinol.

Table S3 | Putative homologous genes of 112 AP2/ERF sequences compared to *A. thaliana* and *B. rapa*.

Table S4 | Chemical and physical characteristics of 112 AP2/ERF proteins in *I. indigotica*.

Table S5 | Secondary structure prediction of AP2/ERF proteins.

Table S6 | Prediction of subcellular localization.

Table S7 | Sequences corresponding to the lariciresinol biosynthetic genes of *I. indigotica*.

Table S8 | Canonical correlation analysis.

Table S9 | Analysis of correlation coefficient.

Table S10 | Highest correlation among three compounds, three genes and five TFs.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Chen, Li, Tan, Chen, Xiao, Ma, Gao, Zerbe, Chen and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

#### *Edited by:*

*Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *Reviewed by:*

*Wusirika Ramakrishna, Central University of Punjab, India Ravshan Burikhanov, University of Kentucky, USA Claudia Consales, Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Italy*

#### *\*Correspondence:*

*Nrisingha Dey, Department of Gene Function and Regulation, Institute of Life Sciences, Department of Biotechnology, Government of India, Nalco Square, Chandrasekharpur, Bhubaneswar, Odisha-751 023, India nrisinghad@gmail.com; Indu B. Maiti, Kentucky Tobacco Research & Development Center, Plant Genetic Engineering Research and Services, College of Agriculture, Food and Environment, University of Kentucky, Lexington, KY 40546, USA imaiti@uky.edu*

*†These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 15 April 2015 Accepted: 22 September 2015 Published: 07 October 2015*

#### *Citation:*

*Sarkar S, Jain S, Rai V, Sahoo DK, Raha S, Suklabaidya S, Senapati S, Rangnekar VM, Maiti IB and Dey N (2015) Plant-derived SAC domain of PAR-4 (Prostate Apoptosis Response 4) exhibits growth inhibitory effects in prostate cancer cells. Front. Plant Sci. 6:822. doi: 10.3389/fpls.2015.00822*

# Plant-derived SAC domain of PAR-4 (Prostate Apoptosis Response 4) exhibits growth inhibitory effects in prostate cancer cells

#### *Shayan Sarkar1†, Sumeet Jain2,3†, Vineeta Rai1, Dipak K. Sahoo4,5, Sumita Raha6, Sujit Suklabaidya2, Shantibhusan Senapati2, Vivek M. Rangnekar7, Indu B. Maiti4\* and Nrisingha Dey1\**

*<sup>1</sup> Department of Gene Function and Regulation, Institute of Life Sciences, Department of Biotechnology, Government of India, Bhubaneswar, India, <sup>2</sup> Department of Translational Research and Technology Development, Institute of Life Sciences, Department of Biotechnology, Government of India, Bhubaneswar, India, <sup>3</sup> Manipal University, Manipal, India, <sup>4</sup> Kentucky Tobacco Research & Development Center, Plant Genetic Engineering Research and Services, College of Agriculture, Food and Environment, University of Kentucky, Lexington, KY, USA, <sup>5</sup> Department of Agronomy, Iowa State University, Ames, IA, USA, <sup>6</sup> Department of Radiation Oncology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA, <sup>7</sup> Department of Radiation Medicine, Markey Cancer Center, University of Kentucky, Lexington, KY, USA*

The gene *Par-4* (Prostate Apoptosis Response 4) was originally identified in prostate cancer cells undergoing apoptosis, and its product Par-4 showed cancer-specific pro-apoptotic activity. In particular, the SAC domain of Par-4 (SAC-Par-4) selectively kills cancer cells, leaving normal cells unaffected. The therapeutic significance of bioactive SAC-Par-4 is enormous in cancer biology; however, its large-scale production is still a matter of concern. Here we report the production of a SAC-Par-4-GFP fusion protein coupled to a translational enhancer sequence (5′-AMV) and an apoplast signal peptide (aTP) in transgenic *Nicotiana tabacum* cv. Samsun NN plants under the control of a unique recombinant promoter, M24. Transgene integration and expression were confirmed by genomic DNA PCR, Southern and Northern blotting, Real-time PCR, and Nuclear run-on assays. Results of Western blot analysis and ELISA confirmed expression of recombinant SAC-Par-4-GFP protein, which reached as high as 0.15% of total soluble protein. In addition, we found that targeting of plant recombinant SAC-Par-4-GFP to the apoplast and endoplasmic reticulum (ER) was essential for the stability of the plant recombinant protein in comparison to the bacterial-derived SAC-Par-4. Deglycosylation analysis demonstrated that ER-targeted SAC-Par-4-GFP-SEKDEL undergoes O-linked glycosylation, unlike apoplast-targeted SAC-Par-4-GFP. Furthermore, various *in vitro* studies, including a mammalian cell proliferation assay (MTT), apoptosis induction assays, and NF-κB suppression, suggested cytotoxic and apoptotic properties of plant-derived SAC-Par-4-GFP against multiple prostate cancer cell lines. Additionally, pre-treatment of MAT-LyLu prostate cancer cells with purified SAC-Par-4-GFP significantly delayed the onset of tumor in a syngeneic rat prostate cancer model. Taken together, we propose that plant-made SAC-Par-4 may become a useful alternative therapy for cancer.

Keywords: fusion protein, SAC domain of Par-4, transgenic plant, glycosylation, molecular farming, apoptosis

# Introduction

Prostate cancer is the second most common cancer and the sixth leading cause of cancer death among men worldwide (Ferlay et al., 2010; Siegel et al., 2014). Prostate cancer is associated with the inability of prostatic epithelial cells to undergo apoptosis rather than with enhanced cell proliferation. Spontaneous metastasis of tumors from the primary site to distant tissues causes mortality in advanced cancer patients (Cooperberg et al., 2009).

Prostate apoptosis response-4 (Par-4) protein (340 amino acids) is capable of promoting apoptosis in cancer cells and causes regression of tumors in animal models (Gurumurthy and Rangnekar, 2004). The Par-4 protein is ubiquitously expressed in numerous tissues among various species and is encoded by the pro-apoptotic *Par-4* gene located on human chromosome 12q21, rat chromosome 7q21, and mouse chromosome 10D1 (Johnstone et al., 1996; El-Guendy and Rangnekar, 2003). It is a multi-domain protein that is structurally segmented into a leucine-zipper domain (LZ) at the carboxyl terminal region, two nuclear localization sequences (NLS1, NLS2), a nuclear export sequence (El-Guendy et al., 2003) and a unique SAC (Selective for Apoptosis of Cancer Cells) domain including the NLS2 domain (Hebbar et al., 2012). Sells et al. (1994) identified *Par-4* as an immediate early apoptotic gene through differential hybridization screening of rat AT-3 androgen dependent prostate cancer cell line exposed to ionomycin for the induction of apoptosis. Consistent with its pro-apoptotic function, Par-4 is frequently deleted in pancreatic and gastric cancer (Kimura and Gelmann, 2000; Boehrer et al., 2001), down-regulated in renal-cell carcinomas (Cook et al., 1999), neuroblastoma (Kogel et al., 2001), acute lymphoblastic leukemia, chronic lymphocytic leukemia (Boehrer et al., 2001), and endometrial cancer (Moreno-Bueno et al., 2007), and silenced in endometrial cancer cell lines SKUT1B and AN3CA (Moreno-Bueno et al., 2007).

Interestingly, a 59 amino acid long SAC domain (amino acid coordinates 137–195 in rat Par-4; and 145–204 in human Par-4, respectively) of Par-4 is effective in inducing apoptosis in cancer cells. This domain is 100% conserved in human, rat, and mouse homologs (El-Guendy and Rangnekar, 2003). The SAC domain of Par-4 is the main functional unit for the induction of apoptosis in cancer cells (El-Guendy et al., 2003) and its activity depends on its nuclear entry and phosphorylation at Threonine 155 (Zhao and Rangnekar, 2008). Both whole Par-4 and its SAC domain (SAC-Par-4) can induce apoptosis through intrinsic and extrinsic pathways (Burikhanov et al., 2009; Hebbar et al., 2012). Overexpression of Par-4 or the SAC domain induces apoptosis in different cancer cell lines but does not kill normal cells in cell culture studies (Burikhanov et al., 2009). In animal models, systemic overexpression of Par-4 has been shown to inhibit tumor growth and metastasis (Zhao et al., 2011). A previous study has shown that full-length Par-4 interacts with Akt1 (a cell survival kinase) through the LZ domain, rendering cancer cells resistant to apoptosis; however, the SAC domain, which lacks the LZ domain, could escape binding to Akt1 and can potentially kill cancer cells (Goswami et al., 2005). The ability of the SAC domain to induce apoptosis in diverse cancer cells can be exploited as a potential anti-cancer regimen to induce tumor suppression via apoptosis. Thus SAC-Par-4 is gaining world-wide attention as an effective anti-cancer therapeutic, which creates a need for large-scale production of biologically active SAC-Par-4 protein.

Molecular farming of essential therapeutics/drug molecules in plants has several advantages over their conventional production in bacteria, yeast, or cultured animal or human cell lines (Goldstein and Thomas, 2004; Ko et al., 2005; Ma et al., 2005a,b,c; Daniell, 2006; Fox, 2006; Gleba et al., 2007). These studies clearly demonstrated that transgenic plants could become an effective expression system for profitable production of plant-made products (PMP; Goddijn and Pen, 1995; Daniell et al., 2001). In addition, plants offer hygienic pharmaceutical production that avoids harmful or lethal contaminants such as viruses, toxins, prions, and oncogenes (Boehm, 2007).

In the present study, we report the expression of SAC-Par-4-GFP in transgenic tobacco plants under the control of the modified full-length transcript promoter (M24) of the *Mirabilis mosaic virus*. Furthermore, the SAC-Par-4-GFP was fused with a translational enhancer sequence (5′-AMV) and an apoplast targeting sequence (aTP) to increase the transgene stability in plant and to direct the protein into the secretory pathway and, finally, into the apoplast. The stability and functionality of plant-derived SAC-Par-4-GFP was confirmed by different molecular analyses. Alongside, retention in the endoplasmic reticulum (ER) was achieved by adding an ER retrieval signal (SEKDEL) to the C-terminus of the apoplast-targeted SAC-Par-4-GFP construct, which was transiently expressed in tobacco plants to obtain a glycosylated and proteolytically stable protein. The proteolytic stability of transgenic plant-derived recombinant SAC-Par-4-GFP and transiently expressed SAC-Par-4-GFP-SEKDEL was compared with bacterial SAC-Par-4. Cancer-specific effectiveness and bioactivity of plant-made SAC-Par-4-GFP was confirmed by mammalian cell proliferation assays in PC3, MAT-LyLu, and LNCaP prostate cancer cell lines and HEK293 non-cancerous cells, along with apoptosis induction in PC3 and NF-κB suppression activities in PC3, MAT-LyLu, and HEK293 cells. Moreover, the onset of tumor by MAT-LyLu cells pre-treated and co-injected with SAC-Par-4-GFP was studied in Copenhagen rats. Our present findings lay a foundation regarding the anti-prostate cancer activity of plant-derived SAC-Par-4, and hence SAC-Par-4 could become an effective plant-made biologic in controlling prostate cancer. This is an initial report demonstrating the inhibitory properties of plant-derived SAC-Par-4-GFP on the growth of prostate cancer cells.

#### Materials and Methods

#### Construction of Chimeric Gene Constructs for Plant Transformation

The rat *Par-4*-SAC domain (GenBank accession no U05989) was fused with GFP to generate SAC-Par-4-GFP recombinant DNA fragment. This sequence was then codon-optimized for *Nicotiana tabacum*. A 35-nucleotide long 5′-untranslated region of AlMV RNA 4 (5′-AMV; translational enhancer) and apoplast targeting sequence (aTP) of *Arabidopsis* 2S2 protein gene (Kroumova et al., 2013) were fused with the normalized coding sequence of SAC-Par-4-GFP. The recombinant sequence thus generated was synthesized by GeneArt (Invitrogen, Life Technologies, USA<sup>1</sup>) and cloned into *Xho*I and *Sst*I sites of the binary vector pKM24KH (GenBank accession HM036220) carrying modified full-length transcript promoter (M24) of the *Mirabilis mosaic virus* (Dey and Maiti, 1999a,b) to generate the plasmid pKM24-SAC-Par-4-GFP. The pKM24-GFP construct was also generated and used as vector control. The physical map of different structural components and respective restriction sites of different constructs are represented in **Figure 1**.

#### Development of Transgenic Tobacco Plants

Ten independent T0 transgenic plant lines expressing pKM24-GFP and pKM24-SAC-Par-4-GFP were raised following the protocol described earlier (Kumar et al., 2011; Sahoo et al., 2014b; Patro et al., 2015). Subsequently, segregation analysis for T1 seeds from each independent line was performed following an earlier protocol (Patro et al., 2015). Seeds from T1 plants showing appropriate segregation ratios (Kan<sup>R</sup>:Kan<sup>S</sup> = 3:1) were selected and respective T2 transgenic plants were raised and maintained in green-house condition.

<sup>1</sup>http://www.lifetechnologies.com/GeneArt/
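The 3:1 segregation criterion described above can be checked with a chi-square goodness-of-fit test. The sketch below is illustrative only: the text does not state which statistical test was applied, and the function names and seedling counts are hypothetical.

```python
# Illustrative sketch: chi-square goodness-of-fit test for a 3:1
# (kanamycin-resistant : kanamycin-sensitive) segregation ratio.
# Assumption: the source does not name the statistical test; the
# function names and the counts in the usage lines are hypothetical.

# Critical chi-square value for 1 degree of freedom at p = 0.05.
CHI2_CRITICAL_DF1_P05 = 3.841


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Return the chi-square statistic against an expected 3:1 ratio."""
    total = resistant + sensitive
    if total <= 0:
        raise ValueError("at least one seedling must be scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_3_to_1(resistant: int, sensitive: int) -> bool:
    """True when the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CHI2_CRITICAL_DF1_P05


if __name__ == "__main__":
    for counts in [(148, 52), (120, 80)]:  # hypothetical seedling counts
        print(counts, round(chi_square_3_to_1(*counts), 3), fits_3_to_1(*counts))
```

A line whose counts are consistent with 3:1 is usually interpreted as carrying the transgene at a single locus, which is presumably why such lines were advanced to T2.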

#### Green Fluorescent Protein (GFP) Assay

Total protein from leaves of 8-week-old transgenic pKM24-SAC-Par-4-GFP (L1–L10) and pKM24-GFP (control plant) was extracted following earlier protocol (Verwoerd et al., 1995; Kroumova et al., 2007) and quantified using Bradford (1976) method where BSA served as standard.

Fluorometric quantification of GFP was done following the earlier described protocol (Remans et al., 1999; Sahoo et al., 2014a). The GFP concentration (expressed in μg per mg protein) was measured in total leaf protein extracts from transgenic and vector control plants with the Turner Biosystems Luminometer employing the GFP-UV module using rGFP-S65T (Clontech) as standard. The results were expressed as means ± standard deviation of data from five different samples of the same line (three readings were taken per experiment).

#### GFP Visualization by Confocal Laser Scanning Microscopy

The fluorescence images of transgenic tobacco line L3 expressing SAC-Par-4-GFP were captured with Confocal Laser Scanning Microscope (TCS SP5; Leica Microsystems CMS GmbH, D-68165, Mannheim, Germany) using LAS AF (Leica Application Suite Advanced Fluorescence) 1.8.1 build 1390 software as described earlier by Kroumova et al. (2013). GFP expressed in transgenic plant was excited with an argon laser (30%) with AOTF for 488 nm (at 40%; Sahoo et al., 2009), and the fluorescence emissions were collected between 501 and 580 nm with the photomultiplier tube (PMT) detector gain set at 1150 V.

*Figure legend (fragment):* fused in-frame with GFP (pKM24-SAC-Par-4-GFP). LB: left T-DNA border; RB: right T-DNA border; 5′-AMV: a translational enhancer sequence; aTP: the apoplast targeting sequence of the *Arabidopsis* 2S2 protein; M24: recombinant full-length *Mirabilis mosaic virus* promoter, KanR: *neomycin phosphotransferase* II marker gene; rbcSE9: the 3′-terminator sequences (terminators) of the ribulose bisphosphate carboxylase small subunit and Nos PolyA: *nopaline synthase* genes are shown. The *EcoR*I, *Xho*I, *Nco*I, *Sal*I, *Sst*I, *Hind*III, *Bam*HI, *Xba*I, and *Cla*I restriction sites used to assemble these expression vectors are shown. The Southern and Northern hybridization probe (SAC-Par-4) is indicated by thick line in (B).

#### Integration Assays for *GFP*, *SAC-Par-4*, *npt*II, and *rbcSE*9

Genomic DNA (gDNA) was extracted from 3-week-old T2 seedlings of 10 independent lines expressing pKM24-GFP and pKM24-SAC-Par-4-GFP individually, following the protocol of Allen et al. (2006). Integration of the *GFP* gene was assayed by PCR amplification of gDNA obtained from each line individually using GFP primer sets (**Table 1**). Likewise, integration of the *SAC-Par-4*, *npt*II, and *rbcSE*9 genes was assayed in selected transgenic lines L2, L3, L5, and L6 expressing pKM24-SAC-Par-4-GFP along with pKM24-GFP (VC) using the respective gene-specific primer sets (**Table 1**). The PCR amplifications of the above-mentioned genes were carried out following the standardized protocol (Kumar et al., 2011).

#### Southern Blotting

Southern hybridization was carried out according to Sambrook et al. (1989). Briefly, 10 μg of gDNA isolated from four transgenic (L2, L3, L5, and L6 of SAC-Par-4-GFP) and vector control lines were digested overnight with *Xho*I (Fermentas) at 37°C, subsequently electrophoresed on 0.8% TAE-agarose and transferred to a nylon membrane (Hybond-N+, Amersham). Blotted DNA fragments were hybridized with PCR-amplified <sup>32</sup>P-labeled *SAC-Par-4* probe using primers listed in **Table 1**. Following hybridization at 65°C for 16 h, the membrane was washed and subjected to autoradiography.

#### TABLE 1 | List of Primers used.

#### Transcript Analysis

Total RNA was extracted from 3-week old transgenic tobacco seedlings expressing pKM24-SAC-Par-4-GFP (L2, L3, L5, and L6) and vector control line pKM24-GFP using Spectrum Plant Total RNA kit (Sigma) following the manufacturer's instructions. Total RNA was subsequently treated with *DNase*I (Sigma) to obtain DNA free RNA.

Approximately 10 μg of the above extracted RNA was subjected to Northern blot analysis as described earlier (Ranjan et al., 2012). Briefly, total RNA was electrophoresed on 1.2% (w/v) formaldehyde-agarose gel, blotted onto Hybond-N+ membrane (Amersham), subsequently hybridized with pre-denatured PCR-amplified <sup>32</sup>P-labeled *SAC-Par-4* probe for 16 h at 60°C, washed and autoradiographed.

The accumulation of *SAC-Par-4-GFP* transcripts in the above transgenic lines was quantified by Real-time PCR following the protocol reported earlier (Ranjan et al., 2012). Quantitative Real-time PCR was carried out using 5X HOT FIREPol EvaGreen qPCR Mix Plus (ROX) on a Real-time PCR machine (MJ Research, Bio-Rad; Model: CFD-3220). The *SAC-Par-4* mRNA levels in different transgenic plants were expressed as fold difference relative to the mRNA level of *tubulin*, and presented as the mean of three independent biological replicates with the respective standard deviation (SD).
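As a minimal sketch of how a transcript level can be expressed relative to a reference gene such as *tubulin*, the example below uses the common 2<sup>−ΔCt</sup> calculation. This is an assumption: the text does not specify the quantification model, the calculation presumes roughly 100% amplification efficiency for both amplicons, and the function names and Ct values are hypothetical.

```python
# Illustrative sketch: fold level of a target transcript relative to a
# reference gene using 2^-(Ct_target - Ct_reference).
# Assumptions: the source does not state the quantification model;
# ~100% amplification efficiency is presumed; Ct values are hypothetical.
from statistics import mean, stdev


def fold_vs_reference(ct_target: float, ct_reference: float) -> float:
    """Return the target level as a fold of the reference gene level."""
    return 2.0 ** (ct_reference - ct_target)


def summarize_replicates(ct_pairs):
    """Return (mean, SD) of fold values over biological replicates.

    ct_pairs is a sequence of (ct_target, ct_reference) tuples,
    one tuple per biological replicate (at least two are required).
    """
    folds = [fold_vs_reference(target, ref) for target, ref in ct_pairs]
    return mean(folds), stdev(folds)


if __name__ == "__main__":
    replicates = [(20.0, 22.0), (21.0, 22.0), (22.0, 22.0)]  # hypothetical
    avg, sd = summarize_replicates(replicates)
    print(f"fold vs reference: {avg:.2f} +/- {sd:.2f}")
```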

#### Nuclei Isolation and Nuclear Run-on Assay

Nuclei from 4-week-old tobacco leaves (6 g) of different SAC-Par-4-GFP transgenic lines, namely L2, L3, and L5, and wild tobacco plant were isolated and nuclear run-on assay was performed following the protocol described earlier (Folta and Kaufman, 2006). Briefly, 60 μg of the isolated nuclei from individual plants was incubated at 30°C for 30 min in nuclear transcription buffer containing 100 μCi of [α-<sup>32</sup>P]UTP (6,000 Ci/mmol) and 37.5 units of RNasin (Promega). Subsequently, the mixture was extracted by phenol: chloroform: isoamyl alcohol (25:24:1), purified and used as a probe for hybridization (approximately 2 × 10<sup>6</sup> cpm). PCR amplifications of *SAC-Par-4* gene, *npt*II gene, *18S* rRNA (positive control) were carried out using specific sets of primers (**Table 1**). Approximately 500 ng of PCR products along with pBSK plasmid (negative control) were bound to Hybond-N+ nylon membrane (GE Healthcare Amersham Hybond<sup>TM</sup>-XL) using dot-blot apparatus. The membrane was cut into four different strips and each strip was hybridized with heat-denatured radiolabeled transcripts obtained from each line separately (SAC-Par-4-GFP lines L2, L3, L5, and wild tobacco plant) for 24 h at 65°C. Membranes were washed and autoradiographed.

#### Enzyme Linked Immunosorbent Assay (ELISA) and Western Blot Analysis of Transgenic Plants

Total proteins from 6-week-old transgenic tobacco plants of lines L2, L3, L5, L6, and VC were extracted as described above (under GFP assay section). An aliquot of approximately 5 μg of each protein extract was used to perform Enzyme linked immunosorbent assay (ELISA) following the protocol described by Vazquez et al. (1996). The relative accumulation of SAC-Par-4 protein in transgenic lines was estimated according to Patro et al. (2015). Briefly, total protein samples obtained from transgenic seedlings expressing SAC-Par-4 and VC were coated on a 96-well microtiter plate and incubated with primary antibody specific to PAR-4 (R-334, Santa Cruz) for 2 h. After two rounds of washing, horse-radish peroxidase conjugated secondary antibody was added. After incubation for 1 h, the plate was washed twice followed by addition of ortho-phenylenediamine substrate. The color development was measured at 492 nm in a microplate reader (Bio-Rad 3550) and converted to a percentage of total extracted protein by reference to an ELISA standard curve constructed with the bacterial purified SAC-Par-4 (described below).
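The conversion of an absorbance reading into a percentage of total protein via a standard curve can be sketched as follows. This is an illustration under stated assumptions: a linear standard curve is presumed (the text does not describe the curve model), and the function names, standard amounts, and absorbance values are invented for the example and are not data from this study.

```python
# Illustrative sketch: interpolate antigen amount from a linear ELISA
# standard curve and express it as a percentage of total protein.
# Assumptions: linear curve model; all numbers below are hypothetical.


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_of_total_protein(absorbance, slope, intercept, total_protein_ug):
    """Convert absorbance to antigen (ng), then to % of total protein."""
    antigen_ng = (absorbance - intercept) / slope
    return antigen_ng / (total_protein_ug * 1000.0) * 100.0


if __name__ == "__main__":
    standard_ng = [0.0, 2.0, 4.0, 8.0]      # hypothetical purified standard
    standard_a492 = [0.05, 0.25, 0.45, 0.85]  # hypothetical A492 readings
    slope, intercept = fit_line(standard_ng, standard_a492)
    print(percent_of_total_protein(0.55, slope, intercept, total_protein_ug=5.0))
```

Readings outside the range of the standards should not be extrapolated with such a fit; in practice the sample would be diluted and measured again.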

Aliquots of 10 μg of total plant protein extracts obtained from L2, L3, L5, L6, and VC lines (as described earlier) were resolved on 12% SDS-PAGE, and electrophoretically transferred onto a 0.2-μm polyvinylidene difluoride (PVDF) membrane for Western blot analysis. The primary and secondary antibodies used are discussed above. The blot was visualized by ECL chemiluminescence (GE healthcare, UK). β-actin was used as a loading control.

#### Construction of Chimeric Gene Construct for Transient Assay

A plant expression vector, pKM24-SAC-Par-4-GFP-SEKDEL, was constructed to target SAC-Par-4-GFP to the ER. Retention in the ER was achieved by adding an ER retrieval signal (SEKDEL) to the C-terminus of the apoplast-targeted SAC-Par-4-GFP construct. The construct is otherwise similar to pKM24-SAC-Par-4-GFP, described earlier in the "Materials and Methods" section. The physical map of the above construct is given in Supplementary Figure S1.

#### Agrobacterium-mediated Transient Expression Assays

*Agrobacterium tumefaciens* strain C58C1:pGV3850 was transformed with the pKM24-SAC-Par-4-GFP-SEKDEL construct using the freeze-thaw method. Agrobacterium harboring the above construct was used to infiltrate the leaves of 10–14-week-old *N. tabacum* following the protocol described earlier (Kapila et al., 1997; Yang et al., 2000). Seventy-two hours post-infiltration, the agro-infiltrated tobacco leaves were processed for protein extraction. The concentration of SAC-Par-4-GFP-SEKDEL was quantified using a sandwich ELISA as described before. To confirm the integrity of transiently expressed SAC-Par-4-GFP-SEKDEL, Western blot analysis was performed on the protein extracts with anti-Par-4 antibody.

#### Deglycosylation Assay

Protein extracts obtained from transgenic plant (line L3) expressing SAC-Par-4-GFP and transiently expressed SAC-Par-4-GFP-SEKDEL were digested with the deglycosylating enzymes PNGase F, EndoH, and *O*-glycosidase (New England Biolabs; NEB) for 3 h at 37°C, according to the manufacturer's instructions. Control samples were treated the same, except that no enzyme was added. To check the O-linked glycosylation status of SAC-Par-4-GFP and SAC-Par-4-GFP-SEKDEL, the plant extract was treated with Neuraminidase (NEB), and the proteins were denatured prior to deglycosylation treatment with O-glycosidase. All treated samples were then subjected to Western blot analysis using the anti-PAR-4 antibody.

#### Bacterial Purification of SAC-Par-4

The 59 amino acid (aa 137-195) long rat *SAC-Par-4* was cloned in pET-29b vector (Novagen), expressed in *BL21* (DE3), and purified through HisPur<sup>TM</sup> Cobalt resin (Thermo Scientific) as described earlier (Sahoo et al., 2014a). The purified SAC-Par-4 was dialyzed and concentrated with a centrifugal filter device (Amicon 10 kDa cut-off), and quantified by Bradford (Sigma). Fractions were collected and analyzed on 18% SDS-PAGE.

#### Proteolysis Assays by Trypsin

The protection assays of plant-derived SAC-Par-4-GFP (from transgenic line L3), SAC-Par-4-GFP-SEKDEL (transiently expressed) and SAC-Par-4 (bacterial purified) against trypsin digestion were performed according to Tremblay et al. (2011) and Wang et al. (2008). Aliquots of 10 μL protein extracts were removed at various time intervals, boiled for 10 min in 1x SDS-sample buffer, resolved on a 12% SDS-PAGE gel and subjected to Western blot analysis with the anti-PAR-4 antibody.

#### Partial Purification of Plant-derived SAC-Par-4

Approximately 50 g of leaves obtained from transgenic tobacco plant line L3 expressing SAC-Par-4-GFP were powdered using liquid nitrogen and homogenized in 150 ml of 1x PBS buffer supplemented with 5 mM EDTA, 1 mM PMSF and 1.5% PVP-40 and subjected to 20% ammonium sulfate cut. The supernatant obtained was dialyzed against 50 mM Tris-HCl (pH 8.0) buffer for 24 h at 4°C and adjusted to 40% ammonium sulfate saturation; centrifuged at 30,000 × *g* for 40 min. The enriched SAC-Par-4-GFP fraction was obtained as pellet. The pellet was solubilized and dialyzed twice against water and once against antibody affinity column buffer consisting of 20 mM Tris-HCl pH 7.0 and 250 mM NaCl at 4°C. Enriched SAC-Par-4-GFP was further purified according to the protocol of Downing et al. (2006). The insoluble materials were removed by centrifugation at 30,000 × *g* for 20 min at 4°C and the supernatant was loaded onto Affi-Gel antibody affinity column (Bio-Rad) in a cold room. The polyclonal anti-Par-4 antibody (R-334; Santa Cruz) was combined with an equal volume of Affi-Gel 15 (Bio-Rad) and mixed at 4°C for 2 h according to the manufacturer's instructions. Protein was eluted in antibody elution buffer (50 mM sodium citrate pH 4.0, 2 M NaCl) and extensively dialyzed in 1x PBS and concentrated with a centrifugal filter device (Amicon 10 kDa cut-off).

#### Cell Culture and Cell Lines

Hormone-independent PC3 (human) and MAT-LyLu (rat) prostate cancer cell lines were cultured in RPMI (Pan Biotech) and DMEM (Pan Biotech) media, respectively. Hormone-dependent LNCaP (human) cells were maintained in RPMI medium. Non-cancerous HEK293 (human embryonic kidney cells) was cultured in DMEM. All the media were supplemented with 10% heat-inactivated fetal bovine serum, 100 U/ml penicillin (Sigma) and 100 μg/ml streptomycin (Sigma). The cells were incubated at 37°C in a humidified atmosphere of 95% air and 5% CO<sub>2</sub>.

#### Cell Viability Assay

Cell viability was determined by the colorimetric MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) assay according to Acharya et al. (2014). Briefly, 2 × 10<sup>3</sup> cells (PC3, MAT-LyLu, LNCaP, and HEK293) were treated with purified SAC-Par-4-GFP and VC protein in different concentrations for 48 h. Subsequently MTT was added, incubated in dark for 30 min and the absorbance was measured at 570 nm using an ELISA reader (Synergy HT, BioTek Instruments Inc., Winooski, VT, USA). The cell-viability was expressed as percentage of control cells and average cell-viability was presented as a mean of three independent experiments with respective standard deviation. All the experiments were performed in triplicate.
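The viability calculation described above (percentage of control, averaged over independent experiments) can be sketched as follows. The function names, the optional blank correction, and the absorbance values are assumptions made for illustration and are not taken from the study.

```python
# Illustrative sketch: MTT viability as a percentage of untreated control,
# summarized as mean and SD over independent experiments.
# Assumptions: optional blank subtraction; absorbance values are hypothetical.
from statistics import mean, stdev


def percent_viability(treated_abs, control_abs, blank_abs=0.0):
    """Return treated-well absorbance as a percentage of the control well."""
    corrected_control = control_abs - blank_abs
    if corrected_control <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return (treated_abs - blank_abs) / corrected_control * 100.0


def summarize_experiments(readings):
    """Return (mean, SD) of percent viability.

    readings is a sequence of (treated_abs, control_abs) tuples,
    one tuple per independent experiment (at least two are required).
    """
    values = [percent_viability(treated, control) for treated, control in readings]
    return mean(values), stdev(values)


if __name__ == "__main__":
    a570 = [(0.40, 0.80), (0.48, 0.80), (0.56, 0.80)]  # hypothetical A570 pairs
    avg, sd = summarize_experiments(a570)
    print(f"viability: {avg:.1f}% +/- {sd:.1f}%")
```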

#### AnnexinV-FITC and PI Staining

PC3 and HEK293 cells were treated with SAC-Par-4-GFP (50 μg/ml) and VC (60 μg/ml) protein separately for 48 h. After treatment, annexinV-FITC and PI staining was performed as per the protocol described in the kit (Alexa Fluor<sup>®</sup> 488 annexin V/Dead Cell Apoptosis Kit, Invitrogen). Stained cells were analyzed on a BD LSRFortessa<sup>TM</sup> cell analyzer, and fluorescence was recorded in the FLH-1 channel for green fluorescence and the FLH-2 channel for red fluorescence. AnnexinV-FITC-positive and PI-negative cells were considered as early apoptotic cells (Q2) while those positive for both annexinV-FITC and PI were considered as late apoptotic cells (Q4).

#### Cell Cycle Analysis

SAC-Par-4-GFP and VC protein treated cells (mentioned above) were processed as per the protocol described in the kit (Tali<sup>®</sup> Cell Cycle Kit, Invitrogen). The cell cycle analysis was performed on the Tali<sup>®</sup> Image-Based Cytometer and results were reported as percentage of sub-G0 (sub-genomic DNA), G0/G1, S and G2/M cells. The percentage of total cells in Sub-G0 fraction was considered as apoptotic fraction.

#### Transduction of Luciferase Lentivirus and NF-κB Reporter Assay

Stable NF-κB luciferase reporter cells (PC3-NF-κB-luc, MAT-LyLu-NF-κB-luc and HEK293-NF-κB-luc) were generated using ready-to-transduce replication-incompetent lentiviral particles as suggested by Cignal Lenti NFκB Reporter (luc) Kit (Qiagen). Cells were transduced with lentiviral particles at a multiplicity of infection (MOI) of 50, and puromycin (1 μg/ml for PC3 and HEK293, and 8 μg/ml for MAT-LyLu) was used for selection of stably transduced cells (Jain et al., 2015). The aforementioned work with lentiviral particles was done with the prior permission of the Institutional Biosafety Committee, Institute of Life Sciences, Bhubaneswar.

For the luciferase assay, 1 × 10<sup>3</sup> cells were seeded in a 96-well plate, grown for 24 h and treated with a non-cytotoxic concentration of SAC-Par-4-GFP in the presence of TNF-α (10 ng/ml) for 6 h. Cells were lysed and luciferase expression was estimated using a reporter assay kit (Promega, USA). The activity was normalized to the protein concentration of each sample.
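The normalization step mentioned above can be illustrated with a short sketch. It assumes that raw luminescence (in relative light units) is divided by the protein amount assayed and that results are then expressed relative to an untreated control; the function names and all numbers are hypothetical and only show the arithmetic.

```python
def normalized_activity(rlu, protein_ug):
    """Luciferase signal (relative light units) per microgram of protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return rlu / protein_ug


def fold_change(sample_activity, control_activity):
    """Normalized activity of a sample relative to the untreated control."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return sample_activity / control_activity


# Hypothetical readings: (RLU, micrograms of protein assayed).
untreated = normalized_activity(12000.0, 20.0)
stimulated = normalized_activity(30000.0, 25.0)
print(f"fold change vs. untreated: {fold_change(stimulated, untreated):.2f}")
```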

#### Detection of Cleaved Caspase-3 and Cleaved PARP by Western Blotting

PC3 cells were treated with SAC-Par-4-GFP at different concentrations for 48 h, and Western blotting was performed as described earlier (Plante et al., 2013). Briefly, 40 μg of protein from each sample was resolved by 10% SDS-PAGE, electrophoretically transferred onto a PVDF membrane and incubated with specific anti-caspase-3 (Cell Signaling) and anti-PARP (Cell Signaling) antibodies. Anti-rabbit HRP-conjugated antibody (Santa Cruz) was used as the secondary antibody, and bands were visualized by ECL chemiluminescence (GE Healthcare, UK). β-actin was used as a loading control.

#### *In Vivo* Experiment

Animal experiments were conducted as per the animal ethics guidelines and were approved by the animal ethics committee of the Institute of Life Sciences (ILS), Bhubaneswar. Because we used MAT-LyLu prostate cancer cells for our *in vitro* study, we adopted the MAT-LyLu rat syngeneic prostate cancer model for *in vivo* validation (Chang et al., 1993; Jiang et al., 2004). MAT-LyLu cells were harvested from subconfluent culture by a brief exposure to 0.25% trypsin and 0.02% EDTA. After neutralizing trypsin with 10% FBS, the cells were washed, counted and resuspended in PBS containing SAC-Par-4-GFP or VC protein at a concentration of 20 μg/100 μl. All the tubes had a concentration of 0.5 × 10<sup>6</sup> viable cells/100 μl. Before injection into the animals, all the cells were incubated at room temperature for 30 min. To avoid potential bias, the tubes containing cells (treated with different proteins) were coded as T1 (SAC-Par-4-GFP) and T2 (VC) and injected by an individual unaware of the nature of the samples. Aliquots of 100 μl of cell suspension (0.5 × 10<sup>6</sup> cells) were injected subcutaneously into the flank region of six male Copenhagen rats (*n* = 3 for each group). Tumor volumes in each rat were measured between the 10th and 18th day post-injection. Tumor volume was calculated from caliper measurements using the formula *V* = (*W*<sup>2</sup> × *L*) × 0.52, where *V* is the tumor volume, *W* is the tumor width, and *L* is the tumor length. On day 18 post-injection, all the animals were sacrificed for evaluation. Tissues were then preserved in 10% formalin, embedded in paraffin, sectioned (5 μm in thickness) and subjected to hematoxylin and eosin (H&E) staining (Sahoo et al., 2008).
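The tumor volume formula given above, *V* = (*W*<sup>2</sup> × *L*) × 0.52, translates directly into code. The sketch below assumes caliper readings in millimeters (giving a volume in mm<sup>3</sup>); the units are not stated in the text, and the function name and example readings are hypothetical.

```python
def tumor_volume(width, length):
    """Tumor volume from caliper measurements: V = (W^2 x L) x 0.52.

    Units are an assumption: with width and length in mm, the result is
    in cubic millimeters.
    """
    if width < 0 or length < 0:
        raise ValueError("caliper measurements must be non-negative")
    return (width ** 2) * length * 0.52


# Hypothetical caliper readings for one animal on one day.
volume = tumor_volume(width=10.0, length=15.0)
print(f"tumor volume = {volume:.1f} mm^3")
```

The factor 0.52 is the value stated in the text; it is close to π/6, the coefficient in the volume of an ellipsoid.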

#### Statistical Analysis

Statistical analysis of all the data was performed using Student's *t*-test (GraphPad Prism version 5.01), and data are presented as the mean of three or four independent experiments. A *p*-value of less than 0.05 was considered significant.
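For readers unfamiliar with the test, the following sketch computes the Student's *t* statistic for two independent groups. It assumes the unpaired, equal-variance form of the test; the text does not specify whether paired or unpaired tests were used. The function name and data are hypothetical, and the sketch does not reproduce GraphPad Prism's procedure.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Unpaired, equal-variance Student's t statistic and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    if pooled == 0:
        raise ValueError("pooled variance is zero; t is undefined")
    std_err = sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical measurements from three independent experiments per group.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"t = {t_stat:.3f}, df = {dof}")
```

With four degrees of freedom, the two-tailed critical value at *p* = 0.05 is 2.776, so a result is called significant when |*t*| exceeds that value.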

#### Results

#### Design and Construction of *In Planta* Expression Cassette and Generation of SAC-Par-4-GFP Expressing Transgenic Tobacco Plants

We designed two primary plant expression cassettes namely pKM24-SAC-Par-4-GFP and pKM24-GFP for this study. The pKM24-GFP was used as vector control (VC) while the other directs the expression of *SAC-Par-4-GFP* in plants under the control of the modified full-length M24 transcript promoter of the *Mirabilis mosaic virus* (**Figure 1**). The pKM24-GFP and pKM24-SAC-Par-4-GFP were used to generate transgenic plants (*Nicotiana tabacum* cv. Samsun NN).

#### Analysis of Transgenic Tobacco Plants

*GFP*-integration in all 10 independent pKM24-GFP and pKM24-SAC-Par-4-GFP T2 transgenic plants was confirmed by *GFP* (250 bp) amplification using PCR as described earlier (**Figure 2A**). Further, we detected enhanced GFP protein accumulation in the transgenic lines L1, L2, L3, L6, L7, and L10 (**Figure 2B**). Independent transgenic lines L2, L3, L5, and L6 were selected for further analysis. In this study, wild-type plants and the vector control-GFP plants were used as controls.

Furthermore, we confirmed the integration of the 183 bp long *SAC-Par-4* in the L2, L3, L5, and L6 transgenic pKM24-SAC-Par-4-GFP plants but not in the pKM24-GFP (VC) plant (**Figure 3A**). We also demonstrated integration of different components of the expression cassette by PCR amplification of *rbcSE9* and *npt*II in the above plant lines (**Figure 3A**).

For Southern blot analysis, the genomic DNA from transgenic lines L2, L3, L5, L6, and VC was digested with the restriction enzyme *Xho*I. The selection of this enzyme is based on the fact that it digests the genomic DNA with reasonable frequency, but acts as a single cutter for transgene construct (not inside the *SAC-Par-4* gene; **Figure 1B**). Southern analysis confirmed integration of the *SAC-Par-4* transgene in the genome of L2, L3, L5, and L6 transgenic lines; we observed higher copy number for L2 line in comparison to the other lines. Overall, appearance of discrete and alike Southern positive bands (from different transgenic lines) suggests that integration of *SAC-Par-4* is in tandem fashion in plant genome, indicating independent nature of transgenic events (**Figure 3B**).

In addition, Northern analysis, as described earlier, yielded a clear signal for the presence of *SAC-Par-4* transcripts in the L2, L3, L5, and L6 lines harboring pKM24-SAC-Par-4-GFP, while no signal was detected in the pKM24-GFP plant. We observed variations in the signal intensities of Northern bands among different lines (**Figure 3C**), with the highest expression in line L3. This observation supports the data obtained from the real-time RT-PCR experiment, in which we observed expression of *SAC-Par-4-GFP* transcripts in the following order: L3 > L5 > L2 > L6 (**Figure 3D**).

We performed nuclear run-on assays using intact nuclei isolated from the L2, L3, and L5 lines to measure the accumulation of *SAC-Par-4-GFP* transcripts *in vivo* and to analyze potential differences in transcription initiation between the transgenic lines. The data revealed that the *SAC-Par-4*, *npt*II, and *18S* genes were expressed in the above lines, with the strongest transcription rate found in the L3 line, which supports the results obtained from the Northern blot and real-time assays (**Figure 3E**).

#### ELISA and Western Blot Analysis of Plant-derived SAC-Par-4-GFP

The accumulation level of recombinant SAC-Par-4-GFP in different transgenic lines (L2, L3, L5, and L6) was evaluated by ELISA as described earlier. SAC-Par-4-GFP accumulation levels varied among independent transgenic plants (T2), ranging from 0.05% to 0.15% of total soluble protein or 1.6–10.5 μg/g fresh leaf weight (results not shown in detail). The L3 line displayed the maximum recombinant SAC-Par-4-GFP accumulation (0.15%) as compared to L2, L5, and L6 (**Figure 4A**). The expression of ER-targeted SAC-Par-4-GFP-SEKDEL in infiltrated leaves reached its highest level at day 3 post-infiltration (dpi), with yields up to 88 μg/g fresh weight.

In continuation, we carried out Western blot analysis to check the integrity of the plant-derived SAC-Par-4-GFP as described earlier and observed a band of around 32 kDa SAC-Par-4-GFP protein on the blot from L2, L3, L5, and L6 tobacco plants (**Figure 4B**).

#### Deglycosylation Analysis of Plant-derived SAC-Par-4-GFP

We performed deglycosylation assays on the protein extracts obtained from transgenic line L3 expressing SAC-Par-4-GFP and found that the apoplast-targeted SAC-Par-4-GFP was fully resistant to endoglycosidase H (Endo H) digestion (**Figure 5A**, lanes 1 and 2) and peptide *N*-glycosidase F (PNGase F) treatment (**Figure 5A**, lanes 3 and 4), suggesting that SAC-Par-4-GFP does not carry any glycans sensitive to either of these enzymes. In parallel, the efficacy of these enzymes (Endo H and PNGase F) was validated by digestion of a different plant-derived protein, namely human peroxisomal -2,-3-enoyl CoA isomerase protein HsPECI2 (Rai et al., unpublished data from our group). Furthermore, enzymatic treatment with α-neuraminidase and *O*-glycosidase followed by Western blot analysis with anti-PAR-4 polyclonal antibody displayed no band shift for the apoplast-targeted SAC-Par-4-GFP protein extract, demonstrating that SAC-Par-4-GFP was resistant to *O*-glycosidase treatment (**Figure 5B**). Taken together, these results indicate that plant-derived apoplast-targeted SAC-Par-4-GFP is not glycosylated.

Furthermore, deglycosylation analysis was performed with the total protein extracts obtained from agroinfiltrated tobacco leaves expressing ER-targeted SAC-Par-4-GFP-SEKDEL as described in the Section "Material and Methods." The data revealed that plant-derived SAC-Par-4-GFP-SEKDEL was resistant to both PNGase F and Endo H (**Figure 5C**). Interestingly, *O*-glycosidase treatment followed by Western blot analysis displayed a new smaller band with a molecular mass of approximately 32 kDa in the *O*-glycosidase treated SAC-Par-4-GFP-SEKDEL protein extract compared to the untreated sample, which migrated at around 34 kDa (**Figure 5D**). Hence, the SEKDEL-tagged SAC-Par-4-GFP proteins were sensitive to *O*-glycosidase, suggesting the presence of *O*-linked glycans indicative of ER localization with efficient retention or retrieval from the *cis*-Golgi.

#### Proteolytic Stability of Plant-derived SAC-Par-4-GFP

Trypsin digestions were carried out with total protein extract obtained from plant-derived SAC-Par-4-GFP, SEKDEL-tagged SAC-Par-4-GFP and bacterially purified SAC-Par-4 as described in Section "Material and Methods" (**Figure 6** and Supplementary Figure S2). To evaluate the resistance or sensitivity of these proteins toward trypsin digestion, Western blot analysis was performed with anti-Par-4 antibody on aliquots of the trypsin-treated protein extracts taken at different time points (0, 1 min, 5 min, 15 min, and 30 min). The applied trypsin ratio to recombinant proteins was 40:1 and 10:1 (w/w) for plant extract and bacterially purified protein, respectively. The results in **Figure 6** confirmed that plant-derived SAC-Par-4-GFP was resistant to trypsin digestion even 15 min post-treatment, while *Escherichia coli*-derived SAC-Par-4 was degraded within 10 min of incubation. Furthermore, approximately 90% of SAC-Par-4-GFP-SEKDEL remained intact after 30 min of incubation (Supplementary Figure S2). The enhanced stability of ER-targeted SAC-Par-4-GFP-SEKDEL could be attributed to effective glycosylation of this protein. These results demonstrate that plant-derived SAC-Par-4-GFP showed enhanced protection against trypsin digestion.

#### Purification of Plant-derived SAC-Par-4-GFP Protein

We partially purified SAC-Par-4-GFP from transgenic line L3 by antibody pull-down approach, as described in "Material and Methods" section. The enrichment coupled with affinity chromatography process exhibited 50% recovery and above 70% purification of the SAC-Par-4-GFP as confirmed by SDS-PAGE and Western blot (**Figure 7**).

#### Plant-derived SAC-Par-4-GFP Inhibited Cell Growth, Induced Apoptosis *In Vitro* and Delayed the Onset of Tumor Growth *In Vivo*

FIGURE 3 (caption, continued) | showing *rRNA* quality; (D) Real-time analysis of *SAC-Par-4* transcripts in VC, L2, L3, L5, and L6 lines. The data were normalized by *tubulin* transcripts. The data shown are mean values from three independent experiments. Bars indicate the standard errors of means; (E) Nuclear run-on assay of *SAC-Par-4* transcripts. Dot blots were hybridized to <sup>32</sup>P-labeled nascent transcripts from wild-type plant and transgenic lines L2, L3, and L5 which were synthesized by run-on transcription. pBSK plasmid (1 μg) and 18S ribosomal DNA (0.5 μg) were used as controls.

The biological activity of plant-derived partially purified SAC-Par-4-GFP from the L3 line was determined by MTT assay on PC3, MAT-LyLu, and LNCaP cell lines at different concentrations (10, 20, 30, 40, 50, and 60 μg/ml) for 48 h as described earlier. The viability of PC3 and MAT-LyLu cells was reduced with increasing concentration of SAC-Par-4-GFP protein, and maximum cell death of up to 63 and 66%, respectively, was observed at a concentration of 60 μg/ml (**Figure 8A**). Further, the IC<sub>50</sub> values were found to be 43 μg/ml and 50 μg/ml for PC3 and MAT-LyLu cells, respectively. However, SAC-Par-4-GFP exhibited a maximum of 10–15% growth inhibition in LNCaP cells. On the other hand, the protein obtained from pKM24-GFP (VC) showed no cytotoxic effect at a concentration of 100 μg/ml. In parallel, we observed that plant-derived SAC-Par-4-GFP exhibited no cytotoxicity on the HEK293 cell line, which is non-cancerous in nature (**Figure 8A**).
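The text does not state how the IC<sub>50</sub> values were derived. One simple possibility, shown here purely as an illustration, is linear interpolation between the two tested concentrations that bracket 50% viability. The function name and the dose-response values below are hypothetical and are not the data of **Figure 8A**.

```python
def ic50_by_interpolation(doses, viability):
    """Estimate the dose giving 50% viability by linear interpolation.

    doses must be in ascending order and viability given as a percentage
    of control. Returns None when viability never falls to 50%.
    """
    points = list(zip(doses, viability))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        # Look for the first pair of adjacent doses that brackets 50%.
        if v0 >= 50.0 >= v1 and v0 != v1:
            return d0 + (v0 - 50.0) * (d1 - d0) / (v0 - v1)
    return None


# Hypothetical dose-response data (micrograms per ml, % viability).
doses = [5.0, 10.0, 20.0, 40.0]
viability = [95.0, 80.0, 60.0, 30.0]
print(ic50_by_interpolation(doses, viability))
```

Curve fitting with a four-parameter logistic model is the more common choice when enough dose points are available; interpolation is shown here only because it needs no external library.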

We performed AnnexinV-FITC/PI staining to investigate whether SAC-Par-4 induces apoptosis in PC3 cancer cells and non-cancerous HEK293 cells. As shown in **Figure 8B**, treatment with 50 μg/ml of SAC-Par-4-GFP for 48 h induced apoptosis in PC3 cells and increased the percentage of early and late apoptotic cells by approximately fourfold compared to the vector control. In the HEK293 cell line, by contrast, there was no significant difference in the total percentage of apoptotic cells between vector control and SAC-Par-4-GFP treated cells (data not shown). Further, cell cycle analysis in PC3 cells treated with SAC-Par-4-GFP protein (50 μg/ml) for 48 h revealed a significant, 4.6-fold increase in the sub-G1 population (sub-diploid DNA fraction) compared to the vector control; this population represents the apoptotic fraction, implying that SAC-Par-4 promotes apoptosis (**Figure 8C**). Consistent with this, we observed signals for apoptotic markers such as cleaved caspase-3 and cleaved PARP in PC3 cells treated with 50 μg/ml of purified SAC-Par-4-GFP for 48 h (**Figure 8D**).

Suppression of NF-κB activity by SAC-Par-4 in PC3-NF-κB-luc and MAT-LyLu-NF-κB-luc cells was studied as described earlier. In TNF-α-stimulated cells, NF-κB luciferase activity was significantly increased, by approximately twofold in PC3 cells and approximately fourfold in MAT-LyLu cells, compared with the untreated control group. However, the TNF-α-stimulated luciferase activity was significantly reduced (*p* < 0.05) to approximately twofold by 30 μg/ml of SAC-Par-4 treatment in both cell lines. In parallel, no suppression of NF-κB activity was observed in TNF-α-stimulated HEK293-NF-κB-luc cells upon treatment with 30 μg/ml of SAC-Par-4-GFP (**Figure 8E**).

In the *in vivo* study, we examined the effect of plant-derived affinity-purified SAC-Par-4 on tumor incidence in a syngeneic rat prostate cancer model. The results indicated that SAC-Par-4-GFP pre-treatment and co-injection with the MAT-LyLu cells inhibited the rate of tumor growth *in vivo* as compared to the control group. Visible tumors appeared on the sixth day after the injection of cells pre-treated with pKM24-GFP vector control protein, whereas there was no visible tumor up to the 14th day in animals injected with SAC-Par-4-GFP pre-treated cells. Moreover, tumor volumes measured from the 10th day after injection of cancer cells with vector control protein showed exponential growth of all tumors in the control animals. Further, hematoxylin and eosin (H&E) staining confirmed the presence of viable cancer cells in all tumor tissues and ruled out the possibility of a non-tumor (necrotic) mass (**Figure 8F**).

#### Discussion

There are several definite advantages to using plants as bioreactors: they allow large-scale production of recombinant biologics at lower cost, and they are capable of imparting post-translational modifications to the synthesized protein. We raised transgenic tobacco plants expressing a codon-optimized rat *SAC-Par-4-GFP* gene under the control of the M24 promoter, a constitutive promoter with 25-fold stronger activity than the CaMV35S promoter (Dey and Maiti, 1999a,b). We deliberately fused a translational enhancer sequence (5′-AMV) and the apoplast targeting sequence (aTP) of the *Arabidopsis* 2S2 protein gene to ensure stability and enhanced production of SAC-Par-4 (**Figure 1**; Schillberg et al., 1999; Benchabane et al., 2008). Additionally, for the transient assay we coupled the SEKDEL signal sequence to the C-terminus of SAC-Par-4-GFP to target the recombinant protein to the ER lumen for proper post-translational modifications, such as glycosylation and disulphide bond formation (Supplementary Figure S1; Hwang et al., 1992; Ma et al., 2005b; Wang et al., 2008).

Rate of seed germination, segregation analysis (**Table 2**), molecular characterization, gene integration and expression analysis of SAC-Par-4-GFP transgenic lines were performed (**Figures 2–4**); based upon the GFP content in the above lines, we selected three lines (L2, L3, and L6) with higher GFP accumulation and one with low GFP content (L5) for further study. The higher accumulation of plant-derived SAC-Par-4-GFP may be attributed to several factors acting in synergy, including the stable nature of the protein itself and the use of a strong M24 promoter coupled with a 5′-AMV–2S2 signal peptide sequence. We found strong expression of SAC-Par-4-GFP at both transcript and protein levels in the L3 transgenic plant, and moderate to weak expression in the L2, L5, and L6 transgenic plants. Even though the Southern hybridization signal indicated a higher copy number for the L2 line than for the L3 line, integration in a heterochromatin region of the tobacco genome might have led to silencing of *SAC-Par-4-GFP* in the L2 line, as reported earlier by Xu-Gang et al. (2002). Furthermore, we found strong GFP expression at the periphery of guard cells and trichome stalk cells in the apoplast-targeted SAC-Par-4-GFP transgenic line L3 (Supplementary Figure S3). In light of these observations, we finally selected the L3 line for Affi-gel Par-4 based affinity purification of SAC-Par-4-GFP.

The major advantage of using plants as bioreactors is that they glycosylate the recombinant protein along the secretory pathway as proteins move from the ER through the Golgi to their final destination. Therefore, the nature of glycosylation and proteolytic stability of the apoplast-targeted SAC-Par-4-GFP and ER-targeted SAC-Par-4-GFP-SEKDEL protein were analyzed (**Figure 5**). Glycosylation enhances the physicochemical properties of a protein by conferring thermal resistance, protection from proteolytic degradation and stability (Ma et al., 2005c). To test this proposition, we targeted SAC-Par-4-GFP protein to the secretory pathway using an apoplast signal sequence and the ER-specific SEKDEL sequence and performed deglycosylation assays as described previously. Our results indicated that the plant-derived SAC-Par-4-GFP may contain complex-type *N*- and *O*-glycans which were inaccessible to the deglycosylating enzymes Endo H, PNGase F, and *O*-glycosidase, which is in accordance with a previous report for a different plant-derived protein, namely human erythropoietin, EPO (Conley et al., 2009). In contrast, the SEKDEL-tagged SAC-Par-4-GFP proteins were sensitive to *O*-glycosidase (**Figure 5D**), unlike the apoplast-targeted SAC-Par-4-GFP (**Figure 5B**), suggesting the presence of *O*-linked glycans in ER-targeted SAC-Par-4-GFP-SEKDEL. Non-glycosylated plant-derived SAC-Par-4-GFP had almost the same molecular mass (∼32 kDa) as deglycosylated SAC-Par-4-GFP-SEKDEL (∼32 kDa) and a lower molecular mass than glycosylated SAC-Par-4-GFP-SEKDEL (∼34 kDa). This is most likely because of the addition of the ER retention signal motif SEKDEL to the C-terminus of plant-derived SAC-Par-4-GFP along with an *O*-linked oligosaccharide chain (Conley et al., 2009). *In silico* analysis of SAC-Par-4-GFP-SEKDEL using NetOGlyc 4.0.0.13 software<sup>2</sup> predicted two potential *O*-glycosylation sites at positions 20 and 21 located in SAC-Par-4-GFP. Moreover, the recombinant SAC-Par-4-GFP and SAC-Par-4-GFP-SEKDEL proteins showed resistance to trypsin digestion for 15–30 min compared to bacterially derived SAC-Par-4 (**Figure 6** and Supplementary Figure S2). This could be explained by the fact that plant-derived SAC-Par-4 in fusion with another protein, GFP, is protected against trypsin digestion as compared to bacterial SAC-Par-4. Besides this, effective glycosylation of the SAC-Par-4-GFP-SEKDEL protein and its enhanced stability could further increase the therapeutic value of plant-derived SAC-Par-4-GFP-SEKDEL protein. To achieve higher and more stable expression, we have begun to produce transgenic tobacco lines homozygous for the introduced *SAC-Par-4-GFP-SEKDEL* gene.

(Figure caption, continued) equivalent of approximately 500 ng total protein. The position of the molecular weight marker (M) is indicated.

The cytotoxic effect of affinity-purified SAC-Par-4 on PC3 and MAT-LyLu cells indicates that plant-derived SAC-Par-4 retains its biological activity (**Figure 8A**). Importantly, its effect on both cancer cell lines but not on HEK293 cells suggests its specificity against prostate cancer. Furthermore, the precise role of plant-derived SAC-Par-4 as a potent anti-prostate cancer agent needs to be evaluated with reference to its activity against normal prostate cell lines such as PNT1 A/B or PNT2. An earlier report (Burikhanov et al., 2009) indicated that normal/immortalized cells fail to respond to exogenous SAC-Par-4 due to their lower content of the specific cell surface receptor GRP78 in addition to the lack of a robust ER-stress response. Moreover, the human embryonic kidney cell line HEK293 has been used as a representative normal cell line in multiple studies (Imberg-Kazdan et al., 2013; Zhang et al., 2014; Al-Sheddi et al., 2015; Kim et al., 2015). HEK293 cells, being immortalized and non-cancerous in nature, have lower expression levels of GRP78 in comparison to cancer cells, which justifies their inclusion as an appropriate control in this study (Li et al., 2008; Burikhanov et al., 2009; Dai et al., 2010). Apart from evaluating the inhibitory activity of plant-derived SAC-Par-4 in androgen receptor (AR) negative cell lines like PC3 and MAT-LyLu, we also studied the effect of SAC-Par-4 in an AR positive prostate cell line (LNCaP). Plant-derived SAC-Par-4 showed a very low inhibitory effect (10–15%) in LNCaP cells. This effect is not significantly higher than that of the control treatment, which indicates that these cells could be resistant or less responsive to induction of apoptosis by plant-made SAC-Par-4. This is consistent with the findings of Chakraborty et al. (2001), who described that *Par-4* is able to induce apoptosis in PC3, DU-145, and TSU-Pr cells, while LNCaP cells are resistant. The mechanism underlying the differential cytotoxic effect of plant-derived SAC-Par-4 on AR positive and AR negative human prostate cancer cells needs further investigation. The cytotoxic effect of this protein on both human (PC3) and rat (MAT-LyLu) prostate cancer cell lines further suggests its efficacy on cancer cells of different origin, and this might be due to the highly conserved SAC domain of Par-4 (El-Guendy and Rangnekar, 2003). A similar effect of Par-4 on cancer cells of multi-species origin has also been reported earlier (Vetterkind et al., 2005; Shukla et al., 2013). To rule out the effect of other proteins derived from the different components of the vector and other soluble proteins of the plant, we compared the results obtained with purified SAC-Par-4 protein with those obtained with protein from the vector control plant. Our observations from Annexin-V/PI staining, cell cycle analysis and Western blotting in the PC3 cell line (**Figures 8B–D**) confirm that plant-derived SAC-Par-4 is able to specifically kill cancer cells via apoptosis, similar to what was observed earlier (Burikhanov et al., 2009). Moreover, the NF-κB luciferase assay (**Figure 8E**) clearly demonstrates that plant-derived SAC-Par-4 is able to significantly suppress NF-κB activity in both PC3 and MAT-LyLu cells, which corroborates previous findings (El-Guendy et al., 2003; Zhao and Rangnekar, 2008).

<sup>2</sup>http://www.cbs.dtu.dk/services/NetOGlyc/

FIGURE 8 (caption, continued) | ±SD, *n* = 3, <sup>∗</sup>*p* < 0.05, <sup>∗∗</sup>*p* < 0.005 and <sup>∗∗∗</sup>*p* < 0.001; (B) PC3 (1 × 10<sup>6</sup> cells) were treated with vector control protein (100 μg/ml) and partially purified SAC-Par-4-GFP protein (50 μg/ml), and the percentage of apoptotic cells was analyzed by AnnexinV-FITC/PI staining. Experiment was repeated two times and quantitative result (Q2 + Q4) is displayed as the mean fold change (±SD) compared with control from two independent experiments; (C) Cell cycle analysis of PC3 cells treated with vector control protein (100 μg/ml) and SAC-Par-4-GFP protein (50 μg/ml). Data are representative of three independent cell cycle analyses and expressed as the percentage of cells found in different cell populations, sub-G1 (apoptotic cells), G0/G1, S, and G2/M; (D) Immunoblot analysis demonstrates cleaved caspase-3 and cleaved PARP expression in PC3 cells treated with different concentrations of SAC-Par-4-GFP protein for 48 h; (E) PC3-NF-κB-luc, MAT-LyLu-NF-κB-luc, and HEK293-NF-κB-luc reporter cells were treated with TNFα (10 ng/ml) or SAC-Par-4-GFP protein in concentrations of 10, 20, and 30 μg/ml for 6 h. The results were represented as change in relative luciferase activity, ±SD of three independent experiments and <sup>∗</sup>*p* < 0.05; (F) MAT-LyLu cells pre-treated and co-injected with SAC-Par-4-GFP protein or vector control protein (20 μg in 100 μl cell suspension) were injected into the flank region of male Copenhagen rats (*n* = 3 for each group). After 10 days, tumor volumes were measured every day and plotted. Points in graphics are volume mean values and standard deviations. H&E staining confirmed presence of viable tumor cells in all the isolated tumor tissues.

#### Sarkar et al. Functional characterization of plant-derived SAC-Par-4

#### TABLE 2 | Segregation Analyses.

In the present study, we checked the anti-tumor activity of plant-derived partially purified SAC-Par-4-GFP by injecting MAT-LyLu cells pre-treated and co-injected with SAC-Par-4-GFP or VC protein. We observed visible tumors in animals injected with SAC-Par-4-GFP pre-treated cells on the 15th day after injection (*n* = 2/3). However, all the animals injected with VC pre-treated cells had a visible tumor by the sixth day after injection (*n* = 3/3; **Figure 8F**). The delayed onset of tumor growth with SAC-Par-4-GFP pre-treated MAT-LyLu cells might be due to the fact that pre-incubation with SAC-Par-4-GFP induced cell death, while the cells that escaped the cytotoxic effect repopulated to form a visible tumor at the later time point. Although the number of animals in this study is small, we noted a clearly visible difference in tumor volume between the SAC-Par-4-GFP treated and VC protein treated rat groups. A similar animal cohort (*n* = 3) has also been used by other investigators (Moore et al., 2001; Liu et al., 2014). These observations have encouraged us to test the efficacy of SAC-Par-4 protein in a larger animal cohort as a future prospect of this project. In this regard, we plan to use luciferase-labeled MAT-LyLu cells implanted orthotopically.

The *in vitro* and *in vivo* results of the present study may be considered a starting point for ascertaining the anti-prostate cancer activity of plant-derived recombinant SAC-Par-4. To effectively demonstrate the generalized cancer-specific efficacy of plant-isolated SAC-Par-4, more and different robust experiments need to be performed. These include evaluation and comparative analysis of the anti-tumorigenic activities of plant-derived SAC-Par-4 in several AR positive cell lines (LNCaP, LAPC4, and MDA PCa 2b), AR negative cell lines (PC3, MAT-LyLu, and DU145), normal prostate cell lines (PNT1-A/B, PNT2PrE, and PrS), and immortalized human prostate epithelial cells (PZHPV7), along with detailed studies in a large cohort of animals. Such rigorous approaches will be required to further establish the anti-cancer effect of plant-derived SAC-Par-4 and also to investigate its possible undesired side-effects on normal cells.

Taken together, in the current study we have engineered plants for efficient production and isolation of plant-derived SAC-Par-4-GFP protein. Importantly, this protein is biologically active and able to reduce the growth of the prostate cancer cell lines PC3 and MAT-LyLu. This cytotoxic effect potentially acts through NF-κB suppression and induction of apoptosis. Our *in vitro* and *in vivo* studies suggest the potential of plant-derived SAC-Par-4-GFP as a therapeutic agent against prostate cancer cells; however, further extensive studies are essential to evaluate the efficacy of this protein in a more clinically relevant model.

#### Acknowledgments

We are very grateful to the Kentucky Tobacco Research and Development Center (KTRDC) for facilities and support. This work was supported by the KY state KTRDC grant to IBM and ILS/Core fund to ND. This study was also supported in part by NIH R01 CA187273 and R21 CA179283 to VMR. SS and SJ are thankful to the University Grant Commission, New Delhi, India for their Ph.D. fellowships. The authors would like to thank Ms. Bonnie Kinney for her excellent care of the experimental tobacco plants and Mr. Madan Mohan Mallick, ILS, for helping in the animal tissue processing.

#### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00822

#### References

factor-related apoptosis-inducing ligand through induction of reactive oxygen species in prostate cancer cells. *Prostate Cancer Prostatic Dis.* 16, 16–22. doi: 10.1038/pcan.2012.37


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Sarkar, Jain, Rai, Sahoo, Raha, Suklabaidya, Senapati, Rangnekar, Maiti and Dey. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Commentary: Extracellular peptidase hunting for improvement of protein production in plant cells and roots

Karl J. Kunert\* and Priyen Pillay

*Molecular Plant Physiology Group, Plant Science Department, Forestry and Agricultural Biotechnology Institute, University of Pretoria, Pretoria, South Africa*

Keywords: proteases, recombinant protein production, protein stability, in silico analysis, molecular pharming

#### **A commentary on**

**Extracellular peptidase hunting for improvement of protein production in plant cells and roots** by Lallemand, J., Bouché, F., Desiron, C., Stautemas, J., de Lemos Esteves, F., Périlleux, C., et al. (2015). Front. Plant Sci. 6:37. doi: 10.3389/fpls.2015.00037

Despite much recent success in plant-based protein production, key challenges, such as undesired plant proteolytic activities, still severely compromise current recombinant protein production, with peptidases affecting protein stability (Pillay et al., 2013). The paper by Lallemand et al. (2015), which reports the identification of extracellular peptidases compromising protein production in plant cells and roots, is therefore an excellent contribution that advances our understanding of peptidase action in plant-based recombinant protein production. Since research has so far not paid a great amount of attention to this problem, a more detailed view, as taken in the paper, is highly beneficial for elucidating such peptidases in the extracellular space. This offers great benefits in terms of protein stability and higher protein production yield.

Previous approaches used to address this challenge in plants have, for example, included peptidase silencing by RNA interference technology (Voinnet et al., 2003; Hatsugai et al., 2004) and co-expression of specific protease inhibitors as "companions" to limit specific protease activities (Goulet et al., 2010, 2012; Pillay et al., 2012). However, silencing a specific peptidase or co-expressing a "companion" protease inhibitor always bears the risk that vital plant metabolic pathways are also affected (Van der Vyver et al., 2003; Senthil-Kumar et al., 2007). This can compromise efficient recombinant protein production in a plant-based system. In addition, work on Arabidopsis, as already done by Lallemand et al. (2015), with its existing wealth of transcriptome and gene data (The Arabidopsis Genome Initiative, 2000), will enable future identification of similar peptidases in other plant species when comparative genomics approaches are applied in combination with next-generation sequencing.

By investigating two plant species (Arabidopsis thaliana and Nicotiana tabacum), the Lallemand et al. (2015) study revealed, in particular, that root secretions contained more peptidase activity than, for example, the extracellular medium of cell suspensions. An environment less enriched in proteolytic activity is certainly more favorable for the production of recombinant proteins, especially antibodies. This key finding has therefore not only significantly extended our understanding of how particular plant species contribute to proteolytic activity and the type of peptidase produced, but has also advanced our understanding of how proteases in different plant parts can compromise recombinant protein stability. The study has thereby set a strong working basis for exploring proteolytic action in greater depth in the future.

#### Edited by:

*Kazuhito Fujiyama, Osaka University, Japan*

#### Reviewed by:

*Stefan Schillberg, Fraunhofer IME, Germany*

\*Correspondence: *Karl J. Kunert, karl.kunert@up.ac.za*

#### Specialty section:

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

Received: *11 May 2015* Accepted: *07 July 2015* Published: *21 July 2015*

#### Citation:

*Kunert KJ and Pillay P (2015) Commentary: Extracellular peptidase hunting for improvement of protein production in plant cells and roots. Front. Plant Sci. 6:557. doi: 10.3389/fpls.2015.00557*

Lallemand et al. (2015) also focused on establishing geno-transcriptome data. By tapping into the wealth of existing peptidase data, they further carried out an in-depth in silico analysis of existing Arabidopsis genome and transcriptome data. Remarkably, the search identified serine and metallo-peptidases as the main peptidases involved in proteolytic processes. These peptidases were consistently expressed in the two investigated production systems. By merging activity assays with geno-transcriptome data, specific Ser-peptidases potentially responsible for target degradation were identified. Lallemand et al. (2015) proposed that these peptidases should be the prime candidates for modification to improve protein stability.

Specific inhibition of Ser-proteases is certainly an attractive idea, which is also supported by previous findings (Goulet et al., 2012). However, the question remains how many other proteases there are, particularly in plants currently used for recombinant protein production, and what role(s) they play in protein production and stability. For example, commercial companies primarily use Nicotiana benthamiana, and the less conventional method of producing proteins in carrot cells is also applied. These plant species might have very different protease profiles. Such commercial preferences in industry are excellent indicators of which plant-based production systems researchers should include in their investigations. Consequently, more definitive investigations in protease profiling are required, with the option to avoid plant species whose profile is unfavorable for the production of a specific recombinant protein. In this regard, recent next-generation sequencing and proteomics approaches for protease profiling (Vandenabeele et al., 2003; van Wyk et al., 2014) will allow the identification of a great number of peptidases as well as the establishment of their particular expression profiles in plant species targeted for recombinant protein production. In addition, more focused assessments of recombinant protein susceptibility to proteases have to be carried out to identify potential cleavage sites within the protein. These considerations and risks are encapsulated in our pipeline for enhancing protein expression (**Figure 1**), which illustrates two stages where proteins are most vulnerable to proteolysis. During extraction, a different complement of plant-derived proteases may be released from a cellular compartment other than the one where the target protein is originally localized, and these proteases may also be co-purified during the purification process. Once the inherent susceptibility of the target protein is determined, appropriate inhibitors can be used to ameliorate the negative effects of proteases during extraction and purification.
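
As a purely illustrative sketch of such a susceptibility assessment, the Python snippet below scans a protein sequence for predicted cleavage sites using the well-known trypsin-like rule (cleavage after K or R unless the next residue is P). This rule, the function names and the example sequence are assumptions chosen for illustration. They are not taken from Lallemand et al. (2015), and the specificities of plant extracellular Ser-peptidases may well differ.

```python
import re

# Assumed, illustrative rule: a trypsin-like serine protease cleaves after
# K or R unless the next residue is P. Real plant peptidases may differ.
TRYPSIN_LIKE = re.compile(r"(?<=[KR])(?!P)")


def predicted_cleavage_sites(sequence, rule=TRYPSIN_LIKE):
    """Return 1-based positions of residues after which cleavage is predicted.

    A match at the very end of the sequence is ignored, because cleaving
    after the last residue would not produce a new fragment.
    """
    seq = sequence.upper()
    return [m.start() for m in rule.finditer(seq) if m.start() < len(seq)]


def fragments(sequence, sites):
    """Split the sequence into the fragments implied by the cleavage sites."""
    bounds = [0, *sites, len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical 12-residue sequence, not a real recombinant protein.
example = "MKTAYRPLKGSR"
sites = predicted_cleavage_sites(example)
print(sites)                      # [2, 9]
print(fragments(example, sites))  # ['MK', 'TAYRPLK', 'GSR']
```

In this toy example the R at position 6 is not reported because it is followed by P, and the C-terminal R is skipped because no fragment would result. Rules for other protease classes could be passed through the `rule` argument, but any such prediction would still need to be confirmed experimentally.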

Without doubt, the study is, as Lallemand et al. (2015) have already outlined, an excellent starting point to develop new strategies for identifying proteolytic activity with the goal of enhancing recombinant protein stability.

#### References


#### Funding

This work was supported by the National Research Foundation (NRF) through NRF incentive funding to KK and an NRF bursary to PP.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Kunert and Pillay. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.