# VERSATILE ROLES OF ORGANELLE OUTER MEMBRANES IN INTRACELLULAR COMMUNICATION

EDITED BY: Kentaro Inoue

PUBLISHED IN: Frontiers in Plant Science

#### *Frontiers Copyright Statement*

*© Copyright 2007-2015 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-604-3 DOI 10.3389/978-2-88919-604-3

# **VERSATILE ROLES OF ORGANELLE OUTER MEMBRANES IN INTRACELLULAR COMMUNICATION**

Topic Editor: **Kentaro Inoue,** University of California at Davis, USA

This topic covers emerging knowledge about the properties and functions of the outer membranes of chloroplasts and mitochondria. These outer membranes house various processes necessary for efficient communication and thus integration of the organelles with, and into, their surroundings in the cytoplasm. Such processes include, but are not limited to, protein import, organelle division, organelle movement, metabolism, and metabolite/ion transport. Recent molecular genetic, biochemical and cell biological studies have revealed functions of various outer membrane proteins. These findings have helped address and generate diverse biological and evolutionary questions at the molecular, cellular and whole-organism levels. The topic should encourage contributions from scientists in various disciplines and thus provide the field with opportunities to "think outside the box" and to develop potential collaborations. The topic also aims to stimulate the interest of a general audience in the outer membranes of chloroplasts and mitochondria.

**Citation:** Inoue, K., ed. (2015). Versatile Roles of Organelle Outer Membranes in Intracellular Communication. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-604-3

# Table of Contents


Naomi J. Marty, Howard J. Teresinski, Yeen Ting Hwang, Eric A. Clendening, Satinder K. Gidda, Elwira Sliwinska, Daiyuan Zhang, Ján A. Miernyk, Glauber C. Brito, David W. Andrews, John M. Dyer and Robert T. Mullen

*Border control: selectivity of chloroplast protein import and regulation at the TOC-complex*

Emilie Demarsy, Ashok M. Lakshmanan and Felix Kessler

*Targeting and assembly of components of the TOC protein import complex at the chloroplast outer envelope membrane*

Lynn G. L. Richardson, Yamuna D. Paila, Steven R. Siman, Yi Chen, Matthew D. Smith and Danny J. Schnell

*A new member of the psToc159 family contributes to distinct protein targeting pathways in pea chloroplasts*

WaiLing Chang, Jürgen Soll and Bettina Bölter

Allison K. Strohm, Greg A. Barrett-Wilt and Patrick H. Masson

*Evolution and targeting of Omp85 homologs in the chloroplast outer envelope membrane*

Philip M. Day, Daniel Potter and Kentaro Inoue

*Monogalactosyldiacylglycerol synthesis in the outer envelope membrane of chloroplasts is required for enhanced growth under sucrose supplementation*

Masato Murakawa, Mie Shimojima, Yuichi Shimomura, Koichi Kobayashi, Koichiro Awai and Hiroyuki Ohta

*The selective biotin tagging and thermolysin proteolysis of chloroplast outer envelope proteins reveals information on protein topology and association into complexes*

Hélène Hardré, Lauriane Kuhn, Catherine Albrieux, Juliette Jouhet, Morgane Michaud, Daphné Seigneurin-Berny, Denis Falconet, Maryse A. Block and Eric Maréchal

*Production of viable seeds from the seedling lethal mutant* **ppi2-2** *lacking the atToc159 chloroplast protein import receptor using plastic containers, and characterization of the homozygous mutant progeny*

Akari Tada, Fumi Adachi, Tomohiro Kakizaki and Takehito Inaba

# Emerging knowledge of the organelle outer membranes – research snapshots and an updated list of the chloroplast outer envelope proteins

Kentaro Inoue\*

*Department of Plant Sciences, University of California at Davis, Davis, CA, USA*

Keywords: Arabidopsis, chloroplast, membrane proteins, mitochondria, outer membrane

Mitochondria and chloroplasts are two distinct organelles essential for plant viability. They evolved from prokaryotic endosymbionts and share a common ancestor with extant Gram-negative bacteria (Gray et al., 1999; Gould et al., 2008). Successful conversion of the free-living prokaryotes to the cytoplasmic organelles via endosymbiosis required conservation and adaptation of the outer membranes to the dramatic change of surroundings. In prokaryotes, the outer membrane serves as a physical barrier that protects cells from the extracellular environment and allows import of necessary nutrients, and also directly participates in interaction with other organisms (Nikaido, 2003). As part of the semi-autonomous organelles, by contrast, the outer membranes of mitochondria and chloroplasts have gained the ability to participate in intracellular communication and organelle biogenesis, i.e., import and export of various ions and metabolites, import of nuclear-encoded proteins, various metabolic processes including the biosynthesis of membrane lipids, and division and movement of the organelles that require physical interaction with cytoplasmic components (Breuers et al., 2011; Inoue, 2011; Duncan et al., 2013). Our understanding of the organelle outer membranes has advanced greatly in the last decade or so, and the last eight years have seen about a three-fold increase in the number of proteins identified or predicted to be in the chloroplast outer envelope of Arabidopsis thaliana (Arabidopsis) [total 117 proteins listed in **Table 1**; compare 34 proteins in Inoue (2007)]. This Research Topic is intended to provide snapshots of recent research on the organelle outer membranes. It collects seven original research articles, three review articles and two method articles, which can be divided into four groups according to subject – (1) outer membrane protein targeting, (2) functions, targeting and evolution of protein import components, (3) lipid metabolism, and (4) method development.

#### Edited and reviewed by:

*Simon Gilroy, University of Wisconsin – Madison, USA*

> \*Correspondence: *Kentaro Inoue, kinoue@ucdavis.edu*

#### Specialty section:

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science*

Received: *27 March 2015* Accepted: *07 April 2015* Published: *30 April 2015*

#### Citation:

*Inoue K (2015) Emerging knowledge of the organelle outer membranes – research snapshots and an updated list of the chloroplast outer envelope proteins. Front. Plant Sci. 6:278. doi: 10.3389/fpls.2015.00278*

# 1. Protein Targeting to the Organelle Outer Membranes

All proteins identified so far in the organelle outer membranes are encoded in the nucleus (e.g., **Table 1**), and most of them use internal signals for targeting. This is distinct from the case for most nuclear-encoded proteins found inside the organelles: they are synthesized with N-terminal extensions, which are necessary and sufficient for proper targeting via the general pathway and are cleaved upon import in the matrix (mitochondria) or stroma (chloroplasts). Lee et al. (2014) review the current knowledge of pathways and signals needed for targeting of three types of outer membrane proteins – signal-anchored (SA), tail-anchored (TA), and β-barrel proteins. SA and TA proteins are anchored to the membrane via a single transmembrane (TM) α-helix with either N<sub>intermembrane space</sub>-C<sub>cytosol</sub> (for SA) or N<sub>cytosol</sub>-C<sub>intermembrane space</sub> (for TA) orientation. β-Barrel proteins are integrated into the membrane via multiple TM-β-strands, whose formation appears to require evolutionarily conserved machinery in the membrane. Marty et al. (2014) have used a transient expression system with Nicotiana tabacum Bright Yellow-2 suspension cells to identify two types of targeting signals for mitochondrial TA proteins. They then performed a database search, increasing the number of mitochondrial TA proteins from 20 to 54. Interestingly, 16 of the mitochondrial outer membrane proteins identified by the previous work (Duncan et al., 2013) and by Marty et al. (2014) are also found in the chloroplast outer envelope membrane (**Table 1**). This may suggest the presence of targeting mechanisms and functions shared between the outer membranes of the two organelles.

#### TABLE 1 | One hundred and seventeen proteins identified or predicted to be in the outer membrane of the Arabidopsis chloroplast envelope.<sup>a</sup>

*<sup>a</sup>Names and functional categories are based on literature cited in this work and databases. See Supplementary Material* Table S1 *for the extended name (if any), the location curated by various databases, and other predicted properties based on the primary sequence for each protein.*

*<sup>b</sup>Arabidopsis gene identifier (AGI) number, which represents the systematic designation given to each locus, gene, and its corresponding protein product by The Arabidopsis Information Resource (TAIR: https://www.arabidopsis.org/).*

*<sup>c</sup>This list includes in total 117 proteins from two earlier review articles [32 from (i) Inoue (2007) and 44 from (ii) Breuers et al. (2011)], two recent chloroplast outer envelope proteomics studies [50 from (iii) Simm et al. (2013) and 58 from (iv) Gutierrez-Carbonell et al. (2014)], and five reports on individual outer envelope proteins [(v) PAP2 by Sun et al. (2012), (vi) SP1 by Ling et al. (2012), (vii) OEP23 by Goetze et al. (2015), (viii) PAPST1 by Xu et al. (2013), and (ix) pBrP by Lagrange et al. (2003)]. Note that Gigolashvili et al. (2012) predicts inner-envelope localization of PAPST1, and that the AGI number for pBrP was updated from At4g36655.*

*<sup>d</sup>YES indicates that the given protein was found in the chloroplast envelope proteomic studies (Ferro et al., 2003, 2010; Froehlich et al., 2003), which are listed in The Plant Proteome Database (PPDB: http://ppdb.tc.cornell.edu/) (Sun et al., 2009).*

*<sup>e</sup>Proteins found in the mitochondrial outer membrane by (x) Duncan et al. (2013) or (xi) Marty et al. (2014).*

# 2. Functions, Targeting and Evolution of Protein Import Components

The most-studied chloroplast outer membrane proteins are subunits of the TOC (translocon at the outer-envelope-membrane of chloroplasts) machinery, which catalyzes the general pathway to import nuclear-encoded precursor proteins from the cytosol. Among the TOC components are the homologous GTPases Toc159 and Toc34, which recognize the precursors and regulate their import, and Toc75, which forms a protein-conducting channel. In Arabidopsis, there are four Toc159 isoforms which show substrate selectivity, two catalytically redundant Toc34 isoforms, and one functional Toc75 encoded on chromosome III (**Table 1**). Demarsy et al. (2014) review the current knowledge about how these subunits function and regulate protein import. Richardson et al. (2014) summarize available results and discuss functions, targeting and assembly of TOC subunits. Importantly, both review articles recognize outstanding questions about the TOC components, including the mechanisms of precursor recognition and their insertion into the membrane. By biochemical assays using chloroplasts isolated from pea seedlings, radiolabeled precursor proteins and recombinant proteins, Chang et al. (2014) demonstrate interaction of Toc159 isoforms called Toc132/Toc120 with a chloroplast superoxide dismutase (FSD1) that was predicted to comprise an exceptionally short import signal but has been shown otherwise, and also map the interaction domains beyond the N terminus. The interaction of FSD1 with Toc132, but not with Toc159, was also demonstrated by a split-ubiquitin yeast two-hybrid assay (Dutta et al., 2014). Grimmer et al. (2014) have used an in vivo approach, transiently producing GFP-tagged proteins in protoplasts of various Arabidopsis mutants and determining their N-terminal sequences by mass spectrometry analyses, and demonstrate that a plastid RNA binding protein is a substrate of Toc159. The Arabidopsis protoplast transient expression assay has also been used to define sequences required for targeting and membrane integration of a Toc159 ortholog (Lung et al., 2014). A previous genetic screen had demonstrated that Toc132 and Toc75 enhance root gravitropism signal transduction (Stanga et al., 2009). Strohm et al. (2014) now provide evidence supporting the involvement of plastids, instead of direct participation of TOC subunits, in gravitropism signal transduction. Finally, Day et al. (2014) report phylogenetic relationships and in vitro targeting of the Toc75 homologs including the truncated forms of OEP80/Toc75-V, which are also known as P39 (Hsueh et al., 2014) and P36 (Nicolaisen et al., 2015) (**Table 1**).

# 3. Lipid Metabolism

Under phosphate starvation, phospholipids in the cell membranes, mainly those in extraplastidic compartments, are used as the source of free phosphates and substituted by galactolipids made in the chloroplast outer envelope. Murakawa et al. (2014) have used Arabidopsis mutants and feeding assays to show that the outer-envelope-dependent galactolipid synthesis is stimulated by sucrose supplementation and this stimulation in turn enhances utilization of the added sucrose for plant growth. This work nicely illustrates the physiological significance of the metabolic activity localized in the chloroplast outer envelope for plant growth and development.

# 4. Method Development

Hardré et al. (2014) report an attempt to apply biotin tagging and proteolysis to examine the topology and membrane association of proteins in the spinach chloroplast. Although the work requires further refinement to achieve the desired specificity, the idea behind this approach is quite interesting. The toc159-null mutant is seedling-lethal and thus has been examined among the progeny of heterozygous parents. Tada et al. (2014) have established a method that uses Ziploc<sup>®</sup> containers to grow homozygous toc159 mutants on sucrose-supplemented media to the point that viable seeds can be obtained. This cost-effective method should be useful to study not only the toc159-null plant but also other recessive lethal mutants of photosynthesis.

In summary, the collection highlights various questions about the organelle outer membranes and the interdisciplinary approaches employed to address them. Future research should use these and other strategies to answer questions about proteins of known function, in particular those involved in protein homeostasis, as well as those of unknown function (**Table 1**). The editor gratefully acknowledges the excellent contributions of all the authors and the constructive comments by expert reviewers on each of the articles.

# Acknowledgments

This work was supported by the Division of Molecular and Cellular Biosciences at the US National Science Foundation (Grant No. 1050602).

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00278/full

Table S1 | Extended names, curated locations and some other information of 117 proteins listed in Table 1.


plant-specific member of the TFIIB-related protein family. Mol. Cell. Biol. 23, 3274–3286. doi: 10.1128/MCB.23.9.3274-3286.2003


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Inoue. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

**REVIEW ARTICLE** published: 29 April 2014 doi: 10.3389/fpls.2014.00173

# Specific targeting of proteins to outer envelope membranes of endosymbiotic organelles, chloroplasts, and mitochondria

*Junho Lee<sup>1†</sup>, Dae Heon Kim<sup>1†</sup> and Inhwan Hwang<sup>1,2</sup>\**

<sup>1</sup> Cellular Systems Biology, Department of Life Sciences, Pohang University of Science and Technology, Pohang, South Korea <sup>2</sup> Division of Integrative Biosciences and Bioengineering, Pohang University of Science and Technology, Pohang, South Korea

#### *Edited by:*

Kentaro Inoue, University of California at Davis, USA

#### *Reviewed by:*

Ben Matthew Abell, Sheffield Hallam University, UK

Hsou-min Li, Academia Sinica, Taiwan

#### *\*Correspondence:*

Inhwan Hwang, Cellular Systems Biology, Department of Life Sciences and Division of Integrative Biosciences and Bioengineering, Pohang University of Science and Technology, Hyojadong, Nam-Gu, Pohang 790-784, South Korea e-mail: ihhwang@postech.ac.kr

†Junho Lee and Dae Heon Kim contributed equally to this work.

Chloroplasts and mitochondria are endosymbiotic organelles thought to be derived from endosymbiotic bacteria. In present-day eukaryotic cells, these two organelles play pivotal roles in photosynthesis and ATP production. In addition to these major activities, numerous reactions and cellular processes that are crucial for normal cellular functions occur in chloroplasts and mitochondria. To function properly, these organelles constantly communicate with the surrounding cellular compartments. This communication includes the import of proteins, the exchange of metabolites and ions, and interactions with other organelles, all of which heavily depend on membrane proteins localized to the outer envelope membranes. Therefore, correct and efficient targeting of these membrane proteins, which are encoded by the nuclear genome and translated in the cytosol, is critically important for organellar function. In this review, we summarize the current knowledge of the mechanisms of protein targeting to the outer membranes of mitochondria and chloroplasts in two different directions, as well as targeting signals and cytosolic factors.

**Keywords: AKR2, β-barrel proteins, chloroplasts, endosymbiotic organelles, mitochondria, outer membrane proteins, signal-anchored proteins, tail-anchored protein**

### **INTRODUCTION**

Mitochondria and plastids are two endosymbiotic organelles that have contributed greatly to the evolution of present-day eukaryotic cells. Mitochondria exist in all eukaryotic cell types and are involved in apoptosis, respiratory ATP production, and iron-sulfur cluster assembly (Schleiff and Becker, 2011). By contrast, plastids exist in plants and algae and, in plants, differentiate into multiple subtypes depending on the cell type. Of these plastid-derived organelles, chloroplasts participate in numerous essential metabolic and cellular processes, including photosynthesis, amino acid and lipid metabolism, cell signaling, and host defense (Kessler and Schnell, 2009). According to the endosymbiont hypothesis, mitochondria and plastids evolved from endosymbiotic bacteria, i.e., an α-proteobacterium and a cyanobacterium, respectively, in eukaryotic ancestor cells in a sequential manner, with mitochondria evolving first (John and Whatley, 1975; Cavalier-Smith, 2000; Dolezal et al., 2006). During their organellogenesis, one key event was the massive transfer of genetic information from the endosymbionts to the host cell nucleus. Currently, more than 95% of mitochondrial and plastid proteins in present-day eukaryotic cells are encoded by the nuclear genome, synthesized on cytosolic ribosomes, and imported into these organelles (Leister, 2003; Sickmann et al., 2003; Schleiff and Becker, 2011).

Like their ancestors, chloroplasts and mitochondria contain two envelope membranes that function as chemical and physical barriers to separate organelle-localized metabolic reactions and processes from the cytosol. At the same time, to ensure their organellar functions as part of the cellular system, mitochondria and chloroplasts have evolved ways to communicate with their surroundings at the two envelope membranes, often employing direct physical interactions with other cellular compartments. These communication processes include import of nuclear-encoded proteins and exchanges of metabolites and ions (Inoue, 2011). The nuclear genome encodes all chloroplast and mitochondrial outer membrane and intermembrane space (IMS) proteins as well as most inner membrane and interior proteins (Sato et al., 1999; Neupert and Herrmann, 2007; Schmidt et al., 2010). Of these organellar proteins, outer envelope proteins play crucial roles in many cellular processes such as protein import into organelles, organelle movement and division, and lipid synthesis. These processes are essential not only for the function of their cognate organelles but also for plant development and growth under normal conditions, as well as survival under adverse environmental conditions (Inoue, 2011). Moreover, the mitochondrial outer membrane harbors proteins that control central cellular events such as apoptosis and innate immunity (Walther and Rapaport, 2009).

For all of these processes to occur successfully and efficiently, organellar proteins must first be specifically targeted to their destinations following translation on cytosolic ribosomes. The outer membrane proteins are a heterogeneous group that can be divided, based on their structure, into multiple types, including α-helical transmembrane domain (TMD)-containing proteins and β-barrel proteins consisting of multiple transmembrane β-strands. Moreover, TMD-containing proteins are further divided into four types according to their topology: those with an N-terminal TMD, a middle TMD, a C-terminal TMD, or multiple TMDs. These different types of membrane proteins are targeted to their destinations by different mechanisms (Hofmann and Theg, 2005; Walther and Rapaport, 2009; Kim and Hwang, 2013). To understand the targeting of these proteins, it is essential to identify their targeting signals as well as the molecular machinery involved in this targeting. Recently, significant progress has been made in the identification of the targeting signals of membrane proteins (Walther et al., 2009b; Dhanoa et al., 2010; Lee et al., 2011; Weis et al., 2013). By contrast, limited information is available about the machinery used for targeting. In this review, we mainly focus on recent advances in understanding the cytosolic events of protein targeting to the outer envelope membranes in two endosymbiotic organelles in plants, i.e., chloroplasts and mitochondria.

### **TARGETING SIGNALS OF CHLOROPLAST AND MITOCHONDRIAL OUTER MEMBRANE PROTEINS**

#### **TARGETING SIGNALS OF CHLOROPLAST AND MITOCHONDRIAL SIGNAL-ANCHORED PROTEINS**

Signal-anchored (SA) proteins are a class of membrane proteins that contain a single TMD in their N-terminal region. SA proteins are involved in important biological processes, functioning as receptors of chloroplast and mitochondrial precursor proteins and biosynthetic enzymes of lipid membranes (Dukanovic and Rapaport, 2011; Inoue, 2011). Mitochondrial and chloroplast SA proteins lack a cleavable targeting sequence, such as a presequence or transit peptide, which mediates specific targeting to the mitochondria or chloroplasts, respectively. Instead, the TMD functions as the targeting signal required for targeting to the correct location, as is also the case for endoplasmic reticulum (ER)-targeted SA proteins (Rapoport, 2007; Schleiff and Becker, 2011). The majority of studies on the targeting mechanisms of mitochondrial SA proteins have been carried out with mammalian and yeast proteins. These studies did not reveal conserved sequence motifs in the TMDs that determine targeting specificity. Instead, moderate hydrophobicity of the TMD is one feature that is critical for the mitochondrial targeting of SA proteins (Kanaji et al., 2000). When the TMD hydrophobicity of rat Tom20 (rTom20) was increased by introducing more hydrophobic leucine residues, the mutant form of rTom20 was mistargeted to the ER instead of the mitochondria. Similarly, replacement of the TMD of yeast Tom20 (yTom20) with the more hydrophobic TMD of an ER SA protein inhibited mitochondrial targeting (Waizenegger et al., 2003). The importance of a moderately hydrophobic TMD for mitochondrial targeting was confirmed for plant mitochondrial SA proteins in *Arabidopsis* protoplasts (Lee et al., 2011). Intriguingly, the TMDs of chloroplast SA proteins also have moderate hydrophobicity. Increasing the hydrophobicity of the TMD in atToc64 altered its localization from the chloroplast to the plasma membrane (PM; Lee et al., 2004), indicating that the moderately hydrophobic TMD is important for the targeting of SA proteins to both endosymbiotic organelles. However, the concept of moderate hydrophobicity is too ambiguous to be used to differentiate mitochondrial/chloroplast SA proteins from ER SA proteins. A recent study on the targeting of a large number of ER, mitochondrial, and chloroplast SA proteins of *Arabidopsis* revealed that the Wimley and White (WW) hydrophobicity scale is the most accurate for differentiating targeting specificity based on the hydrophobicity value of the TMD; more than 85% of all ER SA proteins have a hydrophobicity value greater than 0.4 on the WW hydrophobicity scale, and more than 89% of the mitochondrial and chloroplast SA proteins have hydrophobicity values below 0.4 on the same scale (Lee et al., 2011). This rule also applies to most mammalian and yeast mitochondrial SA proteins, suggesting that it applies to all eukaryotic cells.

Another critical motif for the targeting of mitochondrial and chloroplast SA proteins is the C-terminal positively charged flanking region (CPR) of the TMD. CPRs usually contain three or more basic residues (arginines and/or lysines) within a short C-terminal flanking region of the TMD (Rapaport, 2003; Lee et al., 2011). Both the moderately hydrophobic TMD and the CPR are required for targeting to mitochondria and chloroplasts. Similarly, the basic residues are crucial for the proper targeting of mammalian mitochondrial SA proteins. When basic residues in the CPRs of rTom20 and rTom70 are substituted with serine residues, the mutant proteins are targeted to the ER or Golgi but not to the mitochondria (Kanaji et al., 2000; Suzuki et al., 2002). In yeast, by contrast, the CPR is not crucial for mitochondrial targeting; substitution of basic residues with serines is tolerated (Waizenegger et al., 2003). However, substitution of basic residues with acidic residues inhibits mitochondrial targeting in yeast, indicating that the amino acid composition of the C-terminal flanking region is an important determinant of the targeting specificity of SA proteins (Waizenegger et al., 2003). The CPR is also crucial for the targeting of chloroplast SA proteins. Substitution of basic residues with glycines alters the localization of the chloroplast SA proteins OEP7 and atToc64 to the PM in *Arabidopsis* protoplasts. To date, however, the exact definition of the CPR has not been established. Three basic residues in the CPR are a minimal requirement (Lee et al., 2001, 2004), but the density of basic amino acid residues in a short C-terminal flanking region appears to be crucial for CPR function. In addition, other factors such as the amino acid composition of the CPR and its distance from the TMD are important features in defining the CPR. Unlike mammalian cells and yeast, plant cells contain two endosymbiotic organelles, chloroplasts and mitochondria. Therefore, another challenging issue in plants is how they specifically target chloroplast or mitochondrial SA proteins, which have similar targeting signals consisting of the moderately hydrophobic TMD and the CPR.
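The two features described above, a WW hydrophobicity value around the 0.4 cut-off and a CPR with at least three basic residues, can be written down as a rough decision rule. The Python sketch below is an illustration only and is not the procedure of Lee et al. (2011): it assumes that the WW value of the TMD has already been computed elsewhere, that the caller chooses which residues count as the C-terminal flank (the text notes that the CPR has no exact definition), and the function names and returned labels are hypothetical. Because the reported figures are proportions (more than 85% and more than 89%), the rule is a heuristic and not a strict classifier.

```python
# Heuristic sketch; the 0.4 cut-off and the three-residue minimum come from the
# text, while names, labels, and the handling of borderline cases are assumptions.
WW_THRESHOLD = 0.4        # WW-scale value separating most ER from organellar SA proteins
MIN_BASIC_RESIDUES = 3    # minimal number of Arg/Lys residues reported for a CPR
BASIC_RESIDUES = frozenset("RK")


def count_basic_residues(flank: str) -> int:
    """Count arginine (R) and lysine (K) residues in a one-letter-code sequence."""
    return sum(1 for residue in flank.upper() if residue in BASIC_RESIDUES)


def classify_sa_protein(tmd_ww_value: float, c_terminal_flank: str) -> str:
    """Return a rough, hypothetical label for a signal-anchored protein.

    tmd_ww_value: precomputed WW hydrophobicity value of the TMD (assumed input).
    c_terminal_flank: residues C-terminal to the TMD (window chosen by the caller).
    """
    if tmd_ww_value > WW_THRESHOLD:
        return "ER-like"
    has_cpr = count_basic_residues(c_terminal_flank) >= MIN_BASIC_RESIDUES
    if tmd_ww_value < WW_THRESHOLD and has_cpr:
        return "mitochondrion/chloroplast-like"
    return "undetermined"


if __name__ == "__main__":
    # Made-up inputs for illustration, not real protein data.
    print(classify_sa_protein(0.55, "SGNLE"))
    print(classify_sa_protein(0.25, "KRSKL"))
    print(classify_sa_protein(0.25, "SGSLE"))
```

The sketch deliberately returns a single combined label for mitochondria and chloroplasts, since the features considered here do not distinguish between the two organelles.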

#### **TARGETING SIGNALS OF CHLOROPLAST AND MITOCHONDRIAL TAIL-ANCHORED PROTEINS**

Tail-anchored (TA) proteins are another class of chloroplast and mitochondrial outer membrane proteins that contain a single TMD at their C-terminal region. Like SA proteins, TA proteins do not harbor a cleavable signal sequence for their targeting (Borgese et al., 2007). The targeting of mitochondrial TA proteins has been studied in mammalian and yeast systems, and these studies have suggested that the targeting signal of mitochondrial TA proteins consists of a short TMD with moderate hydrophobicity and basic residues in the C-terminal sequence (CTS) following the TMD (Rapaport, 2003; Dukanovic and Rapaport, 2011). In plants, the targeting signal of mitochondrial TA proteins has been studied using cytochrome b5 isoforms of tung (*Aleurites fordii*; Hwang et al., 2004). Like mammalian and yeast mitochondrial TA proteins, basic residues in the CTS following the TMD are important for mitochondrial targeting. In addition, the amino acid composition of the TMD is also important for mitochondrial targeting (Hwang et al., 2004).

Many chloroplast TA proteins have been identified (**Table 1**), but the targeting of most of these proteins has not been analyzed in detail. Chloroplast TA proteins do not seem to share any conserved sequence for targeting specificity. A noticeable feature of these proteins is that the hydrophobicity value of TMDs of chloroplast TA proteins appears to vary significantly compared to that of mitochondrial TA proteins containing a moderately hydrophobic TMD. It is possible that the hydrophobicity of TMD is not an important factor for determining chloroplast targeting. Instead, plants have a unique targeting mechanism, as demonstrated for the GTPase domain that acts as a targeting signal of Toc33 and Toc34 (Dhanoa et al., 2010), as described below. In addition, the net charge in the CTS of chloroplast TA proteins does not exhibit any trend, although the basic net charge is an important factor for the proper targeting of mitochondrial TA proteins. Still, basic residues are present in both sides of TMDs to produce a positive net charge. More detailed analysis is required to define the relationship between the net charge in the CTS and the targeting specificity of chloroplast TA proteins.

Only a few of these chloroplast TA proteins have been studied in detail. These include Toc33 and Toc34, which are involved in the import of transit peptide-containing precursors into chloroplasts. Toc33 and 34 have two domains, the N-terminal GTPase and the C-terminal TMD; the TMD is involved in anchoring to the outer membrane of the chloroplast. Unlike mitochondrial TA proteins, the C-terminal region including the TMD of Toc33 or 34 is necessary but not sufficient for chloroplast targeting *in vivo* (Dhanoa et al., 2010). In addition to the TMD, the GTPase domain is also necessary for chloroplast localization. Toc159 is another chloroplast TA protein with a GTPase domain; the GTPase domain alone binds to the surface of the chloroplast *in vitro* (Smith et al., 2002). However, given that a truncated form of Toc159 lacking the GTPase domain also binds to the surface of the chloroplast, the role of the GTPase domain in the targeting mechanism of TA proteins seems to be restricted to specific cases, such as Toc33 and Toc34. The GTPase domain of Toc33 interacts with that of Toc159 (Bauer et al., 2002; Smith et al., 2002); thus the interaction between the two GTPase domains has been suggested to be involved in the targeting of Toc33 to the outer membrane of the chloroplast.

The chloroplast targeting mechanism of OEP9 is slightly different from that of Toc33 and Toc34. In the case of OEP9, the TMD and CTS are necessary and sufficient for targeting to the chloroplast (Dhanoa et al., 2010). In fact, replacing the CTS of tung mitochondrial cytochrome b5 with the CTS of OEP9 causes this protein to be targeted to the chloroplast. Similarly, the CTS of OEP9 mediates chloroplast localization of a truncated form of Toc33 lacking the GTPase domain. Analysis of various point mutants has suggested that the net charge or charge distribution in the CTS of OEP9 is crucial for mediating chloroplast targeting (Dhanoa et al., 2010). However, the physicochemical properties of the CTS of OEP9 may not apply generally to the CTSs of other proteins involved in chloroplast targeting. For example, when the CTS of *Arabidopsis* chloroplast cytochrome b5 is replaced with that of tung mitochondrial cytochrome b5, the chimeric form of tung mitochondrial cytochrome b5 is still targeted to the mitochondria (Hwang et al., 2004). The CTS of Toc33 even inhibits the chloroplast targeting of OEP9. These results suggest that the CTSs of *Arabidopsis* chloroplast TA proteins are not always sufficient to support chloroplast targeting, and additional sequence information is necessary depending on the specific protein.



Chloroplast and mitochondrial outer membrane proteins are classified as signal anchored (SA), tail anchored (TA), or β-barrel proteins according to their topology. SA and TA proteins were searched in the UniProtKB database (http://www.uniprot.org/) based on the criteria of experimentally confirmed localization and the existence of a predicted single transmembrane domain. Information about β-barrel proteins was obtained from the literature, which describes their localizations, as determined through proteomic analysis. Proteins indicated by asterisks were not searched in the UniProtKB database but their localizations were confirmed by fluorescence microscopy of GFP-tagged fusion proteins.

#### **TARGETING SIGNALS OF CHLOROPLAST AND MITOCHONDRIAL β-BARREL PROTEINS**

β-barrel membrane proteins, comprising multiple transmembrane β-strands, are also found in chloroplast and mitochondrial outer membranes (Walther et al., 2009b). These proteins are involved in transporting metabolites, ions, or precursor proteins at the outer membranes of chloroplasts and mitochondria. Mitochondrial β-barrel proteins do not contain cleavable targeting sequences involved in delivery to the mitochondria from the cytosol after translation (Rapaport, 2003). The targeting signal is not restricted to a specific region but is dispersed throughout the polypeptide sequence (Court et al., 1996; Rapaport and Neupert, 1999). Based on these features, it has been proposed that the targeting information is contained in the secondary and/or tertiary structures rather than in a specific sequence motif of the primary sequence (Walther et al., 2009b). Interestingly, bacterial β-barrel proteins are targeted to, and properly inserted into, the outer membrane of mitochondria in yeast (Walther et al., 2009a). This result suggests that the mitochondrial targeting information of β-barrel proteins has been derived from that of the ancestral β-barrel proteins.

Chloroplasts also contain β-barrel proteins in the outer envelope membrane. Toc75 is a chloroplast β-barrel protein that functions as the channel for translocation of transit peptide-containing chloroplast precursor proteins across the outer membrane (Tranel and Keegstra, 1996). Unlike mitochondrial β-barrel proteins, Toc75 contains a cleavable targeting sequence, the transit peptide, which is essential for targeting to the chloroplast after translation in the cytosol (Tranel and Keegstra, 1996). Another β-barrel protein, Toc75-V/OEP80, an isoform of Toc75, was also predicted to have a transit peptide at its N-terminus. However, the N-terminal region of Toc75-V/OEP80 is dispensable for chloroplast targeting (Patel et al., 2008). Similarly, other β-barrel proteins such as OEP24 and OEP37, but not OEP21, were predicted to have transit peptides; however, whether the predicted transit peptides are involved in chloroplast targeting remains to be experimentally confirmed. Overall, it remains largely unclear whether the targeting of β-barrel proteins to chloroplasts is similar to that to mitochondria.

Most functional studies on the biogenesis of β-barrel proteins have been performed on bacteria, mammals and yeast but, unfortunately, specific cytosolic components for the sorting and targeting of β-barrel proteins of chloroplasts or mitochondria have not yet been identified in plants, mammals or yeast. Intriguingly, with the exception of Toc75, all outer membrane proteins identified to date, including β-barrel proteins in the chloroplast and mitochondria, are synthesized at the mature size without a cleavable signal sequence (Rapaport, 2003; Hofmann and Theg, 2005; Chacinska et al., 2009; Li and Chiu, 2010; Schleiff and Becker, 2011). Chloroplast-targeted Toc75 has an N-terminal signal sequence consisting of two parts. The first part is a typical transit peptide, whereas the second part comprises a region rich in hydrophobic residues and a polyglycine stretch (Inoue and Keegstra, 2003). The first part of the targeting signal is cleaved in the stroma by stromal processing peptidase, whereas the second part is removed by a membrane-bound peptidase known as plastidic type I signal peptidase 1 (Inoue et al., 2005; **Figure 1**).
However, the exact mechanism of its insertion into the chloroplast outer membrane is still unknown.

In the case of the yeast mitochondria, β-barrel proteins are initially recognized by the TOM complex, consisting of Tom20 and Tom70. These proteins are then translocated into the IMS through the import channel Tom40. In the IMS, the chaperone complexes Tim9-Tim10 and Tim8-Tim13 bind to β-barrel proteins (Hoppins and Nargang, 2004; Wiedemann et al., 2004) and participate in the transport of these proteins to the sorting and assembly machinery (SAM complex) on the mitochondrial outer membrane (Paschen et al., 2003; Wiedemann et al., 2003). The β-barrel proteins are inserted into a hydrophilic environment within the SAM complex. These proteins subsequently bind to Sam35 and Sam37, two partner proteins of Sam50, which promotes the release of the β-barrel proteins into the lipid phase of the outer membrane (Paschen et al., 2003; Wiedemann et al., 2003; Gentle et al., 2004; Chan and Lithgow, 2008; Kutik et al., 2008). Mitochondrial import 1 (Mim1) in yeast, which is involved in the membrane insertion of mitochondrial SA protein, also promotes the assembly of β-barrel proteins by transiently binding to the SAM complex, and this protein may modulate SAM function (Ishikawa et al., 2004; Becker et al., 2008; Popov-Celeketic et al., 2008; Chacinska et al., 2009). Moreover, Mdm10 in yeast may promote the assembly of the TOM complex (Boldogh et al., 2003; Meisinger et al., 2007; **Figure 1**); however, the exact mechanism of the insertion and release of β-barrel proteins is not yet known.

# **CYTOSOLIC FACTORS OF CHLOROPLAST AND MITOCHONDRIAL OUTER MEMBRANE PROTEINS**

#### **CYTOSOLIC TARGETING FACTORS FOR CHLOROPLAST AND MITOCHONDRIAL SIGNAL-ANCHORED PROTEINS**

Chloroplast and mitochondrial SA proteins are targeted posttranslationally. Currently, an important question is whether any cytosolic factor(s) play(s) a role in the targeting of SA proteins from the cytosol to these organelles. In the case of protein targeting to the ER in eukaryotes, signal recognition particles (SRPs) mediate this targeting in a co-translational manner (Keenan et al., 2001). Recently, ankyrin-repeat-containing protein 2, consisting of two isoforms, AKR2A and AKR2B, has been identified as a cytosolic factor for targeting of SA proteins to the chloroplast outer membrane (Bae et al., 2008; Bédard and Jarvis, 2008; Kim et al., 2011). AKR2A interacts with the targeting signals of chloroplast outer membrane proteins (consisting of the TMD and the CPR) *in vitro* and *in vivo*, but not the targeting signals of proteins destined for endomembrane organelles (Lee et al., 2001, 2004). Additionally, AKR2A displays chaperone activity and prevents non-specific aggregation of its client proteins by binding to the hydrophobic TMD. Chaperone activity should be an integral part of cytosolic targeting factors for the post-translational targeting of membrane proteins because these factors can use this activity to keep their clients in an insertion-competent form in the cytosol by preventing non-specific aggregate formation, proteolytic degradation, or unproductive interactions with other proteins before organellar membrane proteins are delivered to the target membranes (Flores-Pérez and Jarvis, 2013; Kim and Hwang, 2013).

In addition, AKR2 binds to chloroplasts through its C-terminal ankyrin-repeat domain (ARD) and facilitates insertion of its client proteins into the chloroplast outer membrane, where Toc75 assists with their insertion (Tu et al., 2004). Recently, it has been shown that AKR2 is associated with sHsp17.8, a member of the cytosolic Class I small heat shock protein (sHsp) family (Kim et al., 2011; **Figure 2**). Interestingly, sHsp17.8 (as a dimer) binds to both AKR2 and chloroplasts. Through these interactions, sHsp17.8 facilitates AKR2-mediated targeting of SA proteins to the chloroplast outer membrane, suggesting that sHsp17.8 functions as a cofactor of AKR2 during protein targeting to the chloroplast outer membrane.

It remains elusive when and how AKR2 specifically recognizes its SA clients in the cytosol during protein targeting to the chloroplast outer membrane. When nascent organellar proteins emerge from the exit tunnel of ribosomes during translation, they may interact with specific targeting factors and/or chaperones that assist in targeting to the proper location of the cell (Ullers et al., 2003; Schlünzen et al., 2005; Spreter et al., 2005). In the case of the ER in mammalian cells and yeast, the SRP recognizes the hydrophobic signal sequence of ER luminal and SA proteins during translation (Keenan et al., 2001). Moreover, Bat3/TRC35/Ubl4A complexes, which are pre-targeting factors for TA proteins of the ER in mammalian cells, also associate with ribosomes (Fleischer et al., 2006; Jonikas et al., 2009; Mariappan et al., 2010), suggesting that ribosomes serve as a platform for the docking of cytosolic factors involved in organellar protein targeting. These studies raise the possibility that AKR2 recognizes the chloroplast outer membrane-targeted SA proteins at the ribosomes during translation. Additionally, another important question is how AKR2 recognizes chloroplasts as the target organelle. The C-terminal ARD of AKR2 is involved in chloroplast binding, and its binding is assisted by sHsp17.8 (Bae et al., 2008; Kim et al., 2011). However, in the absence of sHsp17.8, AKR2A still binds to the chloroplast *in vitro*, raising the possibility that AKR2A alone interacts with the chloroplast. This result strongly suggests that
certain factors exist on the chloroplast outer membrane for AKR2 recruitment.

For precursor protein import into endosymbiotic organelles, many soluble factors have been identified in plants, mammalian cells and yeast, including Hsp70, Hsp90, 14-3-3 proteins, mitochondrial stimulating factor (MSF), arylhydrocarbon receptor-interacting protein (AIP), nascent polypeptide-associated complex (NAC), and ribosome-associated complex (RAC; Hachiya et al., 1994; Fünfschilling and Rospert, 1999; May and Soll, 2000; Gautschi et al., 2001; Yano et al., 2003; Young et al., 2003; Qbadou et al., 2006; Schemenewitz et al., 2009). These cytosolic factors with a chaperone activity may play an important role in keeping preproteins in import-competent status by preventing their aggregation or degradation, or minimizing unproductive interactions with other proteins in the cytosol (Flores-Pérez and Jarvis, 2013; Kim and Hwang, 2013). Hsp70 can act alone or in cooperation with other soluble factors such as Hsp90, 14-3-3 proteins or AIP (in mammals), and these factors may also play a role in facilitating delivery of preproteins to receptors localized on the surface of chloroplast or mitochondrial outer membranes (Hachiya et al., 1994; May and Soll, 2000; Yano et al., 2003; Young et al., 2003; Qbadou et al., 2006; Schemenewitz et al., 2009). In yeast or mammalian cells, NAC, RAC, AIP, and MSF can stimulate the import of preproteins into mitochondria (Hachiya et al., 1994; Fünfschilling and Rospert, 1999; Gautschi et al., 2001; Yano et al., 2003). Currently, no cytosolic factors have been identified for targeting mitochondrial outer membrane SA proteins. These proteins also contain a hydrophobic TMD that serves as an anchor to the outer membrane and also functions as a targeting signal. Therefore, a mechanism should exist that solves the problem of non-specific aggregate formation of the hydrophobic TMD of mitochondrial outer membrane SA proteins in the aqueous cytosol, raising the possibility that a yet unidentified factor(s) may be involved in the delivery of SA proteins to the mitochondria. Additional information is available about protein factors that are involved in the steps that occur at the mitochondrial outer membrane. Mim1 in yeast is involved in the insertion of certain SA proteins into the mitochondrial outer membrane (Becker et al., 2008; Hulett et al., 2008; Popov-Celeketic et al., 2008; Chacinska et al., 2009; **Figure 2**). Mim1 catalyzes the docking step for an α-helical transmembrane segment of Tom20 and Tom70 onto a membrane protein complex formed around a β-barrel protein Tom40 (Becker et al., 2008; Hulett et al., 2008). Mim1 forms homodimers in the mitochondrial outer membrane via its transmembrane segment, which contains two consecutive GXXXG/A motifs. The two GXXXG/A motifs are crucial for formation of dimers and also for integration of Tom20 into the mitochondrial outer membrane (Popov-Celeketic et al., 2008). In addition, the core of the TOM complex, Tom40, is involved in the insertion of certain SA proteins into the mitochondria in yeast (Ahting et al., 2005). However, most SA proteins do not seem to require the TOM complex for this insertion (Schneider et al., 1991; Schlossmann and Neupert, 1995; Ahting et al., 2005).

#### **CYTOSOLIC TARGETING FACTORS FOR CHLOROPLAST AND MITOCHONDRIAL TAIL-ANCHORED PROTEINS**

The location of the hydrophobic TMD poses an additional complication during the targeting of TA proteins because the TMD must be recognized post-translationally (Chartron et al., 2012). For ER-localized TA proteins, the guided entry of the TA protein (GET) pathway in yeast is used to deliver these proteins to the ER membrane. This pathway starts with the transfer of TA proteins from ribosomes to the "pre-targeting" factor (Get4/Get5/Sgt2), the sorting complex, which then loads them onto the targeting factor Get3 (Mariappan et al., 2010; Wang et al., 2010; Chartron et al., 2012). The central protein Get3 utilizes nucleotide-linked conformational changes in the loading and targeting of client proteins; a "closed" dimer of Get3 binds to the TA client via its large hydrophobic groove. The GET3-client complex is recruited to the ER membrane by binding to the Get1/Get2 receptor complex.

In contrast to the ER targeting of TA proteins, little information is available about the molecular machinery and mechanisms of TA protein targeting to the chloroplast and mitochondrial outer membranes. Chloroplast outer membrane TA proteins such as OEP9 and Toc33/Toc34 also interact with the cytosolic targeting factor AKR2, which is involved in targeting SA proteins to the chloroplast outer membrane (Bae et al., 2008; Dhanoa et al., 2010). However, the targeting signal of Toc33/Toc34 is different from that of OEP9. In the case of OEP9, the targeting signal consists of a 32 amino acid-long hydrophilic CTS and the TMD (Dhanoa et al., 2010). Toc33 and Toc34 also have single TMDs at their C-terminal ends, followed by a CTS. However, their targeting to the chloroplast outer membrane depends on almost the entire protein sequence rather than the TMD and CTS alone, raising the possibility that the targeting of TA proteins to the chloroplast outer membrane may not solely depend on AKR2. Recently, an arsenite transporter, ARSA1, has been identified as a cytosolic factor that mediates biogenesis and targeting of Toc34 from the cytosol to the chloroplast outer membrane in *Chlamydomonas reinhardtii* (Formighieri et al., 2013). Intriguingly, TRC40 and GET3 are also homologs of ARSA1 in mammalian cells and yeast, respectively. In contrast to ARSA1, these proteins are involved in the targeting of TA proteins to the ER membrane. Interestingly, only one ARSA1 homolog gene has been found in humans and yeast. However, two and three ARSA homolog genes are present in *C. reinhardtii* and *Arabidopsis*, respectively (Chartron et al., 2012; Formighieri et al., 2013; **Figure 3**). Since ARSA1 is involved in the targeting of chloroplast proteins, other ARSA homologs may be involved in targeting TA proteins to the ER in plants, which is similar to that in animals and yeast. However, it is not clear whether AKR2 is involved in the targeting of TA proteins to the chloroplast, or whether AKR2 communicates with ARSA homologs for targeting TA proteins to the chloroplast outer membrane in plants. On the other hand, certain chloroplast TA proteins translated in wheat germ extracts were efficiently targeted to chloroplast outer membranes with high fidelity (Kriechbaumer and Abell, 2012). Based on these results, the authors suggested that cytosolic factors may play a minor role, if any, in TA protein targeting to chloroplasts, and that targeting of proteins to the chloroplast envelope membrane is primarily dependent on events at the outer membrane. However, in *in vitro* assay systems, it is difficult to discriminate between nonspecific association of the hydrophobic transmembrane segment with the membrane and physiological membrane integration (Borgese et al., 2003). In the cellular environment, cytosolic targeting factors may be required to protect the targeting signal, including the hydrophobic TMD, and to keep nascent TA proteins in insertion-competent status by preventing their aggregation, or minimizing unproductive interactions with other proteins in the cytoplasm (Ellis and Minton, 2006; Flores-Pérez and Jarvis, 2013; Kim and Hwang, 2013).

In the case of mitochondrial outer membrane TA proteins, a bioinformatics approach has predicted that 142 out of 454 TA proteins in *Arabidopsis* may localize to the mitochondria (Kriechbaumer et al., 2009). Limited information is available about the molecular machinery that directs C-terminal α-helical TMD-containing proteins into the mitochondrial outer membrane from the cytosol. To date, no cytosolic components have been identified. Although the TOM complex or Mim1 appears to be required for the insertion of certain mitochondrial outer membrane TA proteins (Motz et al., 2002; Thornton et al., 2010), these mitochondrial TA proteins may use an unidentified insertase, or they may be spontaneously inserted into the outer membrane of the mitochondria. However, in Bak and Bcl-XL in mammalian cells, and Fis1 in yeast, the C-terminal transmembrane segment is sufficient for mitochondrial outer membrane targeting and none of the import components at the outer membrane is involved in the insertion (Setoguchi et al., 2006; Kemper et al., 2008). Interestingly, the unique lipid composition of the mitochondrial outer membrane appears to contribute to the selectivity in the targeting of membrane proteins (Kemper et al., 2008). In fact, the unique lipid composition of the membrane has been shown to be important for efficient insertion of TA proteins into the mitochondria (Setoguchi et al., 2006; **Figure 3**). However, targeting of full-length Bak and Bcl-XL in mammalian cells requires cytosolic factor(s) with a chaperone activity, suggesting that the targeting is critically dependent on the folding status of the N-terminal cytosolic domains (Setoguchi et al., 2006).

#### **CONCLUDING REMARKS**

In this review, we summarized the mechanisms of protein targeting to the outer envelope membranes of two endosymbiotic organelles, chloroplasts and mitochondria. Proteins localized to the outer membranes of these two organelles are involved in various functions that are essential for the physiology of these organelles. Numerous studies have contributed to our understanding of various aspects of the physiological roles of these organelles. One of these aspects involves how proteins are specifically targeted to these endosymbiotic organelles, and we currently have a fairly good understanding of how proteins are targeted to these organelles at the molecular level. Owing to studies of mitochondrial protein biogenesis in animal cells and yeast, more is known about mitochondria targeting than chloroplast targeting, yet despite this progress, there are still many aspects that we do not fully understand at the molecular level. These include how sorting of the organellar proteins occurs in the cytosol, how these proteins navigate to their specific organelles, and how the proteins are inserted into these organelles. The answers to these questions will help elucidate how organellar protein biogenesis occurs at the molecular level and how these organelles function in eukaryotic cells. Moreover, answering these questions may also provide clues to one of the most challenging and intriguing questions about these organelles, that is, how specific targeting mechanisms have been established during the organellogenesis of chloroplasts and mitochondria. It is likely that the cellular environment of the host cells at the time of their organellogenesis has been incorporated into the targeting mechanisms; thus the targeting signals and molecular machinery involved in the targeting of chloroplast and mitochondrial outer membrane proteins in present-day eukaryotic cells may contain "molecular fossils" that provide insights into how organellogenesis of these two organelles has occurred during evolution.

# **ACKNOWLEDGMENT**

This work was supported by Samsung Research Funding Center of Samsung Electronics under Project Number SRFC-MA1301-04.

#### **REFERENCES**


insertion and assembly of signal anchored receptors. *J. Biol. Chem.* 283, 120–127. doi: 10.1074/jbc.M706997200


region of the transmembrane domain of signal-anchored proteins play critical roles in determining their targeting specificity to the endoplasmic reticulum or endosymbiotic organelles in *Arabidopsis* cells. *Plant Cell* 23, 1588–1607. doi: 10.1105/tpc.110.082230


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 March 2014; accepted: 10 April 2014; published online: 29 April 2014.*

*Citation: Lee J, Kim DH and Hwang I (2014) Specific targeting of proteins to outer envelope membranes of endosymbiotic organelles, chloroplasts, and mitochondria. Front. Plant Sci. 5:173. doi: 10.3389/fpls.2014.00173*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Lee, Kim and Hwang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# New insights into the targeting of a subset of tail-anchored proteins to the outer mitochondrial membrane

*Naomi J. Marty<sup>1‡</sup>, Howard J. Teresinski<sup>1‡</sup>, Yeen Ting Hwang<sup>1†‡</sup>, Eric A. Clendening<sup>1‡</sup>, Satinder K. Gidda<sup>1</sup>, Elwira Sliwinska<sup>1,2</sup>, Daiyuan Zhang<sup>3†</sup>, Ján A. Miernyk<sup>4</sup>, Glauber C. Brito<sup>5</sup>, David W. Andrews<sup>6</sup>, John M. Dyer<sup>3</sup> and Robert T. Mullen<sup>1</sup>\**

*<sup>1</sup> Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada*

*<sup>2</sup> Department of Plant Genetics, Physiology and Biotechnology, University of Technology and Life Sciences in Bydgoszcz, Bydgoszcz, Poland*

*<sup>3</sup> United States Department of Agriculture, Agricultural Research Service, US Arid-Land Agricultural Research Center, Maricopa, AZ, USA*

*<sup>4</sup> United States Department of Agriculture, Agricultural Research Service, Plant Genetics Research Unit, University of Missouri, Columbia, MO, USA*

*<sup>5</sup> Instituto do Cancer do Estado de Sao Paulo, Fundacao Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, Brazil*

*<sup>6</sup> Sunnybrook Research Institute and Department of Biochemistry, University of Toronto, Toronto, ON, Canada*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Emanuela Pedrazzini, Consiglio Nazionale delle Ricerche, Italy Nica Borgese, University of Catanzaro "Magna Graecia," Italy*

#### *\*Correspondence:*

*Robert T. Mullen, Department of Molecular and Cellular Biology, University of Guelph, Room 4470 Science Complex, 488 Gordon Street, Guelph, ON N1G 2W1, Canada*

*e-mail: rtmullen@uoguelph.ca*

#### *†Present address:*

*Yeen Ting Hwang, Prairie Plant Systems Inc., Saskatoon, Canada; Daiyuan Zhang, Del Mar College, Corpus Christi, USA*

*‡These authors have contributed equally to this work.*

Tail-anchored (TA) proteins are a unique class of functionally diverse membrane proteins defined by their single C-terminal membrane-spanning domain and their ability to insert post-translationally into specific organelles with an N<sub>cytoplasm</sub>-C<sub>organelle interior</sub> orientation. The molecular mechanisms by which TA proteins are sorted to the proper organelles are not well understood. Herein we present results indicating that a dibasic targeting motif (i.e., -R-R/K/H-X{X=E}), identified previously in the C terminus of the mitochondrial isoform of the TA protein cytochrome *b*5, also exists in many other *A. thaliana* outer mitochondrial membrane (OMM)-TA proteins. This motif is conspicuously absent, however, in all but one of the TA protein subunits of the translocon at the outer membrane of mitochondria (TOM), suggesting that these two groups of proteins utilize distinct biogenetic pathways. Consistent with this premise, we show that the TA sequences of the dibasic-containing proteins are both necessary and sufficient for targeting to mitochondria, and are interchangeable, while the TA regions of TOM proteins lacking a dibasic motif are necessary, but not sufficient for localization, and cannot be functionally exchanged. We also present results from a comprehensive mutational analysis of the dibasic motif and surrounding sequences that not only greatly expands the functional definition and context-dependent properties of this targeting signal, but also led to the identification of other novel putative OMM-TA proteins. Collectively, these results provide important insight into the complexity of the targeting pathways involved in the biogenesis of OMM-TA proteins and help define a consensus targeting motif that is utilized by at least a subset of these proteins.

**Keywords:** *A. thaliana,* **dibasic motif, mitochondria, outer mitochondrial membrane, tail anchored, targeting signal**

### **INTRODUCTION**

Tail-anchored (TA) proteins are a unique class of proteins integral to all cellular membranes and share the defining characteristic of a single transmembrane domain (TMD) at or near their C terminus (Kutay et al., 1999). As a consequence of this unique structural feature, the TMD of a TA protein emerges from the ribosome only after the termination of translation. Thus, the sorting and insertion of a nascent TA protein are *a priori* post-translational. The TA proteins are therefore distinct from membrane proteins that can also possess a C-terminal TMD, but, in addition, contain another sequence that initiates translocation into the endoplasmic reticulum (ER) via the classical signal recognition particle (SRP)/Sec61 co-translational pathway (Grudnik et al., 2009). The C-terminal TMD of a TA protein also dictates its characteristic membrane orientation, whereby the N-terminal portion of the protein, which often represents the majority of the polypeptide and contains the functional domain(s), faces the cytoplasm, while the C-terminal sequence (CTS) downstream of the TMD, which usually contains organelle-specific targeting information, protrudes into the organelle's interior (Borgese and Fasana, 2011).

The TA proteins are involved in a remarkable array of cellular processes, especially in plants (reviewed in Abell and Mullen, 2011). Some notable examples include the SNAREs (Soluble NSF Attachment protein REceptors), which mediate vesicular transport and fusion (Malsam et al., 2008), subunits of the ER (Osborne et al., 2005), mitochondrial, and plastidial outer membrane translocons (Jarvis et al., 1998; Gutensohn et al., 2000; Werhahn et al., 2001; Allen et al., 2002; Beilharz et al., 2003; Macasev et al., 2004), the electron carrier cytochrome *b*<sub>5</sub> (Cb5) (D'Arrigo et al., 1993; Kuroda et al., 1998; Borgese et al., 2001; Hwang et al., 2004), FIS1 (Fission 1), which is required for organelle fission (Zhang and Hu, 2008; Zhao et al., 2013), and members of the Bcl protein family that are involved in the regulation of apoptosis (Kale et al., 2012). In fact, hundreds of TA proteins have been identified in a wide range of evolutionarily diverse organisms, including humans (Kalbfleisch et al., 2007), *Saccharomyces cerevisiae* (Beilharz et al., 2003), *A. thaliana* (Kriechbaumer et al., 2009; Pedrazzini, 2009; Dhanoa et al., 2010), and bacteria (Borgese and Righi, 2010; Craney et al., 2011), with many of these proteins having unknown function. As such, there is a growing appreciation that TA proteins participate in far more cellular processes than previously envisioned. Moreover, because of their distinct structural characteristics and unusual targeting and membrane insertion pathways, considerable attention has been devoted in recent years to understanding the mechanisms underlying their biogenesis.

By far the most-studied TA proteins in terms of their biogenesis are those localized to the ER, including those that are subsequently transported to other compartments of the endomembrane system, such as the nuclear envelope, Golgi, endosomes, vacuole/lysosomes, plasma membrane, and peroxisomes (reviewed in Rabu et al., 2009; Colombo and Fasana, 2011). For instance, the targeting information responsible for the initial sorting of nascent TA proteins to the ER is well-established as being located within their C termini, including the TMD and CTS (Borgese et al., 2003). The targeting signals for ER-TA proteins are also known not to be based on specific amino acid sequences, but rather consist of general physicochemical properties that are unique to this group of proteins. Compared to mitochondrial-TA proteins, for instance, ER-TA proteins usually contain TMDs that are longer and more hydrophobic, and their CTSs are often less positively charged (Borgese et al., 2003). Indeed, recent structural studies have confirmed that these unique properties in the C termini of ER-TA proteins effectively mediate their specific recognition and insertion by the conserved GET (Guided Entry of TA proteins) and TRC40 (Transmembrane domain Recognition Complex 40) complexes in yeasts and mammals, respectively (reviewed in Denic, 2012). Homologs of the GET3/TRC40 machinery also exist in plants (Abell and Mullen, 2011; Duncan et al., 2013), but their function in ER-TA protein biogenesis has not yet been investigated. There is also the intriguing possibility that the biogenesis of ER-TA proteins in plants involves other, perhaps novel pathways, such as SRP/Sec61 acting in an unusual post-translation mode (Abell et al., 2003) or the ankyrin repeat-containing protein, AKR2A (Shen et al., 2010), which also mediates the targeting of chloroplast outer membrane-(TA) proteins (Bae et al., 2008; Dhanoa et al., 2010).

While the biogenesis of mitochondrial and plastidial-TA proteins is relatively less understood than that of ER-TA proteins, several important points have emerged. For instance, based on the few chloroplast outer membrane TA-proteins studied to date, there appear to be at least two biogenetic pathways, distinguished by the nature of their targeting signals, the membrane protein and lipid components involved (Li and Chen, 1997; Tsai et al., 1999; Qbadou et al., 2003; Dhanoa et al., 2010), and perhaps cytoplasmic components (Kriechbaumer and Abell, 2012). For mitochondrial-TA proteins, targeting to the outer mitochondrial membrane (OMM) is generally considered to be mediated by the distinct physicochemical properties of their C termini. That is, in comparison with the C termini of ER-TA proteins, mitochondrial TA regions usually consist of a shorter and moderately hydrophobic TMD and a positively-charged CTS (Borgese et al., 2003). For most mitochondrial-TA proteins, disruption of either the TMD or CTS results in mislocalization to the ER (D'Arrigo et al., 1993; Mihara, 2000; Borgese et al., 2001; Horie et al., 2002), revealing that these two organelles are linked by independent but competing pathways. There is also mounting evidence that the targeting specificity of mitochondrial-TA proteins, similar to that of plastidial-TA proteins, involves the inherent lipid composition of the membranes in which they properly reside. In yeast, for instance, the relatively low level of ergosterol in the OMM compared to the ER membrane has been shown to play a major role in targeting specificity (Krumpe et al., 2012). In mammals, both membrane lipids (Otera et al., 2007) and membrane protein machinery (Stojanovski et al., 2007) have been implicated in the biogenesis of mitochondrial-TA proteins.

Although no comparable studies have been conducted on whether membrane lipid composition and/or protein machinery serves as a determinant in properly guiding mitochondrial-TA proteins in plant cells, there are data suggesting that the targeting information in plant mitochondrial-TA proteins is more complex than that in yeast and mammalian cells. That is, targeting to the OMM in plants appears to rely on not only similar physicochemical characteristics that are conserved in yeast and mammalian mitochondrial-TA proteins (e.g., short and moderately hydrophobic TMD), but also several discrete sequence-specific features. For instance, in the mitochondrial isoform of cytochrome *b*<sub>5</sub> from tung tree (i.e., *Aleurites fordii* Hemsl Cb5D), one of the most prominent of these features is a dibasic sequence motif in the protein's three-amino-acid-long CTS, whereby the first position of the motif is an arginine, the second position is an arginine, lysine, or histidine, and the third position cannot be occupied by a negatively-charged residue (Hwang et al., 2004). Notably, this same dibasic motif (-R-R/K/H-X{X≠E}) is present in the CTSs of other putative mitochondrial isoforms of plant Cb5, including *A. thaliana* Cb5 isoform 6 (Cb5-6) (Hwang et al., 2004). The motif is absent, however, in the CTSs of mitochondrial isoforms of mammalian Cb5 (there are no known mitochondrial isoforms of Cb5 in yeast), suggesting that it is an added feature in plant mitochondrial Cb5 proteins that allows them to cope with the need to discriminate between mitochondria, ER and plastids (Hwang et al., 2004; Abell and Mullen, 2011).
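The motif as described above can be expressed as a simple pattern match. The following is only a minimal sketch: the function name `has_dibasic_motif` and the parameter `excluded_third` are hypothetical, and it assumes that the motif may occur anywhere within the CTS. By default it follows the notation as written (third position is anything but E); passing `"DE"` instead reflects the broader wording that the third position cannot be negatively charged. The example strings are the CTSs quoted in this article for TraB (-SRRK), PMD1 (-SKLR), and FIS1B (-LRS).

```python
import re


def has_dibasic_motif(cts: str, excluded_third: str = "E") -> bool:
    """Return True if the CTS contains R, then R/K/H, then a residue
    that is not in `excluded_third` (one-letter amino acid codes)."""
    # Build a pattern such as R[RKH][^E]; re.escape guards the character class.
    pattern = rf"R[RKH][^{re.escape(excluded_third)}]"
    return re.search(pattern, cts.upper()) is not None


if __name__ == "__main__":
    # CTS strings quoted in the article; labels are for illustration only.
    for name, cts in [("TraB", "SRRK"), ("PMD1", "SKLR"), ("FIS1B", "LRS")]:
        print(name, cts, has_dibasic_motif(cts))
```

Under these assumptions only the TraB sequence matches the original motif, in line with the article's description of the PMD1 and FIS1B sequences as more divergent.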

Here we show that the C-terminal dibasic motif (-R-R/K/H-X{X≠E}) of tung Cb5D exists in several other *A. thaliana* OMM-TA proteins besides Cb5-6, but is absent in most of the TA protein subunits of the TOM (Translocon at the Outer membrane of Mitochondria) complex. We also describe the results of a mutational analysis of selected members from each of these two groups of mitochondrial-TA proteins, indicating that they rely on different types of targeting signals. Moreover, results from mutational analysis of the dibasic targeting signal reveal that this motif is more diverse than previously believed and that the new consensus sequence for this signal not only has promising predictive power for identifying new candidate OMM-TA proteins, but also serves as an important step toward elucidating the differential targeting signals utilized by various classes of TA proteins in plant cells.

### **MATERIALS AND METHODS**

#### **RECOMBINANT DNA PROCEDURES AND REAGENTS**

Standard recombinant DNA procedures were performed as described by Sambrook et al. (1989). Molecular biology reagents were purchased from New England BioLabs and Invitrogen. All plasmid DNA constructs were verified using automated dye-terminator cycle sequencing performed at the University of Guelph Genomics Facility. All custom oligonucleotide forward and reverse primers used for polymerase chain reaction (PCR)-based cloning and site-directed mutagenesis (see "*PLASMID CONSTRUCTION*") were synthesized by Sigma-Aldrich Ltd., and the sequences of these primers are available upon request.

### **PLASMID CONSTRUCTION**

cDNAs encoding full-length open reading frames (ORFs) for the various *A. thaliana* candidate OMM-TA proteins examined in this study were obtained from the *A. thaliana* Biological Resource Center (ABRC) (Ohio State University) or RIKEN Bioresource Center and then, using PCR and the appropriate forward and reverse primers, were sub-cloned as either the entire ORF or, for green fluorescent protein (GFP) fusion proteins consisting of GFP linked to the C terminus of a TA protein, portions thereof, into one or more of the following vectors: pRTL2/Myc-MCS or pRTL2/HA-MCS, plant expression vectors that include the 35S cauliflower mosaic virus (CMV) promoter and sequences encoding an initiation methionine, glycine linker and Myc or hemagglutinin (HA) epitope tag (Fritze and Anderson, 2000), then a multiple cloning site (MCS) (Shockey et al., 2006); pRTL2/GFP-MCS, which contains the 35S CMV promoter and an MCS immediately 5′ of the GFP ORF (Shockey et al., 2006); or pSPUTK-*Bgl*II, which contains the SP6 promoter, the high-efficiency β-globin 5′ untranslated region, a Kozak initiation site for efficient translation in rabbit reticulocyte lysate, and an MCS. Complete details on the construction procedures used for generating plasmids encoding any of the various *A. thaliana* TA proteins and all modified versions thereof described in this study are available upon request. pRTL2/Myc-TOM40 encodes the 40 kDa channel-forming subunit of the *A. thaliana* TOM complex fused to an N-terminal Myc epitope tag (Hwang et al., 2008), pRTL2/BCAT3-Cherry encodes the *A. thaliana* plastidial branched-chain aminotransferase 3 fused to the monomeric Cherry fluorescent protein (Niehaus et al., 2014), and pRTL2/Cherry-PTS1 encodes the Cherry protein appended to the C-terminal type 1 peroxisomal matrix targeting signal from pumpkin hydroxypyruvate reductase (Ching et al., 2012).

#### **BY-2 CELL CULTURES, BIOLISTIC BOMBARDMENT, AND IMMUNOSTAINING OF BY-2 CELLS**

*Nicotiana tabacum* Bright Yellow-2 (BY-2) suspension cell cultures were maintained and prepared for bombardment with a biolistic particle delivery system-1000/HE (Bio-Rad Laboratories) as described previously (Lingard et al., 2008). Transient (co-)transformations were performed using 0.5–2 μg of plasmid DNA, which was determined empirically based on the relative strength of the (immuno)fluorescence signal. Bombarded cells were incubated for ∼4 h to allow for expression and sorting of the introduced gene product(s). Amounts of plasmid DNA and the ∼4 h post-bombardment time point were chosen in order to ensure that any potential negative effects due to excessively high levels of protein expression were diminished. Cells were fixed in 4% (w/v) formaldehyde, followed by permeabilization with 0.01% (w/v) pectolyase Y-23 (Kyowa Chemical Products) and either 0.3% (v/v) Triton X-100 or 25 μg mL<sup>−1</sup> digitonin (Lee et al., 1997). Primary and dye-conjugated secondary antibodies and sources were as follows: rabbit anti-Myc IgGs (Bethyl Laboratories); mouse anti-Myc antibodies in hybridoma medium (Princeton University, Monoclonal Antibody Facility); mouse anti-α-tubulin (Sigma-Aldrich Ltd.); mouse anti-maize β-ATPase antibodies in hybridoma medium (Luethy et al., 1993) (kindly provided by T. Elthon, University of Nebraska-Lincoln); rabbit anti-cytochrome c oxidase subunit II (CoxII) IgGs (Frelin et al., 2012); goat anti-mouse and goat anti-rabbit Alexa 488 IgGs (Molecular Probes); and goat anti-rabbit and goat anti-mouse rhodamine red-X IgGs (Jackson ImmunoResearch Laboratories). Concanavalin A (ConA) conjugated to Alexa 594 (Molecular Probes) was added to cells at a final concentration of 5 μg/mL during the final 20 min of incubation with secondary antibodies.
In all experiments, at least 50 independently transformed cells were evaluated to determine intracellular localization(s) of the transiently-expressed protein, and each biolistic experiment was replicated at least three times.

#### *A. THALIANA* **GROWTH CONDITIONS AND TRANSFORMATION**

All plants were grown in chambers at 21°C with a 16 h/8 h light/dark cycle. Seeds were typically surface sterilized and sown in plant nutrient (PN) media (Haughn and Somerville, 1986) or half-strength Murashige and Skoog (MS) salts (Murashige and Skoog, 1962) containing 0.5% (w/v) sucrose, and solidified with 0.6% (w/v) agar. The stable transgenic line of *A. thaliana* (Columbia-0 ecotype) co-expressing the Cherry-At1g55450 and mito-GFP fusion proteins was generated by transforming plants already expressing mito-GFP, the seeds for which were kindly provided by David Logan (INRA/AgroCampus Ouest/Université d'Angers) (Logan and Leaver, 2000), using *Agrobacterium tumefaciens* (strain GV3101)-mediated floral dip transformation (Clough and Bent, 1998). Two independent lines co-expressing Cherry-At1g55450/mito-GFP were selected to test for Cherry/GFP fluorescence intensities, and were also used for localization studies. None of the transgenic lines displayed any obvious growth or reproductive abnormalities.

#### **MICROSCOPY**

Epifluorescence microscopic images of BY-2 cells were acquired using an Axioscope 2 MOT fluorescence microscope (Carl Zeiss Inc.) equipped with a 63× Plan Apochromat objective and a Retiga 1300 CCD camera (Qimaging), plus associated Openlab software (Improvision). Confocal laser-scanning microscopy (CLSM) images of BY-2 cells and *A. thaliana* 7- to 8-day-old seedlings, which were placed on a glass slide in distilled water under a coverslip and then viewed, were acquired using a Leica DM RBE microscope with a 63× Plan Apochromat objective, TCS SP2 scanning head, and LAS AF software package. CLSM images were acquired as single optical sections of representative cells and were saved as 512 × 512-pixel digital images. All figure compositions and merged images were generated using Openlab (Improvision) or Northern Eclipse (Empix Imaging Inc.) software, and Adobe Photoshop CS (Adobe Systems). All micrographs shown in the figures are representative images obtained in experiments that were replicated at least three times.

#### **ISOLATION OF INSERTION-COMPETENT MITOCHONDRIA**

Mitochondria were purified from 14-day-old, light-grown pea (*Pisum sativum* cv. Little Marvel) seedlings by a modification of the procedure described by Fang et al. (1987). The seedlings were harvested *en masse* ∼2–3 cm above the soil line, using a large pair of scissors. All subsequent steps were performed on ice or at 4°C. Plant material was rinsed with deionized water, then homogenized at 1 g fw/2.5 mL of homogenization medium using 3–10 s bursts with a Braun blender modified to hold single-edged razor blades. Homogenization medium was 40 mM MOPS-KOH, pH 7.2, containing 600 mM mannitol, 10 mM EDTA, 8 mM cysteine (free base), and 0.4% defatted BSA. The homogenates were filtered through 8 layers of cheesecloth plus 2 layers of Miracloth (Fisher Scientific). The filtrates were centrifuged at 3300 × g for 5 min, and the 3.3 k pellets were discarded. The supernatants were centrifuged at 18,000 × g for 20 min, and the supernatants discarded. The 18 k pellets were resuspended in 35 mL of 5 mM MOPS-KOH, pH 7.2, containing 250 mM mannitol and 0.1% defatted BSA, first by using a horse-hair paint brush, then by 3 passes in a loose-fitting glass and Teflon Potter-Elvehjem homogenizer. Eight mL of the resuspended mitochondria-enriched fraction was layered on top of a discontinuous Percoll step gradient comprised of 6 mL 21% (v/v), 12 mL 26%, and 10 mL 47%, all in 5 mM MOPS-KOH, pH 7.2, containing 250 mM mannitol. The Percoll gradients were centrifuged at 65,000 × g for 45 min using a Beckman SW-28 rotor. The mitochondria, which band at the 26/47% Percoll interface, were collected, diluted with 10 volumes of 5 mM MOPS-KOH, pH 7.2, containing 250 mM mannitol and 10 mM DTT, and centrifuged at 18,000 × g for 15 min. The 18 k pellet was resuspended in MOPS/mannitol/DTT and re-pelleted. The final mitochondria-enriched pellets were resuspended in a small volume of MOPS/mannitol/DTT, and kept on ice prior to import experiments.

#### **PLASMID-DRIVEN** *IN VITRO* **TRANSCRIPTION/TRANSLATION/INSERTION**

The *in vitro* transcription and translation reactions were conducted while the mitochondria were being prepared. Unless otherwise noted, all reagents were from Sigma Chemical Co. Purified, linearized plasmids (pSPUTK/Myc-At1g55450 and pSPUTK/Myc-TraB) were used for T7-driven coupled transcription-translation in a rabbit reticulocyte lysate (Promega) plus EasyTag L-[<sup>35</sup>S]-Met (Perkin Elmer) according to the manufacturer's instructions (Promega). Translations were terminated after 90 min at 30°C by adding 100 μg emetine, and ribosomes were removed by centrifugation at 150,000 × g for 15 min at 4°C using a Beckman Model TLA 100.2 rotor. Unincorporated [<sup>35</sup>S]-Met was removed from the supernatant using Centri-spin 10 desalting columns (Princeton Separations).

Import/integration reactions were conducted in a 200 μL total volume and contained 100 μg mitochondrial protein and typically 1–5% (v/v) of the translation reaction. Mitochondria were diluted into 25°C integration buffer and pre-incubated for 5 min prior to adding the <sup>35</sup>S-labeled translation products. Integration buffer consisted of 300 mM sorbitol, 10 mM HEPES-KOH, pH 7.2, 0.1% defatted BSA, 80 mM KCl, 10 mM MgOAc, 2 mM KH<sub>2</sub>PO<sub>4</sub>, and 1 mM MgCl<sub>2</sub>. Reactions using energized mitochondria additionally contained 2 mM L-malate, 4 mM NADH, 2 mM ATP, pH 7.0, 60 mM phospho-creatine, and 0.15 mg/mL creatine kinase. Integration reactions were incubated at 25°C for 20–25 min. For protein import, reaction mixtures were then transferred to ice and incubated for 20 min with 10 μg/mL Proteinase K. To stop proteolysis, PMSF was added to 2 mM. Membrane integration was determined essentially as described (Fujiki et al., 1982). Following a wash with 100 mM Na<sub>2</sub>CO<sub>3</sub>, pH 11.5 (ASC), membranes were pelleted by centrifugation at 150,000 × g for 30 min at 4°C using a Beckman Model TLA 100.2 rotor. Pellets were washed one time by resuspending in ASC and re-pelleting. <sup>35</sup>S-translation products were acid precipitated and washed by resuspending then re-pelleting. All samples were analyzed by SDS-PAGE plus phosphor-imaging. Equal volumes of sample and 8 M urea, 4% SDS, and 4% 2-mercaptoethanol were combined, heated to 60°C for 5 min, and clarified by centrifugation at maximum speed in a micro-centrifuge for 1 min. Proteins were separated after application to pre-cast Novex NuPAGE 12% Bis-Tris gels using the MES/SDS buffer (Life Technologies). Electrophoresis was stopped when the dye-front reached the bottom of the gels. After drying, gels were wrapped in cellophane, placed onto a K-type phosphorimaging screen, and analyzed for 16–18 h at room temperature using a Bio-Rad PMI system.

### **BIOINFORMATICS ANALYSIS**

A list of all putative TA proteins in *A. thaliana* was compiled from previously published TA protein datasets (Kriechbaumer et al., 2009; Pedrazzini, 2009), plus those identified using the "TAMP (TA Membrane Protein) Finder" program (Dhanoa et al., 2010; Craney et al., 2011). Additional information on the TAMP Finder program and the putative *A. thaliana* TA proteins identified using this program will be published elsewhere. Known and putative *A. thaliana* OMM-TA proteins that contain an expanded dibasic motif in their CTS, as defined in this study, e.g., -R/K/H-X{0,1; ≠E}-R/K/H{≠ -H-H- or -H-X-H-}-X{0,1; ≠E}-X{0,3} {CTS = 3,8} (**Table 2**), were identified initially by visual inspection of the Kriechbaumer et al. (2009), Pedrazzini (2009), and TAMP datasets. However, since there is no established computational method for precisely defining the ends of a TMD and thus the length of the CTS, all proteins containing the motif within a CTS predicted to be 2 to 10 amino acids in length were considered candidates. The TMD predictions were performed using TMPred (Hofmann and Stoffel, 1993), ARAMEMNON (Schwacke et al., 2003), and/or TOPCONS (Bernsel et al., 2009). Candidate proteins were analyzed using the TargetP 1.1 Server (http://www.cbs.dtu.dk/services/TargetP/) (Emanuelsson et al., 2000) and PEPscreen® Calculator (Sigma-Aldrich Ltd; according to Monera et al., 1995) or Grand Average of Hydropathy (GRAVY) Calculator (http://www.gravy-calculator.de/), and those proteins predicted to contain an N-terminal targeting signal (based on a >0.51 cutoff for TargetP) for mitochondria, chloroplasts or the ER (secretory pathway) and/or a predicted TMD with a relatively high hydropathy score (based on a >1.2 cutoff for PEPscreen®) were excluded. Thereafter, duplicated proteins due to splice variants were removed. All deduced amino acid sequences were obtained from GenBank and/or The *Arabidopsis* Information Resource (TAIR).
Predicted intracellular localizations were taken from SUBA3 (The SUBcellular localization database for *A. thaliana* proteins) (Tanz et al., 2013) or the Gene Ontology database (Ashburner et al., 2000), or were experimentally determined here.
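The exclusion steps described above can be summarized as a short filtering routine. This is a minimal sketch rather than the procedure actually used: the `Candidate` record, its field names, and the function `filter_candidates` are hypothetical, the scores are assumed to have been obtained beforehand from the respective prediction tools, and splice variants are assumed to share a locus identifier that differs only by a `.N` suffix. The cutoffs (CTS of 2 to 10 residues, TargetP >0.51, hydropathy >1.2) are those stated in the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    gene_model: str        # hypothetical identifier, e.g. "AtXg00001.1"
    cts_length: int        # predicted CTS length in residues
    targetp_score: float   # best N-terminal targeting score (assumed precomputed)
    tmd_hydropathy: float  # TMD hydropathy score (assumed precomputed)


def filter_candidates(
    candidates: Iterable[Candidate],
    min_cts: int = 2,
    max_cts: int = 10,
    targetp_cutoff: float = 0.51,
    hydropathy_cutoff: float = 1.2,
) -> List[Candidate]:
    """Apply the stated exclusion criteria, then keep one entry per locus."""
    kept = {}
    for c in candidates:
        if not (min_cts <= c.cts_length <= max_cts):
            continue  # motif must lie within a 2- to 10-residue CTS
        if c.targetp_score > targetp_cutoff:
            continue  # predicted N-terminal targeting signal
        if c.tmd_hydropathy > hydropathy_cutoff:
            continue  # TMD with a relatively high hydropathy score
        locus = c.gene_model.split(".")[0]  # drop the splice-variant suffix
        kept.setdefault(locus, c)           # first variant seen is retained
    return list(kept.values())
```

Which splice variant is retained is a design choice that the text does not specify; here the first one encountered is kept.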

### **RESULTS AND DISCUSSION**

#### **IDENTIFICATION OF TWO MAJOR GROUPS OF OMM-TA PROTEINS IN** *A. THALIANA*

To begin to further analyze the targeting signals in mitochondrial-TA proteins in plants, we first compiled a list of *bona fide* OMM proteins that are also considered to have a TA-orientation. Specifically, we cross-referenced datasets of authentic *A. thaliana* mitochondrial membrane proteins previously identified in various proteomics screens (Duncan et al., 2011; Klodmann et al., 2011; reviewed in Duncan et al., 2013) with all of the TA proteins predicted for the *A. thaliana* deduced proteome (Kriechbaumer et al., 2009; Pedrazzini, 2009; Dhanoa et al., 2010). As shown in **Table 1**, a total of 20 candidate mitochondrial TA-proteins were identified, including 10 subunits of the TOM complex, as well as FIS1A/B and PMD1/2 (Peroxisomal and Mitochondrial Division factor 1 and 2), which serve together as regulators of mitochondrial fission (Scott et al., 2006; Zhang and Hu, 2009; Aung and Hu, 2011), and MIRO1/2 (MItochondrial RHO GTPases 1 and 2), which regulate mitochondrial morphology and motility (Yamaoka and Leaver, 2008). Also identified were isoforms of ascorbate peroxidase (APX-5) (Caverzan et al., 2012) and purple acid phosphatase (PAP2) (Sun et al., 2012), as well as a member of the TraB protein family (At1g05270), which, based on the role of its homologs in bacteria, is thought to play a role in signaling in plants (Duncan et al., 2011). Also listed in **Table 1**, as expected, is Cb5-6, which, as already mentioned, is one of six isoforms of Cb5 in *A. thaliana* (Maggio et al., 2007; Paquette et al., 2009) that was experimentally determined to possess a similar OMM dibasic targeting signal motif as that found in tung mitochondrial Cb5D (Hwang et al., 2004).

#### **Table 1 |** *Arabidopsis* **OMM TA proteins<sup>a</sup>.**

*<sup>a</sup>All of the proteins listed are bona fide Arabidopsis OMM proteins, based on Duncan et al. (2011, 2013) and Klodmann et al. (2011), and are also known or predicted to possess a TA topology, based on their identification in various proteomics- and bioinformatics-based searches in Arabidopsis (Kriechbaumer et al., 2009; Pedrazzini, 2009; Dhanoa et al., 2010). The proteins are grouped by TOM proteins and all other proteins, and then listed alphabetically. See text for additional details.*

*<sup>b</sup>Common nomenclature of Arabidopsis mitochondrial outer membrane TA proteins based on published literature and The Arabidopsis Information Resource (TAIR) (http://www.arabidopsis.org). Proteins indicated with an asterisk were experimentally characterized in terms of their intracellular localization and TA topology; see Figures 1, 2, and text for additional details.*

*<sup>c</sup>Abbreviations are: APX-5, ascorbate peroxidase isoform 5; Cb5-6, cytochrome b<sub>5</sub> isoform 6; FIS1, fission 1; MIRO1/2, mitochondrial RHO GTPase 1 and 2; PAP2, purple acid phosphatase 2; PMD1/2, peroxisomal and mitochondrial division factor 1 and 2; TOM, translocase of the mitochondrial outer membrane (subunits 5, 6, 7, 9, and 20 and isoforms thereof); TraB, domain motif based on Enterococcus faecalis traB (An and Clewell, 1994).*

*<sup>d</sup>Arabidopsis gene identifier (AGI) number represents the systematic designation given to each locus, gene, and its corresponding protein product(s) by TAIR.*

*<sup>e</sup>Shown for each protein is its deduced C-terminal tail sequence, including its putative TMD (underlined), based on the TMD prediction program TOPCONS and visual inspection, and its downstream CTS. Shaded are the dibasic amino acid residues in the dibasic targeting signal motif, -R-R/K/H-X{X≠E} (Hwang et al., 2004).*

Inspection of the CTSs presented in **Table 1** shows that while all of the proteins possess a relatively short (<21 residues) and moderately hydrophobic TMD, there is a clear separation of the proteins into two groups (*p* = 0.009, hypergeometric test) based on their CTSs: (i) the TOM proteins, which, with the exception of TOM20-1, generally have longer CTSs that lack a dibasic motif (i.e., -R-R/K/H-X{X≠E}), and (ii) all other proteins, the majority of which have relatively shorter CTSs and contain a dibasic motif. Notably, several of the proteins in the dibasic group, including Cb5-6, FIS1A/B, PAP2, and PMD2, have been reported to target not only to mitochondria, but also to other organelles (i.e., peroxisomes or chloroplasts) in certain plant cell types and/or experimental conditions (Scott et al., 2006; Maggio et al., 2007; Lingard et al., 2008; Zhang and Hu, 2009; Aung and Hu, 2011; Sun et al., 2012; Ruberti et al., 2014). Two of these proteins contain what appear to be more divergent CTSs than the other proteins in this group, including FIS1B, which contains only a single basic amino acid in its CTS (i.e., -LRS), and PAP2, which has a relatively long CTS (i.e., 20 amino acids) and also contains several basic residues, including a dilysine sequence, and acidic residues. Perhaps some of these sequence differences in the CTSs, as well as others in the TMD and/or additional factors, contribute to their alternative subcellular localizations. For instance, binding of another protein, including a chaperone and/or receptor that itself might vary depending on cell type and/or environmental condition, burying/exposing of the targeting elements by protein folding or, as suggested previously for Cb5 (Maggio et al., 2007), post-translational modification (e.g., phosphorylation), may also be involved in the differential targeting of these proteins.
Clearly, further work is needed to better understand how these OMM-TA proteins discriminate between different organelles in plant cells.

Taken together, these results indicate that many OMM-TA proteins in *A. thaliana* contain a dibasic motif similar to that known to be important for the targeting of mitochondrial Cb5 from tung (Hwang et al., 2004). Furthermore, the absence of this dibasic motif in the CTSs of most of the TOM-TA proteins suggests that these proteins rely on a different type of targeting signal for their localization to the OMM.

#### **LOCALIZATION AND TOPOLOGY OF OMM TA PROTEINS IN TOBACCO BY-2 CELLS**

To analyze the targeting information present in the two groups of proteins presented in **Table 1**, we first confirmed the intracellular localization of a subset of proteins from each group. Plasmid DNAs encoding the respective proteins were transiently expressed and visualized in tobacco BY-2 suspension-cultured cells, which are commonly used as a model system for studying protein localization and targeting (Brandizzi et al., 2003; Miao and Jiang, 2007; Denecke et al., 2012). As shown in **Figures 1A,B**, CLSM images of BY-2 cells transiently expressing N-terminal Myc-epitope-tagged members from both groups of proteins revealed that all of the proteins localized to mitochondria. That is, all of the proteins, with the exception of MIRO1, displayed a torus or ring-shaped immunofluorescence pattern that encircled the punctate immunofluorescence pattern attributable to the endogenous mitochondrial pyruvate dehydrogenase E1β (Luethy et al., 1993) or CoxII (Millar et al., 2011) (**Figures 1A,B**). Transiently expressed MIRO1 also targeted to mitochondria, but appeared to alter the morphology of the organelle (**Figure 1B**), similar to what is observed for its mammalian and yeast protein counterparts (Fransson et al., 2003; Frederick et al., 2004). Similar localization results were reported for several of the same proteins when they were transiently expressed as GFP-tagged proteins in *A. thaliana* suspension cells (Duncan et al., 2011), indicating that the appended tag (Myc or GFP) does not influence intracellular localization. Notably, we also demonstrated that the mitochondrial localization of the proteins examined in this study in BY-2 cells was readily distinguishable from various other organelles, namely plastids (i.e., leucoplasts), peroxisomes, and the ER [results presented for Cb5-6 (**Figure 1C**)].

**FIGURE 1 | Subcellular localization of selected** *A. thaliana* **OMM-TA proteins in BY-2 cells.** BY-2 cells were transiently transformed with plasmid DNA expressing selected Myc-tagged TOM-TA proteins **(A)** or dibasic-motif-containing TA proteins **(B)** and immunostained for endogenous mitochondrial E1β or CoxII, as indicated in the panel labels. Alternatively, in **(C)**, cells were (co-)transformed with Myc-tagged Cb5-6 and the plastid marker protein BCAT3-Cherry or the peroxisome marker protein (PTS1), which also includes the Cherry protein, or incubated with fluor-conjugated concanavalin A (ConA), serving as an ER marker stain (Tartakoff and Vassalli, 1983). Processing of cells for immunofluorescence microscopy and viewing using CLSM are as described in the "Materials and Methods." Shown in the three panels on the right side of each row in **(A)** and **(B)** are images corresponding to a portion of the cell at higher magnification. Solid arrowheads indicate examples of the torus-shaped fluorescent structures containing the Myc-tagged TA protein delineating the spherical structures attributable to mitochondrial E1β or CoxII. The box in **(C)** represents the portion of the Cb5-6 and BCAT3-Cherry co-transformed cells shown at higher magnification in the panel to the right. Note also in **(C)** that only merged images of a Cb5-6 and PTS1 co-transformed cell or a Cb5-6-transformed cell stained with ConA are shown. Bar in **(A)** = 10 μm.

We next determined whether the selected proteins from **Table 1** adopt a TA orientation in mitochondria, i.e., N<sub>cytoplasm</sub>-C<sub>intermembrane space</sub>, by performing differential detergent permeabilization experiments. Permeabilization of BY-2 cells with Triton X-100 solubilizes all cellular membranes, and thus applied antibodies can access epitopes present on proteins inside organelles (e.g., mitochondrial matrix pyruvate dehydrogenase E1β) or in the cytoplasm (e.g., α-tubulin) (**Figure 2A**). Permeabilization of BY-2 cells with digitonin, however, results in solubilization of only the plasma membrane, and thus applied antibodies can access only epitopes present in the cytoplasm (**Figure 2A**). As shown in **Figure 2B**, differential permeabilization of BY-2 cells transiently expressing N-terminal-Myc-tagged TOM9-2 (Myc-TOM9-2), which is known to be oriented with its N terminus in the cytoplasm (Macasev et al., 2004; Hwang et al., 2008; Carrie et al., 2010), resulted in immunodetection of the protein in cells permeabilized with either Triton X-100 or digitonin, confirming the expected orientation of the protein's N terminus. By contrast, differential permeabilization of BY-2 cells expressing Myc-TOM40, which is a pore-forming subunit of the TOM complex (Macasev et al., 2004; Carrie et al., 2010) and known to be oriented in the OMM such that both its N and C termini face the intermembrane space (Hwang et al., 2008), prevented immunodetection of the protein in BY-2 cells that were permeabilized with digitonin, as also expected (**Figure 2B**).
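The reasoning behind the differential permeabilization assay can be written out as a small lookup. This is an illustrative sketch only: the names `ACCESSIBLE`, `is_detected`, and `infer_epitope_location` are hypothetical, and cellular compartments are reduced to two categories ("cytoplasm" and "organelle_interior") for simplicity.

```python
# Compartments whose epitopes are reachable by antibodies after each treatment,
# as described in the text: Triton X-100 opens all membranes, whereas
# digitonin opens only the plasma membrane.
ACCESSIBLE = {
    "triton_x100": {"cytoplasm", "organelle_interior"},
    "digitonin": {"cytoplasm"},
}


def is_detected(epitope_location: str, detergent: str) -> bool:
    """Predict whether an epitope at the given location is immunodetected."""
    return epitope_location in ACCESSIBLE[detergent]


def infer_epitope_location(detected_triton: bool, detected_digitonin: bool) -> str:
    """Infer where an epitope tag resides from the two staining results."""
    if detected_triton and detected_digitonin:
        return "cytoplasm"            # pattern reported for Myc-TOM9-2
    if detected_triton and not detected_digitonin:
        return "organelle_interior"   # pattern reported for Myc-TOM40
    return "undetermined"             # e.g. no expression or failed staining
```

For example, `infer_epitope_location(True, True)` corresponds to an N-terminal tag exposed to the cytoplasm, the result expected for a TA protein.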

The results of the differential permeabilization experiments in **Figure 2C** show that, like Myc-TOM9-2, all of the selected Myc-tagged proteins described in **Figure 1** were immunodetected in digitonin-permeabilized BY-2 cells, indicating that they are also orientated in the OMM such that their N termini are exposed to the cytoplasm. We did not test, however, any C-terminal-epitope-tagged versions of these proteins (or Myc-TOM9-2), since addition of an epitope tag or fluorescent protein to the C terminus of a TA protein, particularly mitochondrial-TA proteins, usually disrupts proper targeting (Borgese et al., 2003; Borgese and Fasana, 2011). Indeed, consistent with the mislocalization of the mitochondrial isoform of mammalian Cb5 with a C-terminal-appended epitope containing an N-glycosylation sequence (Maggio et al., 2007), we observed that Myc-tagged TraB with an added C-terminal-appended HA epitope tag (Myc-TraB-HA) did not properly target to mitochondria in BY-2 cells (**Figure 2D**). Nonetheless, the data presented in **Figures 1** and **2** indicate that (i) all of the selected proteins from **Table 1** target to the OMM when transiently expressed in tobacco BY-2 cells, (ii) these proteins likely adopt the expected TA topology, and (iii) tobacco BY-2 cells can serve as a useful model system to further characterize the targeting signals involved in mitochondrial localization.

indicating that the expressed protein, unlike Myc-TraB (**Figure 1**), is not properly targeted to mitochondria. Bar in **(A,C)** = 10μm.

#### **THE TA REGIONS OF DIBASIC MOTIF-CONTAINING PROTEINS ARE BOTH NECESSARY AND SUFFICIENT FOR TARGETING TO MITOCHONDRIA, WHILE THE TA REGIONS OF TOM TA PROTEINS ARE NECESSARY, BUT NOT SUFFICIENT FOR TARGETING**

To begin to characterize the targeting information in the two groups of OMM-TA proteins, we first tested whether their C-terminal TA regions, which include both the TMD and CTS, were necessary and/or sufficient for proper localization. Toward this end, we focused first on the group of proteins that contains the dibasic motif, or a divergent version thereof. As shown in **Table 1**, the CTS of TraB (i.e., -SRRK) closely matches the dibasic motif defined previously for mitochondrial Cb5, that is, -R-R/K/H-X<sup>X≠E</sup>. Similar to previous studies of Cb5 (Hwang et al., 2004), the C-terminal TA region of TraB was both necessary and sufficient for localization to mitochondria. As shown in **Figure 3A**, a mutant version of Myc-tagged TraB lacking its C-terminal 24 amino acid TA region (i.e., Myc-TraB-C24), was localized to the cytoplasm in BY-2 cells and not to CoxII-containing mitochondria, indicating that the TA region is necessary for its proper targeting. Addition of the TraB 24-amino-acid-long TA region to the C terminus of GFP (GFP-TraB+C24), on the other hand, resulted in mitochondrial localization (**Figure 3A**), indicating that the region is sufficient for targeting. Similarly, the TA sequences of Cb5-6, MIRO1, PMD2, as well as PMD1, the latter of which contains a more divergent dibasic motif in its CTS that harbors a single amino acid insertion between two basic residues (i.e., -SKLR), were all necessary and sufficient for mitochondrial localization (**Figure 3A**). Given the differences in the TMDs and CTSs of these proteins (**Table 1**), it appears that a certain degree of sequence divergence, including positioning of the dibasic motif relative to the end of the TMD and/or C terminus, the length of the CTS, and, with respect to PMD1, the contiguous nature of the dibasic sequence itself, can be tolerated while maintaining proper localization to mitochondria.

In contrast to the mitochondrial-TA proteins that contain a dibasic motif in their CTS, the TA regions of TOM proteins were necessary, but not sufficient for mitochondrial localization (**Figure 3B**). That is, with one exception, all of the TOM-TA proteins were mislocalized to the cytoplasm [or did not express, possibly due to instability/degradation of the truncated protein (i.e., TOM9-1-C61)] when their TMD and CTS were removed, while the addition of these sequences to the C terminus of GFP resulted in localization of the fusion proteins either to the cytoplasm or, in the case of GFP-TOM20-4+C37, to the ER (**Figure 3B**). The one exception was TOM20-1, where the TA region was both necessary and sufficient for mitochondrial targeting (**Figure 3B**). Interestingly, TOM20-1 is distinct from the other TOM TA proteins in that its CTS is relatively short and includes a dibasic sequence similar to the other group of proteins (**Table 1**). Overall, these data indicate that the targeting information in the two groups of mitochondrial-TA proteins is essentially different, and that for most of the TOM-TA proteins, sequences in addition to their TA regions are required for proper targeting to mitochondria.
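The necessity/sufficiency reasoning used above follows a simple rule: a region is necessary if removing it abolishes mitochondrial localization, and sufficient if fusing it to GFP confers mitochondrial localization. A minimal Python sketch of that rule is given below; the function name, the localization labels ("mito", "cyt", "ER", "DNE"), and the use of `None` for an undetermined outcome are assumptions made for illustration.

```python
def classify_ta_region(deletion_localization, gfp_fusion_localization, target="mito"):
    """Return (necessary, sufficient) for a TA region.

    deletion_localization: where the protein lacking its TA region is found.
    gfp_fusion_localization: where GFP carrying only the TA region is found.
    "DNE" (did not express) leaves the corresponding call undetermined (None).
    """
    if deletion_localization == "DNE":
        necessary = None
    else:
        necessary = deletion_localization != target

    if gfp_fusion_localization == "DNE":
        sufficient = None
    else:
        sufficient = gfp_fusion_localization == target

    return necessary, sufficient


# Outcomes as described in the text for three representative proteins.
observations = {
    "TraB": ("cyt", "mito"),      # dibasic-motif-containing protein
    "TOM20-4": ("cyt", "ER"),     # GFP fusion went to the ER
    "TOM20-1": ("cyt", "mito"),   # the one TOM-TA exception
}
for name, (deletion, fusion) in observations.items():
    print(name, classify_ta_region(deletion, fusion))
```

Treating a construct that did not express as undetermined, rather than as evidence of necessity, is a deliberately conservative choice in this sketch.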

### **THE CTSs OF BOTH GROUPS OF MITOCHONDRIAL-TA PROTEINS ARE NECESSARY FOR TARGETING, BUT ONLY THOSE CONTAINING A DIBASIC MOTIF ARE INTERCHANGEABLE**

The CTSs of most TA proteins contain information that, along with their TMD, is important for ensuring that they are targeted to the proper intracellular membrane (Borgese and Fasana, 2011; Abell and Mullen, 2011). As such, removal of the CTS from the TA region or swapping the CTS with the CTS of another TA protein that localizes to a different organelle usually results in mislocalization. On the other hand, swapping of the CTSs of TA proteins that localize to the same organelle sometimes preserves targeting, at least in those cases where the proteins contain a conserved targeting signal(s).

To test whether the CTSs of mitochondrial-TA proteins that possess a dibasic motif are interchangeable, a series of truncation and chimeric mutants was generated and their localization was assessed in transiently-transformed BY-2 cells as either mitochondrial or not mitochondrial based on comparison to endogenous CoxII immunostaining in the same cells (**Figure 4A**). Deletion of the CTS from either Cb5-6 (Cb5-6-SRKT) or TraB (TraB-SRRK) disrupted their localization to mitochondria, indicating that the CTSs of these two proteins, similar to tung Cb5D (Hwang et al., 2004), contain essential mitochondrial targeting information. On the other hand, chimeric versions of Cb5-6 and TraB, whereby their CTSs were swapped, i.e., Cb5-6-TraBCTS and TraB-Cb5-6CTS, localized to mitochondria (**Figure 4A**). Similarly, the targeting of Cb5-6 to mitochondria was preserved when its CTS was replaced with the CTS from either PMD2, APX-5, or MIRO1 (**Figure 4A**), revealing that a CTS containing an acidic amino acid (i.e., glutamic acid in APX-5 [-EASRRGK]), as well as a longer CTS (i.e., 8 amino acids in MIRO1 [-ATRKSSSA]), is acceptable for targeting to mitochondria in a chimeric context (**Figure 4A**). Likewise, replacement of the longer CTS of MIRO1 with the shorter CTS of Cb5-6 (MIRO1-Cb5-6CTS) preserved mitochondrial localization (**Figure 4A**). As shown also in **Figure 4A**, deletion of the CTS from PMD1, which contains a more divergent dibasic motif harboring a single amino acid insertion between two basic residues (i.e., -SKLR), disrupted its mitochondrial localization, yet chimeric versions of PMD1 and either Cb5-6 (PMD1-Cb5-6CTS, Cb5-6-PMD1CTS) or TraB (TraB-PMD1CTS) localized to mitochondria (**Figure 4A**). Collectively, these results indicate that not only are the CTSs of TA proteins that include a dibasic motif functionally interchangeable in terms of mitochondrial targeting, but that the dibasic motif can be positioned within CTSs of various lengths. Moreover, these results expand the previous definition of the dibasic motif as -R-R/K/H-X<sup>X≠E</sup> (Hwang et al., 2004), revealing that a dibasic-containing CTS can tolerate a negatively-charged residue, albeit if located further upstream, and that the dibasic amino acids do not have to be contiguous.

**FIGURE 3 | Localization of various C-terminal mutant and GFP fusions of selected mitochondrial-TA proteins in BY-2 cells.** Shown on the left in both **(A)** and **(B)** are schematic illustrations of various C-terminal-mutant (i.e., truncated) versions or GFP fusions of various dibasic-motif-containing TA proteins **(A)** or TOM-TA proteins **(B)** and their corresponding intracellular localization in transformed BY-2 cells. The numbers in the name of each construct denote the number of residues that were either deleted from the C terminus of the Myc-tagged wild-type TA protein or fused to the C terminus of GFP, and the numbers above each illustration correspond to the N- and C-terminal amino acid residues of the TA protein. Portions of the TA protein are represented in the illustrations by white and black boxes, the latter denoting the putative TMD; green boxes denote GFP. Cyt, cytoplasm; DNE, did not express; ER, endoplasmic reticulum; mito, mitochondria. Shown on the right in both **(A)** and **(B)** are representative immuno-epifluorescence micrographs illustrating the localization of the various constructs shown on the left. Each micrograph is labeled with the name of either the transiently-expressed Myc-tagged C-terminal mutant or GFP fusion protein, the endogenous mitochondrial marker protein, CoxII, or ConA. Boxes in the top row of **(A)** represent the portions of cells shown at higher magnification in the panels to the right. Arrowheads indicate examples of the torus-shaped fluorescent structures containing GFP-TraB+C24 delineating the spherical structures attributable to matrix-localized CoxII, indicating that GFP-TraB+C24 localizes to the OMM. For all other expressed proteins, only general (i.e., lower magnification) fluorescence patterns were compared with those of mitochondrial CoxII or, in the case of GFP-TOM20-4+C37, ConA-stained ER. Note also that cells transformed with Myc-TOM9-1-C61, which did not display a detectable immunofluorescence signal, were identified based on the fluorescence attributable to co-expressed β-ATPase-GFP, serving as cell transformation and mitochondrial matrix marker protein. Bar in **(A)** = 10μm.

We also examined a similar series of CTS truncation and chimeric TOM-TA proteins. As shown in **Figure 4B**, deletion of the CTS from TOM20-1, which, unlike the other TOM-TA proteins, contains a dibasic motif in its CTS (i.e., -RKLR), resulted in mislocalization. However, addition of the Cb5-6 CTS to this TOM20-1 mutant (TOM20-1-Cb5-6CTS) restored mitochondrial localization (**Figure 4B**). Similarly, a chimera consisting of Cb5-6 with its CTS replaced with the CTS of TOM20-1 (Cb5-6-TOM20-1CTS) localized to mitochondria (**Figure 4B**), indicating that the TOM20-1 CTS is functionally interchangeable with the CTSs from other (non-TOM) TA proteins that possess a dibasic motif. The CTSs of other TOM-TA proteins, however, were not interchangeable with the CTS of Cb5-6. For instance, deletion of the 28-amino-acid-long CTS of TOM9-1 resulted in the modified protein (TOM9-1-X25-RGL) being mislocalized, which was still the case when the Cb5-6 CTS was appended to this mutant (TOM9-1-Cb5-6CTS) (**Figure 4B**). Furthermore, the longer CTS of TOM9-1 could not functionally replace the shorter, dibasic-motif-containing CTS of Cb5-6, i.e., Cb5-6-TOM9-1CTS was not localized to mitochondria (**Figure 4B**). Similarly, deletion of the CTS from TOM20-4 (i.e., TOM20-4-SQTPVSR), which has a significantly shorter CTS than TOM9-1 [i.e., 10 vs. 28 amino acid residues (**Table 1**)], abolished mitochondrial targeting. Moreover, the CTSs of TOM20-4 and Cb5-6 could not be exchanged, since both TOM20-4-Cb5-6CTS and Cb5-6-TOM20-4CTS were mislocalized (**Figure 4B**). In fact, even replacement of the TOM20-4 CTS with the similar CTS of TOM20-3 (**Table 1**) resulted in the chimeric protein (TOM20-4-TOM20-3CTS) being mislocalized (**Figure 4B**), revealing that the CTSs of even two closely related TOM proteins (isoforms) could not be exchanged without disrupting proper targeting to mitochondria.

**FIGURE 4 | Localization of various CTS mutants and hybrid versions of selected mitochondrial-TA proteins in BY-2 cells.** Shown on the left in both **(A)** and **(B)** are schematic illustrations of various CTS mutant (truncated) or hybrid versions of selected dibasic-motif-containing TA proteins **(A)** and/or TOM-TA proteins **(B)** and their corresponding localization (or lack thereof) to mitochondria in transformed BY-2 cells. The names of the mutant and hybrid constructs represent either the specific amino acids in the CTS that were deleted from the protein or replaced with the CTS from another protein. All constructs possess an N-terminal-appended Myc-epitope tag. Shown for each construct is the corresponding C-terminal amino acid sequence, including putative TMD (underlined) and modified CTS (bolded), or lack thereof. Mitochondrial localization (indicated as "Yes" or "No") was assessed based on colocalization (or lack thereof) of the expressed protein and the endogenous mitochondrial CoxII. Shown on the right in both **(A)** and **(B)** are representative immuno-epifluorescence micrographs illustrating the localization of the various constructs shown on the left. Each micrograph is labeled with the name of either the expressed Myc-tagged CTS mutant or hybrid protein or endogenous CoxII. Bar in **(A)** = 10μm.

Collectively, the results for the TOM proteins indicate that while their CTSs are necessary for targeting, they likely require other unique sequences within the context of the TMD and/or elsewhere in the native protein to function properly. This conclusion is perhaps not surprising given that the targeting of at least some TOM-TA proteins in other organisms has been shown to rely on unique sequences upstream of their CTSs, including for *S. cerevisiae* TOM22 (Rodriguez-Cousino et al., 1998; Egan et al., 1999). Moreover, the C-terminal targeting information in *S. cerevisiae* TOM5 is not conserved in other TOM-TA proteins, since the TOM5 and TOM6 TMDs are not functionally interchangeable (Horie et al., 2003). The involvement of sequences upstream of the TA region is not unique to the targeting of TOM-TA proteins, since both the TA protein subunits of the translocon at the chloroplast outer membrane (i.e., TOC33 and TOC34) appear to rely on almost the entire protein for proper targeting (Chen and Schnell, 1997; Horie et al., 2003). One possible explanation for this is that the TA-protein subunits of the translocon of mitochondria and chloroplasts have more complex requirements in terms of their biogenesis: a multi-step process that begins with their targeting to the surface of the proper organelle, followed by their insertion into the lipid bilayer, and eventually their assembly into a functional multi-protein complex. Furthermore, all of these steps are considered to be dependent primarily on the proteins themselves, given their roles as receptor proteins. As such, it is reasonable that sequences in addition to those in their TA regions have a role(s) in the targeting of TOM-TA proteins, as well as performing other distinct functions such as insertion, assembly, stabilization, and turnover (Habib et al., 2003).

#### **MUTATIONAL ANALYSIS OF THE DIBASIC MOTIF**

Given that the targeting information in the TOM proteins is likely to be more complex, and not necessarily conserved between members of the TOM family (**Figure 4B**), we chose to focus on gaining a better understanding of the nature of the dibasic targeting motif in the second group of proteins. Toward this end, we carried out a mutational analysis of the dibasic motif and other adjacent sequences in the CTS. We focused initially on using the TraB protein as a template, since this protein has not been extensively characterized to date.

The TraB CTS is -SRRK (**Table 1**), and we first examined the importance of the length and positioning of the charged residues within the CTS relative to the TMD and C terminus. As shown in **Figure 5**, while singular deletions of the C-terminal lysine (TraB-SRR) or the serine at the -4 position (TraB-RRK) had no apparent effect on the mitochondrial localization, deletion of the last two amino acids in the CTS, leaving just -SR-, resulted in the modified protein (TraB-SR) being mislocalized. Similarly, amino acid deletions leaving just -RR- (TraB-RR) or -RK- (TraB-RK) also resulted in mislocalization (**Figure 5**). These results indicate that a dibasic sequence alone is not sufficient for mitochondrial targeting of TraB, which is consistent with the previously published results for tung Cb5D (Hwang et al., 2004), whereby the dibasic sequence must be positioned within a CTS at least three amino acids in length.


**FIGURE 5 | Localization of various CTS mutant versions of the mitochondrial-TA protein TraB in BY-2 cells.** Shown on the **left** are schematic illustrations of wild-type and various CTS mutant versions of TraB and their corresponding localization (or lack thereof) to mitochondria in transformed BY-2 cells. The names of the mutant constructs represent the specific amino acids in their modified CTSs. All constructs possess an N-terminal-appended Myc-epitope tag. Shown for each construct is the corresponding C-terminal amino acid sequence, including putative TMD (underlined) and modified (or wild-type) CTS; additional amino acid residues inserted into the TraB CTS (i.e., threonines) are bolded. Mitochondrial localization (indicated as "Yes" or "No") was assessed based on colocalization (or lack thereof) of the expressed protein and endogenous mitochondrial CoxII. Shown on the **right** are representative immuno-epifluorescence micrographs illustrating the localization of the various constructs shown on the **left**. Each micrograph is labeled with the name of the expressed Myc-tagged wild-type TraB or CTS mutant version of TraB, or endogenous CoxII. Bar = 10μm.

To determine how far the dibasic motif could be positioned relative to the TMD or C terminus of TraB, threonine residues were added either before or after the CTS. As shown in **Figure 5**, mitochondrial targeting of TraB was preserved when three or four threonines were added to the end of its CTS (TraB-SRRKTx3 and TraB-SRRKTx4), but the addition of five threonines (TraB-SRRKTx5) resulted in mislocalization. Similarly, the insertion of three threonines between the predicted border of the TMD and the CTS of TraB (TraB-Tx4SRKK) maintained mitochondrial targeting, but not when five threonines were inserted at this position (TraB-Tx5SRKK) (**Figure 5**). Together, these results and those presented previously on the impact of added threonine residues before or after the CTS in tung Cb5D (Hwang et al., 2004) reveal that the dibasic motif can tolerate being positioned at least four amino acids from either the TMD or C terminus. Our results also reveal that while a certain degree of sequence divergence exists in the CTSs of the mitochondrial-TA proteins that possess a dibasic motif, most of these proteins contain a serine residue between the predicted end of the TMD and the dibasic sequence (see **Table 1**), suggesting that this residue might be important for conveying the proper functional context for the dibasic targeting signal. Indeed, several other discrete targeting signal motifs appear to rely on adjacent so-called "accessory" or "secondary" amino acid residues for efficient function, including nuclear localization signals (Dussert et al., 2013), ER membrane retrieval motifs (Gidda et al., 2009), and peroxisomal matrix targeting signals (Kunze et al., 2011; Lingner et al., 2011).

We next investigated the identity of the basic residues in the dibasic sequence itself, and our template for these experiments was the chimera TraB-Cb5-6CTS, which as described above (**Figure 4**), consists of the TraB protein with its CTS replaced with the CTS of Cb5-6 (-SRKT). We chose this sequence since the CTS has just two basic amino acids. As shown in **Figure 6**, this -RK- dibasic sequence could be replaced by a variety of other combinations of basic amino acids without disrupting mitochondrial targeting. That is, arginine, lysine, and histidine were all acceptable at either the first or second position of the dibasic sequence, as long as it was paired with either another arginine or lysine residue. Only a dibasic sequence consisting of two histidine residues (i.e., TraB-SHHT) abolished mitochondrial targeting (**Figure 6**).

We tested also whether the dibasic sequence in TraB-Cb5-6CTS could tolerate an amino acid insertion between the two basic residues, similar to the CTS of PMD1 (-SKLR). As shown in **Figure 6**, insertion of a single threonine residue between the dibasic residues of TraB-Cb5-6CTS (TraB-SRTKT) had no effect on localization to mitochondria, whereas the insertion of two threonine residues (TraB-SRTTKT) disrupted mitochondrial targeting. In line with these latter results, insertion of a threonine residue alongside the intervening leucine residue in the CTS of PMD1 (TraB-SKLTR) also disrupted mitochondrial targeting (**Figure 6**). These data reinforce the notion that the dibasic motif does not have to be contiguous, and that the motif can tolerate one, but not two, amino acid residues in between the two basic residues.


**FIGURE 6 | Localization of various CTS mutant versions of the mitochondrial-TA hybrid protein TraB-Cb5-6CTS in BY-2 cells.** Shown on the **left** are schematic illustrations of wild-type and various CTS mutant versions of the hybrid protein TraB-Cb5-6CTS and their corresponding localization (or lack thereof) to mitochondria in transformed BY-2 cells. The names of the mutant constructs represent the specific amino acids in their CTSs. All constructs possess an N-terminal-appended Myc-epitope tag. Shown for each construct is the corresponding C-terminal amino acid sequence, including putative TraB TMD (underlined) and modified (or wild-type) Cb5-6 CTS; modified or additional amino acid residues inserted into the Cb5-6 CTS are bolded. Mitochondrial localization (indicated as "Yes" or "No") was assessed based on colocalization (or lack thereof) of the expressed protein and endogenous mitochondrial CoxII. Shown on the **right** are representative immuno-epifluorescence micrographs illustrating the localization of the various constructs shown on the **left**. Each micrograph is labeled with the name of the expressed Myc-tagged TraB-Cb5-6CTS or CTS mutant version of TraB-Cb5-6CTS, or endogenous CoxII. Bar = 10μm.

Given the apparent importance of basic amino acid residues in the dibasic motif, and that, with the exception of APX-5, all of the mitochondrial-TA proteins that possess a dibasic motif also lack an acidic amino acid in their CTS (see **Table 1**), we examined next the influence of adjacent acidic residues on the dibasic motif. As shown in **Figure 7**, replacement of the C-terminal amino acid residue in the CTS of TraB, Cb5-6, and PMD2, with a glutamic acid (TraB-SRRE, Cb5-6-SRKE, and PMD2-SRRE), which places this acidic residue immediately downstream of the dibasic sequence, abolished mitochondrial targeting. These results are consistent with those published previously for tung Cb5D, whereby "X" in the dibasic motif could not be a glutamate (Hwang et al., 2004). Replacement of the C-terminal amino acid residue in TraB or Cb5-6 with an aspartate residue (TraB-SRRD and Cb5-6-SRKD), however, did not disrupt mitochondrial targeting (**Figure 7**). Moreover, the addition of a glutamic acid residue to the C-terminal end of the TraB CTS (TraB-SRRKE), did not abolish targeting, nor did placement of a glutamic acid before the dibasic sequence in TraB (TraB-SERRK or TraB-ERRK) (**Figure 7**). The results of the latter two constructs were complicated, however, by the fact that there are three basic residues in the CTS of TraB (-SRRK), and so it is not entirely clear which (or both) of the two pairs of basic residues serve as the targeting signal. To address this, we employed TraB-Cb5-6CTS as an alternate template, since the CTS of this chimera consists of only a single dibasic sequence (i.e., -SRKT). As shown in **Figure 7**, insertion of a glutamic acid residue just before the dibasic sequence in the CTS of TraB-Cb5-6CTS (TraB-SERKT) did not disrupt targeting to mitochondria. On the other hand, insertion of a glutamic acid residue between the dibasic amino acids in TraB-Cb5-6CTS (TraB-SREKT) did abolish mitochondrial targeting (**Figure 7**).

Taken together, these data significantly extend our understanding of the dibasic motif by revealing that an acidic amino acid is tolerated when located upstream of the dibasic motif, but not when placed downstream or within the motif, although in the latter instance aspartic acid is tolerated at the downstream position.

#### **BIOINFORMATICS ANALYSIS USING THE EXPANDED DIBASIC MOTIF IDENTIFIES NEW OMM-TA PROTEINS**

**FIGURE 7 | Localization of various CTS mutant versions of mitochondrial-TA proteins in BY-2 cells.** Shown on the **left** are schematic illustrations of wild-type and/or various CTS mutant versions of TraB, Cb5-6, or PMD2 and their corresponding localization (or lack thereof) to mitochondria in transformed BY-2 cells. The names of the mutants represent the specific amino acids in their modified CTSs. All constructs also possess an N-terminal-appended Myc-epitope tag. Shown for each construct is the corresponding C-terminal amino acid sequence, including putative TMD (underlined) and modified (or wild-type) CTS from TraB, Cb5-6 and PMD2; modified amino acid residues in the protein's CTS are bolded. Mitochondrial localization (indicated as "Yes" or "No") was assessed based on colocalization (or lack thereof) of the expressed protein and endogenous mitochondrial CoxII. Shown on the **right** are representative immuno-epifluorescence micrographs illustrating the localization of the various constructs shown on the **left**. Each micrograph is labeled with the name of the expressed Myc-tagged wild-type and/or CTS mutant version of TraB, Cb5-6, or PMD2 or endogenous CoxII. Bar = 10μm.

Based on the mutational analysis described above (**Figures 5**–**7**), we developed an expanded version of the dibasic targeting signal motif that accounts for the sequence variability now known to be acceptable for TA protein sorting to mitochondria in plant cells: -R/K/H-X<sup>0,1; ≠E</sup>-R/K/H<sup>≠ -H-H- or -H-X-H-</sup>-X<sup>0,1; ≠E</sup>-X<sup>0,3</sup> <sup>CTS=3–8</sup>, whereby the dibasic sequence can be any two basic amino acid residues, other than two histidines, and can be contiguous or separated by any one amino acid residue, other than a glutamic acid. The dibasic sequence can be also positioned 0 to 4 amino acid residues away from the C terminus, as long as the residue immediately downstream of the dibasic sequence is not a glutamic acid. Finally, the motif is considered to function only in the context of a CTS that is 3–8 amino acid residues in length, although this criterion is not as strict since there are no fully reliable methods for predicting the end(s) of a TMD and thus the precise length of the CTS.
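The rules of the expanded motif, as stated in the text, can be checked mechanically. The following Python function is a minimal sketch under two assumptions: the CTS is supplied directly as a one-letter amino acid string (TMD boundary prediction is out of scope), and the 3–8 residue CTS length is applied as a hard cut-off even though the text notes that this criterion is not strict. The function name is hypothetical, and this is not the procedure used by the authors for their search.

```python
BASIC = set("RKH")


def has_expanded_dibasic_motif(cts: str) -> bool:
    """Check a CTS (one-letter code) against the expanded dibasic motif."""
    n = len(cts)
    if not 3 <= n <= 8:                      # CTS length window (treated as strict here)
        return False
    for i in range(n):
        if cts[i] not in BASIC:
            continue
        for gap in (0, 1):                   # contiguous, or one intervening residue
            j = i + 1 + gap
            if j >= n or cts[j] not in BASIC:
                continue
            if cts[i] == "H" and cts[j] == "H":
                continue                     # two histidines are not accepted
            if gap == 1 and cts[i + 1] == "E":
                continue                     # intervening residue may not be E
            tail = cts[j + 1:]
            if len(tail) > 4:
                continue                     # pair must lie 0-4 residues from the C terminus
            if tail and tail[0] == "E":
                continue                     # residue right after the pair may not be E
            return True
    return False


# CTS sequences taken from the constructs described in the text.
for cts in ["SRRK", "SRKT", "SKLR", "SRTKT", "SRTTKT", "SHHT", "SRRE", "SRRD", "SREKT"]:
    print(cts, has_expanded_dibasic_motif(cts))
```

Enumerating every candidate pair, rather than stopping at the first basic residue, matters for a CTS such as -SRRK, where more than one pair of basic residues could serve as the signal.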

Using this new dibasic motif and also taking into consideration that OMM-TA proteins are traditionally defined as possessing a single C-terminal TMD of relatively moderate hydrophobicity and devoid of a cleavable N-terminal presequence (Borgese et al., 2003; Abell and Mullen, 2011), we performed a bioinformatics search of all the TA proteins predicted for the *A. thaliana* deduced proteome (Kriechbaumer et al., 2009; Pedrazzini, 2009; Dhanoa et al., 2010) and by doing so we identified a total of 32 proteins (**Table 2**). Among these proteins were all of the dibasic motif-containing OMM-TA proteins from **Table 1**. Interestingly, the list also included a protein annotated as a third isoform of MIRO (MIRO3) (Yamaoka and Leaver, 2008), as well as several other proteins that are annotated by SUBA or AmiGO to be localized to and/or function at mitochondria, including a rhomboid-like protein (At1g18600), a putative lipoprotein (At4g31030), and several unknown proteins (i.e., At1g72020, At4g38490, and At5g35470). In addition, a number of proteins (i.e., 18 of 32) in **Table 2** are annotated by SUBA as being uncharacterized in terms of their intracellular localization.
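A screen of this kind can be sketched as a simple filter over annotated candidates. The Python example below is illustrative only: the `Candidate` record, its fields, the toy sequences, and the assumption that TMD spans and presequence calls come from external prediction tools are all hypothetical, and the hydrophobicity criterion mentioned above is omitted. The regular expression is a compact form of the expanded dibasic motif, anchored at the C terminus.

```python
import re
from dataclasses import dataclass

# Basic pair (not two histidines) with an optional non-E spacer, followed by
# 0-4 trailing residues whose first residue is not E, anchored at the C terminus.
DIBASIC_RE = re.compile(r"(?:[RK][^E]?[RKH]|H[^E]?[RK])(?:[^E].{0,3})?$")


@dataclass
class Candidate:
    name: str               # hypothetical identifier
    sequence: str           # one-letter amino acid sequence
    tmd_spans: list         # (start, end) half-open indices from a TMD predictor
    has_presequence: bool   # cleavable N-terminal presequence predicted?


def is_putative_omm_ta(c: Candidate) -> bool:
    """Single TMD, no presequence, and a 3-8 residue CTS carrying the motif."""
    if len(c.tmd_spans) != 1 or c.has_presequence:
        return False
    cts = c.sequence[c.tmd_spans[0][1]:]     # residues after the single TMD
    return 3 <= len(cts) <= 8 and DIBASIC_RE.search(cts) is not None


def toy(name, head, tmd, cts, presequence=False):
    """Build a toy candidate so that the TMD span is computed, not hand-typed."""
    return Candidate(name, head + tmd + cts,
                     [(len(head), len(head) + len(tmd))], presequence)


TMD = "LIVLALLVGAFLYWLF"                     # made-up hydrophobic stretch
candidates = [
    toy("toy_pass", "MSG" + "A" * 30, TMD, "SRKT"),
    toy("toy_two_his", "MSG" + "A" * 30, TMD, "SHHT"),
    toy("toy_presequence", "MSG" + "A" * 30, TMD, "SRKT", presequence=True),
    toy("toy_long_cts", "MSG" + "A" * 30, TMD, "SRKTTTTTT"),
    Candidate("toy_two_tmds", "A" * 70 + "SRKT", [(10, 30), (50, 70)], False),
]
hits = [c.name for c in candidates if is_putative_omm_ta(c)]
print(hits)
```

Only the first toy record passes all filters; the others fail on the histidine pair, the presequence, the CTS length, or the number of TMDs, respectively.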

To determine whether any of these proteins represented *bona fide* OMM-TA proteins, we focused on At1g55450, which is of unknown function, but, based on information provided at TAIR, does contain an S-adenosyl-L-methionine (SAM)-dependent methyltransferase domain in its N-terminal region. To investigate whether At1g55450 was localized to mitochondria, we transiently expressed an N-terminal Myc-tagged version in BY-2 cells and visualized the cells using immunostaining and CLSM. This analysis revealed that the protein localized to ring-shaped structures that encircled the punctate immunofluorescence pattern of endogenous CoxII (**Figure 8A**), similar to the localization pattern of other OMM-TA proteins presented in **Figure 1**. Notably, similar OMM localization was also observed for another protein from **Table 2**, namely MIRO3 (**Figure 8A**).

We next determined whether the unknown protein At1g55450 adopted a TA protein topology, with the N terminus orientated toward the cytoplasm. As shown in **Figure 8B**, the Myc epitope appended to the N terminus of the At1g55450 protein (Myc-At1g55450) was indeed detectable in either Triton X-100- or digitonin-permeabilized BY-2 cells, similar to the other TA proteins presented in **Figure 2**. To confirm that this protein could integrate into mitochondrial membranes, we performed *in vitro* mitochondrial targeting studies. The protein was first synthesized and radiolabeled using *in vitro* transcription/translation reactions then incubated with isolated mitochondria. Following incubation, mitochondria were washed with alkaline sodium carbonate to remove peripherally-associated proteins, then membranes were pelleted by centrifugation. As shown in **Figure 8C**, Myc-At1g55450, as well as Myc-TraB, was resistant to alkaline sodium carbonate extraction, which operationally defines both of these proteins as integral membrane proteins.

Consistent also with the other dibasic-motif-containing mitochondrial-TA proteins, deletion of the C-terminal TA region of Myc-At1g55450 (Myc-At1g55450-C26) abolished mitochondrial targeting, while fusion of this same region to GFP (GFP-At1g55450+C26) resulted in localization to mitochondria (**Figure 8D**). As such, the TA region was both necessary and sufficient for targeting to mitochondria, similar to the other dibasic-containing proteins (**Figure 3**). Finally, stable expression of Cherry-tagged At1g55450 in transgenic *A. thaliana*, also transformed with a mitochondrial marker fusion protein consisting of the *Nicotiana plumbaginifolia* β-ATPase N-terminal mitochondrial presequence fused to GFP [(Mito-GFP) (Logan and Leaver, 2000)], resulted in colocalization of the two proteins in living cells in roots and hypocotyls (**Figure 8E**). Taken together, these data clearly define At1g55450 as a new OMM-TA protein in plants. Whether the other proteins in **Table 2** are also *bona fide* members of the OMM subproteome remains to be determined.

# **CONCLUSIONS**

While in recent years considerable progress has been made toward understanding the biogenesis of TA proteins in yeast and mammals, relatively little is known about how these proteins are properly partitioned within plant cells. Compounding this problem is the added complexity associated with how TA proteins must differentiate between not only mitochondria and ER (as they do in yeast and mammals), but also, unique to plant cells, plastids (reviewed in Abell and Mullen, 2011; Kim and Hwang, 2013; Lee et al., 2014). One main reason for this paucity of knowledge is that so few plant TA proteins have been identified, let alone characterized in terms of their organelle-specific targeting information and/or the cellular machinery involved in their membrane import and assembly. Moreover, at least some of the plant TA proteins examined to date appear to localize to different compartments depending on cell type and/or experimental condition (Scott et al., 2006; Maggio et al., 2007; Lingard et al., 2008; Zhang and Hu, 2009; Aung and Hu, 2011; Sun et al., 2012; Ruberti et al., 2014), implying that the targeting of TA proteins in plants is even more complex, and substantially different from the targeting of TA proteins in yeast and mammals.

Herein we describe an important step toward a better understanding of plant TA protein biogenesis by showing that a dibasic targeting signal motif that was previously identified in a mitochondrial isoform of Cb5 (Hwang et al., 2004) is also present in a number of other mitochondrial-TA proteins in plants, and that this motif is absent in all but one of the TOM-TA proteins. We also showed that the motif is far more divergent than previously defined, including the acceptable combination of basic amino acids, the contiguous nature of the dibasic sequence, the length of the CTS, and the relative position of the dibasic sequence within the CTS (**Figures 5**–**7**). Furthermore, we utilized this motif to identify a number of new, putative OMM-TA proteins (**Table 2**), including a protein that, while annotated to be of unknown function, is predicted to possess an N-terminal SAM-dependent methyltransferase domain and thus might operate in a corresponding manner at the cytosolic surface of the mitochondrion. Of course, one important caveat of this study is that the consensus targeting sequence we define here for OMM-TA proteins, like most other discrete sequence-specific targeting signal motifs (e.g., PTS1, -HDEL, and dilysine ER retrieval signals, etc.), is probably highly context dependent. As such, it is not unreasonable to expect that additional variations of the motif

#### **Table 2 | Candidate** *Arabidopsis* **TA proteins containing a putative OMM dibasic targeting signal motif<sup>a</sup>.**


*<sup>a</sup> All of the proteins listed are known or predicted to possess a TA orientation (Kriechbaumer et al., 2009; Pedrazzini, 2009; Dhanoa et al., 2010) and also contain a C-terminal mitochondrial dibasic targeting signal motif according to the results of the mutational analysis of selected TA proteins presented in this study (see Figures 5–7). See text, including Materials and Methods, for additional details.*

*<sup>b</sup> AGI number represents the systematic designation given to each locus, gene, and its corresponding protein product(s) in TAIR. Proteins are listed in ascending order based on their AGI number.*

*<sup>c</sup> Common nomenclature of Arabidopsis TA proteins based on published literature and TAIR. Proteins indicated with an asterisk are also listed in Table 1; proteins indicated with two asterisks were experimentally characterized in terms of their intracellular localization and, for At1g55450, also TA orientation, membrane integration, and C-terminal-tail-dependent targeting; see Figure 8 and text for additional details.*

*<sup>d</sup> Shown for each protein is its deduced C-terminal tail sequence, including its putative TMD (underlined), based on TOPCONS and visual inspection, and its downstream CTS. Shaded are the dibasic amino acid residues within the mitochondrial dibasic targeting signal motif (i.e., -R/K/H-X{0,1≠E}-R/K/H{≠−H−H− or −H−X−H−}-X{0,1≠E}X{0,3}{CTS=3,8}) found in the CTS of all the proteins shown.*

*<sup>e</sup> Shown for each protein is its intracellular localization(s) based on published proteomics and/or GFP localization results presented at SUBA3 (The SUBcellular localization database for Arabidopsis proteins) (Tanz et al., 2013). Abbreviations: ER, endoplasmic reticulum; Mito, mitochondria; Perox, peroxisome; PM, plasma membrane; Vac, vacuole.*

*<sup>f</sup> Shown for each protein is "cellular component" ontology, i.e., where a gene product is located in, or is a subcompartment of, a particular cellular component, based on the AmiGO search tool at the Gene Ontology database (http://www.geneontology.org) (Ashburner et al., 2000). Abbreviations: Cyt, cytosol; TGN, trans-Golgi network; see also above (footnote e).*
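The dibasic motif given in footnote d can be read as a simple sequence pattern, and a short sketch may help make that reading concrete. The Python function below is only an illustration under stated assumptions: it takes the motif to mean two basic residues (R, K, or H) that are either adjacent or separated by one residue other than E, excluding the H-H and H-X-H combinations, followed by at most four residues up to the C terminus (the first of which is not E when all four are present), within a CTS of 3 to 8 residues. The function name `has_dibasic_motif` and the test strings are hypothetical and are not sequences from Table 2.

```python
BASIC = set("RKH")  # residues allowed at the two "dibasic" positions


def has_dibasic_motif(cts: str) -> bool:
    """Return True if `cts` fits one literal reading of the footnote-d motif.

    Assumed reading (an interpretation, not the authors' own procedure):
    basic, optional single non-E spacer, basic (the pair must not be H/H),
    then at most four trailing residues, the first of which may not be E
    when all four are present; the whole CTS is 3 to 8 residues long.
    """
    cts = cts.upper()
    if not 3 <= len(cts) <= 8:
        return False
    for i, first in enumerate(cts):
        if first not in BASIC:
            continue
        for gap in (0, 1):  # 0 = contiguous pair, 1 = one spacer residue
            j = i + 1 + gap
            if j >= len(cts) or cts[j] not in BASIC:
                continue
            if gap == 1 and cts[i + 1] == "E":
                continue  # the spacer between the basic residues must not be E
            if first == "H" and cts[j] == "H":
                continue  # -H-H- and -H-X-H- are excluded
            tail = cts[j + 1:]
            if len(tail) > 4 or (len(tail) == 4 and tail[0] == "E"):
                continue  # too far from the C terminus, or E in the non-E slot
            return True
    return False
```

Under this reading, a toy CTS such as `"LRKQ"` matches, whereas `"LHHQ"` (histidine pair) and `"AREKA"` (glutamate spacer) do not. Because the motif is described in the text as highly context dependent, a match of this kind would only nominate a candidate, not establish mitochondrial targeting.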

**FIGURE 8 | Localization, topology, membrane insertion, and C-terminal targeting-signal analysis of a novel OMM-TA protein, At1g55450. (A)** Representative CLSM micrographs illustrating the localization of N-terminal Myc-tagged At1g55450 or MIRO3 to the OMM in BY-2 cells. Cells were processed for immunofluorescence CLSM as in **Figure 1**. Shown in the three panels on the right are images corresponding to a portion of the cell at higher magnification. Solid arrowheads indicate examples of the torus-shaped fluorescent structures containing the transiently-expressed protein delineating the spherical structures attributable to endogenous CoxII. **(B)** Topological mapping of Myc-At1g55450 in differentially permeabilized BY-2 cells. Cells transiently-transformed with N-terminal Myc-tagged At1g55450 were formaldehyde fixed and permeabilized with either Triton X-100 or digitonin, and then cells were processed for immuno-epifluorescence microscopy, as described in **Figure 2**. **(C)** Insertion of At1g55450 into mitochondrial membranes *in vitro*. Isolated pea mitochondria were incubated with *in vitro* synthesized Myc-At1g55450 (lanes 1 and 2) or, for comparative purposes, Myc-TraB (lanes 3 and 4), and then resuspended (+) or not (−) in alkaline Na<sub>2</sub>CO<sub>3</sub>. Equivalent amounts of each alkaline Na<sub>2</sub>CO<sub>3</sub>- or mock-extracted sample were then subjected to SDS-PAGE and phosphoimaging. **(D)** Representative immuno-epifluorescence micrographs illustrating the localization of a C-terminal mutant or GFP fusion of At1g55450 in BY-2 cells. Each micrograph is labeled with the name of either the transiently-expressed Myc-tagged C-terminal mutant or GFP fusion protein or endogenous CoxII. The name of each construct includes the number of amino acid residues that were either deleted from the C terminus of Myc-tagged At1g55450 (−C26) or fused to the C terminus of GFP (+C26). **(E)** Representative CLSM micrographs illustrating the localization of the Cherry-At1g55450 fusion protein to mitochondria in living transgenic *A. thaliana* seedlings co-expressing the mitochondrial marker protein mito-GFP. Labels above the panels indicate the name of the co-expressed protein and labels in panels on the left indicate the seedling tissue type. Note in the top row that not all root cells expressed the Cherry-At1g55450 fusion protein. Bars in **(A,B)** and **(D,E)** = 10 μm.

exist and that these operate in a cooperative manner with targeting elements conveyed by the physicochemical and perhaps sequence-specific properties within the upstream regions, particularly within the TMD, of different proteins. This may explain why we observed at least some contradictions between the functional definition of the dibasic targeting motif in the proteins examined in this study and that reported previously for Cb5 (Hwang et al., 2004), and why some of the proteins listed in **Table 2** may target not only to mitochondria, but also to other organelles (i.e., peroxisomes or chloroplasts), or perhaps do not target to mitochondria at all. Regardless, the future study of these proteins, including those that perhaps localize to other organelles, will still serve to provide a better understanding of TA protein targeting pathways in plants in general.

Another critical aspect of plant mitochondrial-TA protein biogenesis that remains obscure is the biogenesis of TA-TOM proteins, which, based on the results of this study (**Figures 3**, **4**), appears to involve targeting information that is distinct from that of dibasic-motif-containing mitochondrial-TA proteins. Like other TA proteins that are components of the protein translocons at the ER or chloroplasts, TA-TOM proteins have been proposed to have evolved during a process whereby TA proteins, due to their relatively simple structure, inserted into membranes in early cells with minimal assistance and then over time mediated the subsequent insertion of more complex proteins (Borgese and Righi, 2010). As such, one possibility is that, similar to the scenario proposed recently for mitochondrial-TA proteins in yeast (Krumpe et al., 2012) and TA-TOC proteins (Dhanoa et al., 2010), the inherent unique lipid composition of the OMM, possibly in conjunction with cytoplasmic chaperones [or perhaps without them (Kriechbaumer et al., 2009)], might specify the targeting and insertion of TA-TOM proteins in plant cells. Alternatively, or in addition, the targeting and insertion specificity of plant TOM-TA proteins, and perhaps mitochondrial-TA proteins in plants in general, might involve a modified version of the GET pathway, which, as described in the Introduction, normally mediates ER-TA protein biogenesis (Denic, 2012). For instance, *A. thaliana* possesses three putative orthologs of the yeast GET3 protein (Abell and Mullen, 2011; Duncan et al., 2013), including one that, like its yeast and mammalian counterparts, localizes to the cytoplasm and is presumed to play a role in directing TA proteins to the ER (Abell and Mullen, 2011; Duncan et al., 2013). The other two *A. thaliana* GET3 proteins, however, localize to the chloroplast stroma (Rutschow et al., 2008; Ferro et al., 2010) or OMM (Duncan et al., 2013), where they are thought to play an analogous role in TA protein biogenesis in the respective organelle (Duncan et al., 2013). However, the observation that plants appear to lack homologs of the other protein components of the GET machinery (i.e., GET1, 2, 4, and 5) has led to the suggestion that GET3 proteins in plants, including mitochondrial GET3, mediate TA protein biogenesis in a unique manner compared to the GET pathway in yeasts and mammals (Duncan et al., 2013). Future studies are required to test this hypothesis, as well as to determine the potential role of membrane lipids and/or other sorting machinery, such as the TOM complex or otherwise [e.g., SAM (Stojanovski et al., 2007)], in the biogenesis of mitochondrial-TA proteins in plant cells.

# **AUTHOR CONTRIBUTIONS**

Naomi J. Marty, Yeen Ting Hwang, Howard J. Teresinski, and Eric A. Clendening generated plasmid DNA constructs and/or performed subcellular localization studies in BY-2 cells; Satinder K. Gidda also assisted with subcellular localization studies and generated the *A. thaliana* plants stably expressing Cherry-At1g55450 and mito-GFP; Elwira Sliwinska performed the microscopic analysis of transgenic *A. thaliana* seedlings; Daiyuan Zhang generated selected plasmid DNA constructs; Ján A. Miernyk performed the isolated mitochondria *in vitro* membrane insertion experiments; Glauber C. Brito and David W. Andrews provided TAMP and participated in the bioinformatics analysis of predicted *A. thaliana* OMM-TA proteins; and John M. Dyer and Robert T. Mullen designed the study and wrote the paper.

# **ACKNOWLEDGMENTS**

We thank D. Logan and T. Elthon for their generous gifts of transgenic *A. thaliana* seeds or antibodies, M. Johnston and E. Hoyos for their assistance with *in vitro* mitochondrial targeting experiments, and T. Nguyen and E. Anderson for maintaining BY-2 cell cultures. This study was supported by a grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) to Robert T. Mullen, an NSF award (IOS 0325656) to Ján A. Miernyk, and the United States Department of Agriculture, CRIS Project 5347-21000-012-00D, to John M. Dyer. Howard J. Teresinski was supported by an Ontario Graduate Scholarship, and financial support for Elwira Sliwinska was provided in part by NSERC [to J. D. Bewley (University of Guelph)]. Robert T. Mullen holds a University of Guelph Research Chair.

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 April 2014; accepted: 12 August 2014; published online: 04 September 2014.*

*Citation: Marty NJ, Teresinski HJ, Hwang YT, Clendening EA, Gidda SK, Sliwinska E, Zhang D, Miernyk JA, Brito GC, Andrews DW, Dyer JM and Mullen RT (2014) New insights into the targeting of a subset of tail-anchored proteins to the outer mitochondrial membrane. Front. Plant Sci. 5:426. doi: 10.3389/fpls.2014.00426*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Marty, Teresinski, Hwang, Clendening, Gidda, Sliwinska, Zhang, Miernyk, Brito, Andrews, Dyer and Mullen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Border control: selectivity of chloroplast protein import and regulation at the TOC-complex

# *Emilie Demarsy†, Ashok M. Lakshmanan† and Felix Kessler\**

Laboratory of Plant Physiology, University of Neuchâtel, Neuchâtel, Switzerland

#### *Edited by:*

Kentaro Inoue, University of California at Davis, USA

#### *Reviewed by:*

Matthew D. Smith, Wilfrid Laurier University, Canada Takehito Inaba, University of Miyazaki, Japan

#### *\*Correspondence:*

Felix Kessler, Laboratory of Plant Physiology, Université de Neuchâtel, UniMail, Rue Emile Argand 11, 2000 Neuchâtel, Switzerland e-mail: felix.kessler@unine.ch

†Emilie Demarsy and Ashok M. Lakshmanan have contributed equally to this work.

Plants have evolved complex and sophisticated molecular mechanisms to regulate their development and adapt to their surrounding environment. In particular, the development of their specific organelles, chloroplasts and other plastid types, is finely tuned in accordance with the metabolic needs of the cell. The normal development and functioning of plastids require import of particular subsets of nuclear encoded proteins. Most preproteins contain a cleavable sequence at their N terminus (transit peptide) serving as a signal for targeting to the organelle and recognition by the translocation machinery TOC–TIC (translocon of outer membrane complex–translocon of inner membrane complex) spanning the dual membrane envelope. The plastid proteome needs constant remodeling in response to developmental and environmental factors. Therefore, selective regulation of preprotein import plays a crucial role in plant development. In this review we describe the diversity of transit peptides and TOC receptor complexes, and summarize the current knowledge and potential directions for future research concerning regulation of the different Toc isoforms.

**Keywords: plastids, protein import, TOC complex, preproteins, post-translational modifications**

# **INTRODUCTION**

Eukaryotic cells are composed of multiple compartments that acquire specialized sets of proteins for function. The vast majority of proteins are encoded by the nuclear genome. After synthesis in the cytosol, accurate sorting of proteins and their export toward their destination organelles rely on intrinsic topogenic sequences (Blobel, 1980). Initially, correct recognition of a preprotein requires specific receptors at the surface of the organelle. This crucial step of intracellular trafficking control can be viewed as a key–lock type mechanism.

Plant chloroplasts import impressive quantities as well as an enormous diversity of proteins from the cytosol. Large scale proteome studies indicate that 2000–4000 different proteins follow the chloroplast route (Ferro et al., 2003; Leister, 2003; Friso et al., 2004; Kleffmann et al., 2004). In the cytosol, chloroplast proteins are generally synthesized as preproteins with an N-terminal targeting sequence that is cleaved to produce the mature chloroplast protein upon import. This N-terminal targeting sequence, named transit peptide in the context of chloroplast protein import, faithfully guides the preprotein to the chloroplast surface where it engages the import machinery. Subsequently, the preprotein is translocated across the dual envelope membranes into the stroma. The transit sequence is cleaved upon arrival in the stroma, yielding the mature form of the protein, which is then folded in the stroma, targeted to the inner membrane via the conservative sorting pathway, or transported to the thylakoid membrane system. The recognition and translocation of the preprotein at the plastid envelope are mediated by the TOC–TIC (translocon of outer membrane complex–translocon of inner membrane complex) import machinery. In pea, the core TOC complex consists of an assembly of the two GTP dependent receptors Toc34 and Toc159 together with the β-barrel protein conducting channel Toc75 (Hirsch et al., 1994; Kessler et al., 1994; Perry and Keegstra, 1994; Schnell et al., 1994; Becker et al., 2004). Upon engagement of the preprotein, the TOC complex associates with the TIC complex to form a continuous channel through the plastid envelope. The protein conducting channel at the TIC complex has been suggested to be made up of Tic110 or Tic20, or a combination of the two. Recently, however, it has been suggested that four core components form a 1 MDa TIC channel [Tic20, Tic214 (formerly known as YCF1), Tic56, and Tic100; Kikuchi et al., 2013].
Protein synthesis and targeting involve a large variety of energy-requiring cellular activities. The translocation of a single preprotein across the chloroplast envelope through the TOC–TIC machinery alone requires the hydrolysis of 650 ATP molecules on average, representing about 0.6% of the total light-saturated energy output of the organelles (Shi and Theg, 2013). Therefore, tight control of TOC–TIC mediated import activity is required to respect the cellular energy budget allocated to protein import.

Plants originate from a primary endosymbiotic event involving a photosynthetic cyanobacterium captured by a eukaryotic cell. The evolution of plants toward complex and multicellular organisms has been accompanied by the diversification of interconvertible plastid types displaying distinct and highly specialized biochemical and physiological functions (Jarvis and Lopez-Juez, 2013). For instance, the most prominent plastid type, the chloroplast, develops from a proplastid or from a partially differentiated, non-photosynthetic etioplast, and can also differentiate into other non-photosynthetic plastid types such as the chromoplast or elaioplast. Each plastid type requires the import of different subsets of proteins (Kleffmann et al., 2007; Brautigam and Weber, 2009; Barsan et al., 2012). Several strategies have evolved coordinately to ensure the selective import of plastid proteins. Together with the defined regulation of preprotein availability at the transcriptional level, evolution also triggered diversification and increased complexity of both preprotein transit sequences (von Heijne and Nishikawa, 1991; Li and Teng, 2013) and composition of the import machinery (Reumann et al., 2005; Kalanon and McFadden, 2008; Gross and Bhattacharya, 2009; Shi and Theg, 2013). Evidence for the existence of different isoforms of TOC complex components has now been reported for several higher plant species including *Arabidopsis*, pea, and tomato (Jackson-Constan and Keegstra, 2001; Chang et al., 2014; Yan et al., 2014). Each isoform is thought to preferentially import a specific subset of client preproteins, which may be the result of differential binding affinity (Jelic et al., 2003; Smith et al., 2004; Inoue et al., 2010; Dutta et al., 2014). Therefore, the relative abundance of Toc isoforms may reflect the protein composition of a given plastid type and be a key marker of plastid identity (Ling et al., 2012).

In addition, plants are sessile organisms and need to adapt to ever-changing environmental conditions. Dynamic regulation of TOC complex composition may occur at the posttranslational level and represent a key regulatory mechanism contributing to the change in protein composition. Consequently, this allows rapid modulation of plastid metabolism to ensure and drive plant development and acclimation. Thus, the relative abundance of Toc receptors may not only be a marker of plastid type but also of plastid state (Agne et al., 2010; Ling et al., 2012).

The molecular mechanisms underlying the process of protein translocation have been reviewed extensively (Jarvis, 2008; Andres et al., 2010; Li and Chiu, 2010). Here, we present the current knowledge with regard to the selectivity and the regulation of the preprotein import process at the level of the TOC complex.

#### **PREPROTEIN IMPORT IN PLASTID IS REGULATED BY DEVELOPMENTAL AND ENVIRONMENTAL FACTORS**

Years before the identification of any of the components of the chloroplast protein import machinery, Dahlin and Cline (1991) proposed that import activity is correlated with protein demands during plastid development. They observed a high import activity in non-photosynthetic proplastids, which gradually decreased as plastids matured. This phenomenon was observed for etioplast as well as chloroplast development. Interestingly, when dark-grown plants were shifted from dark to light the import activity of etioplasts was activated to accommodate the set of preproteins required for chloroplast differentiation (Dahlin and Cline, 1991). This seminal study focused on a few substrates and, given the experimental limitations at the time, was unable to provide a complete picture of plastid protein import dynamics. Recently, this topic was reinvestigated using a larger number of chloroplast precursor proteins (Teng et al., 2012). This study confirmed that preprotein specificity is modulated in synchrony with chloroplast developmental stages. Interestingly, this study demonstrated that the earlier results by Dahlin and Cline (1991) cannot be extended to all import substrates. Rather, Teng et al. (2012) refined the model and classified the substrates according to their importability in chloroplasts at different developmental stages, and consequently defined three age-selective classes: substrates that are imported more efficiently into young chloroplasts (group I), substrates that are imported similarly into developing and mature chloroplasts (group II), and substrates that are imported more efficiently into older chloroplasts (group III). Thus, it appears that regulation of chloroplast preprotein import is part of a differential age-specific regulatory network.
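The three age-selective classes can be expressed as a simple decision rule, sketched below in Python. This is only an illustration of the grouping logic described above: the function name `age_class`, the use of two import-efficiency measurements per substrate, and the `tolerance` cutoff are hypothetical choices made here and are not taken from Teng et al. (2012).

```python
def age_class(young_import: float, mature_import: float,
              tolerance: float = 0.2) -> str:
    """Assign a substrate to group I, II, or III from two import efficiencies.

    `young_import` and `mature_import` are import efficiencies (any common
    unit) measured with young and older chloroplasts, respectively.
    `tolerance` is a hypothetical cutoff on the relative difference.
    """
    if young_import <= 0 or mature_import <= 0:
        raise ValueError("import efficiencies must be positive")
    ratio = young_import / mature_import
    if ratio > 1 + tolerance:
        return "I"    # imported more efficiently into young chloroplasts
    if ratio < 1 - tolerance:
        return "III"  # imported more efficiently into older chloroplasts
    return "II"       # imported similarly at both developmental stages
```

For example, a substrate imported twice as efficiently into young chloroplasts as into older ones would fall into group I under this rule, whereas one with nearly equal efficiencies would fall into group II. Where the boundaries lie in practice depends on the experimental variation, which is why the cutoff is left as a parameter.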

*In vitro* import experiments using different isolated plastid types as well as the visualization of protein targeting using transgenic lines expressing transit peptides fused to GFP support the notion that import selectivity is regulated in a tissue specific manner (Wan et al., 1996; Primavesi et al., 2008; Yan et al., 2014). Finally, temperature stress (cold and heat) on intact pea leaves and isolated chloroplasts was found to reduce import of the small subunit of RubisCO preprotein (pSSU; Dutta et al., 2009).

In summary, these results demonstrate that both plastid import activity and selectivity are modulated in accordance with plastid type, developmental stage, and environmental condition. For this purpose plants have evolved a complex set of preprotein import components with specialized features and regulatory mechanisms (Jarvis et al., 1998; Kubis et al., 2004).

# **PREPROTEIN SELECTIVITY AT THE CHLOROPLAST IMPORT MACHINERY**

#### **OVERVIEW OF THE TOC–TIC MACHINERY**

The TOC–TIC pathway (translocon of outer membrane complex–translocon of inner membrane complex) is the major protein import pathway in higher plants (Bauer et al., 2000; Asano et al., 2004; Kovacheva et al., 2005). Most of the proteins with cleavable transit peptides that are targeted to the stroma, thylakoid membranes and lumen follow this route, which is therefore vital for plastid biogenesis (Kessler and Schnell, 2006; Bischof et al., 2011; Dutta et al., 2014). The native TOC–TIC complex in pea and *Arabidopsis* has been found to include two GTPase-receptors Toc159 and Toc34, a channel protein Toc75 and at least three additional regulatory Toc proteins Toc64, Toc22, and Toc12 (Andres et al., 2010). At the inner membrane at least 11 different proteins have been reported to be involved in the import process (Kovacs-Bogdan et al., 2010; Kikuchi et al., 2013). Electrophysiological experiments suggested that Tic110 and Tic20 could function as channels facilitating the translocation of preproteins across the inner membrane (Kikuchi et al., 2009, 2013; Kovacs-Bogdan et al., 2011). These two channels are thought to function independently and in different complexes (Kikuchi et al., 2009, 2013; Kovacs-Bogdan et al., 2011). This is supported by the finding that Tic110 interacts with preproteins and TOC complexes (Schnell et al., 1994; Inaba et al., 2005) but not with Tic20 (Kikuchi et al., 2009). Tic110 is a protein of eukaryotic origin present in various plastid-containing organisms (Shi and Theg, 2013). Its function is indispensable for plant viability and chloroplast biogenesis (Inaba et al., 2005). Based on these data it was proposed that Tic110 has an essential role in chloroplast protein import. Recently, the composition of the Tic20 complex in *Arabidopsis* has been investigated using Blue Native PAGE and mass spectrometric analyses. The results suggested that Tic20 associates with Tic56, Tic100, and Tic214 (Kikuchi et al., 2013).
Although Tic20 is of prokaryotic origin and is well conserved within the plant kingdom, Tic56, Tic100, and Tic214 appear to have specifically evolved in a limited number of higher plant species only (Kikuchi et al., 2013). Tic214, also known as YCF1, is absent from the genome of some Poaceae species (Jensen and Leister, 2014; Smith and Lee, 2014); thus, the role of the TIC20 complex as the general inner chloroplast membrane translocon in higher plants is questionable. Nevertheless, the albino, seedling lethal phenotype of null mutants of each of the TIC20 complex subunits underscores their functional importance, at least in *Arabidopsis*. In conclusion, the exact contribution of the TIC110 and TIC20 complexes to chloroplast protein import is still under debate.

At the evolutionary level, a view of growing complexity of the composition of TOC machinery is emerging (Shi and Theg, 2013). Starting with one channel protein at the outer envelope in cyanobacteria, the outer envelope protein import complex has evolved into a GTP-regulated multi-protein complex in higher plants (Olsen and Keegstra, 1992; Schnell et al., 1994; Hiltbrunner et al., 2001a; Kessler and Schnell, 2002; Voulhoux et al., 2003). The Toc receptors can form homo- and heterodimers in a dynamic way regulated by preprotein binding and GTP binding/hydrolysis activity (Smith et al., 2002; Sun et al., 2002; Wallas et al., 2003; Lee et al., 2009b; Rahim et al., 2009; Oreb et al., 2011). Although GTP binding and GTPase activity seem dispensable (expression of GTPase/dimerization-defective Toc159 and Toc33 complement the corresponding knock out mutants), it has been shown that they are required for full preprotein import efficiency *in vitro* (Agne et al., 2009; Aronsson et al., 2010; Aronsson and Jarvis, 2011). In most higher plants the Toc75 channel is encoded by a single gene (Inoue and Keegstra, 2003), but normally more than one homolog for the plastid specific GTPase families Toc159 and Toc34 exists, and thus there is the possibility of making various combinations of TOC complexes (Hiltbrunner et al., 2001a; Chang et al., 2014; Yan et al., 2014). The evolution of a translocation route depending on GTP-binding as well as other accessory proteins may be seen as the key to the developmental stage specific regulation of protein import in higher plants (Schleiff and Soll, 2005; Gagat et al., 2013).

#### **DIVERSITY AND FUNCTIONAL SPECIFICITIES OF TOC GTPase RECEPTORS**

Members of the Toc159 family are characterized by three distinct domains: an M- (membrane anchoring) domain, a G- (GTP-binding) domain, and a highly acidic, intrinsically disordered A-domain (**Figure 1**). There are four homologs in *Arabidopsis thaliana*: atToc159, −132, −120, and −90. While they share high similarity in their G- and M-domains, they differ largely in the length and sequence of their A-domains (Jackson-Constan and Keegstra, 2001; Hiltbrunner et al., 2001a). Toc34 proteins are smaller, membrane-anchored GTPases. In pea, only one member has been detected so far while two isoforms of Toc34 (atToc34 and atToc33) have been identified in *Arabidopsis* (Jarvis et al., 1998; Gutensohn et al., 2000).

Genetic and biochemical studies have supported the idea that various combinations of the different Toc GTPase isoforms lead to a diversity of complexes displaying differential selectivity for preprotein recognition and translocation (Kubis et al., 2003, 2004; Constan et al., 2004; Ivanova et al., 2004). Co-immunoprecipitation experiments performed by Ivanova and collaborators demonstrated that atToc159 preferentially associates with atToc33, while atToc120 and/or atToc132 preferentially form a complex together with atToc34 (Ivanova et al., 2004). Interestingly, the *toc34 (ppi3)* knock out mutant has no visible defect, while the *toc33 (ppi1)* mutant displays a pale green phenotype with a chloroplast biogenesis defect similar to (although much less severe than) the *toc159* mutant phenotype (*ppi2*), supporting the proposition that these latter two receptor isoforms function in the same complex and preprotein import pathway (Jarvis et al., 1998; Bauer et al., 2000; Kubis et al., 2003, 2004; Constan et al., 2004).
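The combinatorial aspect of these receptor pairings can be made explicit with a short enumeration. The Python sketch below simply lists every possible pairing of the four *Arabidopsis* Toc159-family members with the two Toc34-family members and flags the preferential associations reported by Ivanova et al. (2004) as summarized above. The names `TOC159_FAMILY`, `TOC34_FAMILY`, `PREFERRED`, and `toc_combinations` are illustrative, and an unflagged pairing only means that no preference is stated in this text, not that the complex cannot form.

```python
from itertools import product

# Arabidopsis isoforms named in the text
TOC159_FAMILY = ["atToc159", "atToc132", "atToc120", "atToc90"]
TOC34_FAMILY = ["atToc33", "atToc34"]

# Preferential associations reported by co-immunoprecipitation
# (Ivanova et al., 2004), as summarized in the text
PREFERRED = {
    ("atToc159", "atToc33"),
    ("atToc132", "atToc34"),
    ("atToc120", "atToc34"),
}


def toc_combinations():
    """Return (Toc159-type, Toc34-type, preferred?) for every possible pairing."""
    return [
        (large, small, (large, small) in PREFERRED)
        for large, small in product(TOC159_FAMILY, TOC34_FAMILY)
    ]


if __name__ == "__main__":
    for large, small, preferred in toc_combinations():
        note = "preferred" if preferred else "no stated preference"
        print(f"{large} + {small}: {note}")
```

Four large and two small receptors give eight possible pairings, of which three are flagged as preferential here; this is the sense in which isoform identity, rather than the mere presence of a receptor, could shape the selectivity of a given TOC complex.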

Several lines of evidence indicate a potential functional overlap of the two Toc34 members: the strong sequence similarity (65%; Jarvis et al., 1998); the fact that a minor fraction of atToc33 was co-immunoprecipitated with Toc120/132, and atToc34 was detected with atToc159 (Ivanova et al., 2004); the embryo lethal phenotype of *toc33/toc34* double mutants; and, most importantly, the ability of atToc34 to complement the *ppi1* phenotype (Jarvis et al., 1998; Kubis et al., 2003; Constan et al., 2004). Transgenic complementation studies also indicated the potential functional overlap of atToc120 and atToc132 (Ivanova et al., 2004; Kubis et al., 2004) and, to a limited extent, of atToc159 and atToc90 (Infanger et al., 2011); however, no functional overlap exists between these two subgroups [atToc120/132 vs. atToc159/atToc90 (Ivanova et al., 2004; Kubis et al., 2004)]. While the two Toc34 homologs are mutually exchangeable, the same is only partially true for the Toc159 homologs, suggesting that preprotein selectivity of TOC complexes is mostly conferred by the identity of the Toc159 isoforms.

The classification of the client proteins of each isoform has been attempted. Because of the albino phenotype of *ppi2*, it has been proposed that Toc159 primarily facilitates the import of photosynthesis-associated preproteins. On the other hand, Toc132 or Toc120, being present predominantly in roots, could facilitate that of constitutive (housekeeping) preproteins (Kubis et al., 2003, 2004; Ivanova et al., 2004; Smith et al., 2004; Inoue et al., 2010). *In vitro* import assays using a selection of substrates support this model (Smith et al., 2004; Inoue et al., 2010). However, the albino phenotype of the *ppi2* mutant was shown to result not only from a defect in the import of a set of chloroplast proteins, but also from the transcriptional downregulation of a specific set of nuclear genes associated with photosynthesis (Bauer et al., 2000; Kakizaki et al., 2009). This effect is commonly referred to as retrograde signaling, and pleiotropically affects albino and pale green mutants across the board. The interference of retrograde signaling with preprotein import in *ppi* mutants has blurred the identification of the specific substrates of each of the receptor isoforms. Comparative analysis of the *ppi2* mutant proteome and transcriptome demonstrated that certain photosynthesis-associated proteins accumulated normally in plastids even in the absence of atToc159, whereas accumulation of some house-keeping proteins was strongly diminished despite their mRNA expression levels being similar to the wild type (Bischof et al., 2011). Furthermore, the results of a yeast two hybrid screen used to identify the preferred Toc receptor of a variety of preproteins supported the findings of Bischof et al. (2011) (Dutta et al., 2014). Together these studies affirmed that Toc GTPases, especially the Toc159 homologs, confer specificity to plastid preprotein import. However, specificity is not likely to be based on the photosynthetic or housekeeping nature of a preprotein.
This is a move away from the overly simplistic paradigm of "photosynthesis-associated" and "house-keeping" specificities toward a more differentiated model that reflects complex and varying plastid preprotein requirements during development and under environmental influence. Therefore, Toc client protein classification will need to be rethought along these lines. One hypothesis is that the combination of preprotein specificities of plastid resident Toc receptors reflects the tissue or cell specific preprotein accumulation patterns that are specific to a particular plastid type.

As mentioned above, Toc159 homologs diverge the most at their A-domains, suggesting a key role for this domain in their functional specialization. In domain swapping experiments, Inoue et al. (2010) replaced the A-domain of atToc132 by that of atToc159. Expression of this construct partially restored chlorophyll accumulation in the *toc159* null mutant (*ppi2*), while no complementation was observed using a construct encoding atToc132 without an A-domain. These data elegantly demonstrated that the functional specialization relies at least partially on intrinsic properties of the A-domain (Inoue et al., 2010). In agreement with this, it was observed that removal of the A-domains of atToc159 and atToc132 reduced the binding selectivity of these isoforms (Smith et al., 2004; Inoue et al., 2010; Dutta et al., 2014). Apparently, the A-domain does not directly interact with preproteins but may act as a filter, enhancing the affinity for subsets of proteins and reducing the affinity for others (Dutta et al., 2014). Preprotein binding to Toc159 has been shown earlier to occur at the G-domain (Smith et al., 2004). Thus it seems likely that the A-domain influences the G-domain by, for instance, positively or negatively modulating access of a preprotein according to its nature. Finally, the lack of complementation of *ppi2* by atToc132 lacking an A-domain (Inoue et al., 2010), as well as the recent work of Smith and colleagues using a yeast two-hybrid system to study the affinity of preproteins for the Toc159 receptor isoforms (Dutta et al., 2014), indicates that a degree of specificity is conferred by the G-domain itself.

#### **DIVERSITY AND COMPLEXITY OF THE TRANSIT PEPTIDES**

Inherently, recognizable specificity features would need to be encoded in the plastid transit peptides. One general consideration regarding the transit peptides is that no consensus can be defined, even when considering the structure at the three-dimensional level (von Heijne and Nishikawa, 1991; Bruce, 2001). Plastid transit peptides vary widely in length, with an average of about 50 amino acids and lengths of up to 146 amino acids (Li and Teng, 2013). Some features are shared with mitochondrial targeting peptides, such as the overrepresentation of serine and threonine residues, which may explain the targeting of plastid transit peptide-containing proteins to mitochondria when expressed in heterologous animal systems (Zhang and Glaser, 2002). No further similarities between plastid and mitochondrial targeting sequences have been identified, and other levels of specificity might exist that enable plant cells to discriminate and accurately sort the two types of organellar proteins. Interestingly, an estimated thirty percent of chloroplast-localized proteins do not have a canonical transit peptide (Ferro et al., 2003; Leister, 2003; Kleffmann et al., 2004, 2007; Jarvis, 2008). A recent study in pea indicated that this may be an overestimation resulting from a slightly inaccurate algorithm that does not take into account the whole diversity of features of plastid transit peptides (Chang et al., 2014).

The diversity of transit peptide sequences might well be explained by the need to fine-tune the import of specific subsets of proteins in agreement with plastid type and developmental stage. Toc159 binds preproteins via their N-terminal transit peptides (Smith et al., 2004), so one might reasonably expect that the specificity determinants reside within this particular region. However, the determining sequence elements that confer selectivity to a Toc159 isoform have not yet been identified. They could consist of cryptic signals buried in motifs and multiple motifs (Lee et al., 2009a; Bionda et al., 2010; Chotewutmontri et al., 2012). For example, Lee et al. (2009a) revealed that Toc159-dependent import can be mediated by multiple independent motifs, one that consists of a stretch of serine residues located in the first 12 amino acids of the N-terminal region of preRBCS (pSSU), and one located in the C-terminal part of the transit peptide sequence. In a recent review, Li and Teng (2013) analyzed such motifs and their relation with binding sites for various proteins involved in preprotein import. The authors then assigned the preproteins to distinct subgroups based on patterns of sequence motifs in combination with their capacity to be targeted to and bind the protein translocon at the chloroplast outer envelope. Though only a limited number of preproteins were taken into account in these analyses, they clearly indicated that the complexity of transit peptide design plays a key role in import selectivity.
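The two sequence features discussed above, overall serine/threonine enrichment and a serine-rich stretch within the first 12 residues, can be quantified in a few lines. The following is only a minimal sketch: the function names and the toy sequence are hypothetical illustrations, not a real transit peptide and not the method used in any of the cited studies.

```python
def hydroxylated_fraction(peptide: str) -> float:
    """Return the fraction of residues that are serine (S) or threonine (T)."""
    peptide = peptide.upper()
    if not peptide:
        return 0.0
    return sum(residue in "ST" for residue in peptide) / len(peptide)


def serine_count_in_window(peptide: str, window: int = 12) -> int:
    """Count serine residues within the first `window` residues."""
    return peptide.upper()[:window].count("S")


# Made-up toy sequence for illustration only (assumption, not real data).
toy_peptide = "MASSSSLSTAAVNRAGLKSSTPFNGLKS"

if __name__ == "__main__":
    print(f"S/T fraction: {hydroxylated_fraction(toy_peptide):.2f}")
    print(f"Serines in first 12 residues: {serine_count_in_window(toy_peptide)}")
```

Such simple composition measures cannot by themselves predict receptor preference; they only make the descriptive features named in the text concrete.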

#### **REGULATION OF TOC COMPONENTS**

#### **EXPRESSION PATTERN**

Regulation of TOC complex activity occurs at several levels. Overall, the accumulation levels of Toc components throughout development appear to reflect the total import activity, i.e., the highest level of expression for the different components is observed in young, developing tissue, as compared to mature organs (Jarvis et al., 1998; Yu and Li, 2001; Kubis et al., 2003, 2004; Ivanova et al., 2004). As an exception, Toc90 appeared to be uniformly expressed throughout development (Kubis et al., 2003; Infanger et al., 2011). Specific patterns were revealed when comparing the expression levels of the different Toc receptor isoforms in different organs and/or different plastid types, and usually correlated with corresponding mutant phenotypes in *Arabidopsis* (Jarvis et al., 1998; Bauer et al., 2000; Gutensohn et al., 2000; Kubis et al., 2004; Yan et al., 2014). atToc159 and atToc33 are the most highly expressed members of their respective families, and both mutants displayed the most severe visible phenotype when compared to other single mutants (Jarvis et al., 1998; Kubis et al., 2004). Furthermore, defects of plastid development in the corresponding mutants follow the expression pattern of the corresponding gene: highly regulated expression is observed for atToc159 and atToc33, with a higher expression occurring in photosynthetic tissues, when compared to other family members. Accordingly, single mutants of these genes are specifically affected in the plastid types present in those tissues, i.e., the chloroplast and its precursor, the etioplast (Jarvis et al., 1998; Bauer et al., 2000). By the same token, the higher expression of atToc120 and atToc132 in roots correlates with a severe defect of root plastid development in the corresponding double mutant (Kubis et al., 2004). Similarly, the mutant of atToc34, which is expressed more highly in roots, retains normal plastid development but displays reduced root length (Gutensohn et al., 2000; Constan et al., 2004). Thus, selectivity of import into plastids can be modulated at least in part by transcriptional regulation of Toc components in accordance with plant tissue and/or growth conditions (light conditions in the case of Toc159).

Expression profiles of the different Toc members suggest that the receptors acting together in a specific complex are co-regulated at the transcriptional level. Interestingly, hierarchical cluster analysis indicates that this co-regulation extends to a large variety of conditions (**Figure 2**) and suggests that common *cis* and *trans* regulatory elements could regulate associated Toc receptors. In support of this idea, the CIA2 transcription factor was found to co-modulate atToc33 and atToc75 expression specifically in leaves (Sun et al., 2001, 2009). However, the identity of other transcription factors responsible for the differential expression of Toc members has been poorly investigated so far, and further experimentation will be necessary to reveal the molecular mechanisms underlying the regulation of Toc gene expression.
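Co-regulation of this kind is typically assessed by comparing expression profiles across conditions, and a correlation-based similarity is a common first step before hierarchical clustering. The sketch below computes pairwise Pearson correlations for toy profiles. It is an illustration under stated assumptions: the receptor labels and expression values are invented, and this is not the procedure implemented in the Genevestigator tool.

```python
from itertools import combinations
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


# Invented expression values across four conditions (assumption, not real data).
toy_profiles = {
    "receptorA": [8.0, 6.0, 4.0, 2.0],
    "receptorB": [7.5, 6.1, 3.9, 2.5],
    "receptorC": [2.0, 3.0, 5.0, 7.0],
}

if __name__ == "__main__":
    for name_1, name_2 in combinations(toy_profiles, 2):
        r = pearson(toy_profiles[name_1], toy_profiles[name_2])
        print(f"{name_1} vs {name_2}: r = {r:.2f}")
```

In this toy example, profiles that rise and fall together yield a coefficient close to 1 and would be grouped first by a clustering step, whereas opposing profiles yield a negative coefficient.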

#### **POST-TRANSLATIONAL MODIFICATIONS**

Differential regulation of Toc components also occurs at the post-translational level (**Figure 3**). It is interesting to note that the *ppi2* mutant can be complemented by expression of atToc159 under the constitutive 35S promoter, indicating that transcriptional regulation can be bypassed, at least under laboratory conditions (Kubis et al., 2004; Agne et al., 2009).

**FIGURE 2 | [...] expression.** TOC complexes consist of assemblies of two different receptors from two separate GTPase families, Toc159/-132/-120/-90 and Toc33/-34, respectively, together with the Toc75 channel. Biochemical and genetic evidence has shown that atToc159 preferentially associates with [...]. These specific associations are reflected by co-regulation of the Toc receptor isoforms. Data were extracted from the Genevestigator database (Nebion), using the Hierarchical Cluster analysis tool, with "Development" or "Anatomy" specific selections for the left- and right-hand panels, respectively.

#### *Phosphorylation*

Several studies have shown that Toc receptors are phosphorylated. Phosphorylation has been reported for pea Toc34 and its ortholog atToc33 (Ser113 and S181, respectively), while it was not detected for atToc34 (Sveshnikova et al., 2000; Fulgosi and Soll, 2002; Jelic et al., 2002, 2003). Differential phosphorylation could therefore represent a regulatory mechanism conferring specificity to the two different members of the *Arabidopsis* Toc34 family.

*In vitro* studies indicated that phosphorylation has a negative effect on GTP and preprotein binding to psToc34 and atToc33 (Sveshnikova et al., 2000; Jelic et al., 2003). Furthermore, *in vitro* and *in vivo* data showed that phosphorylation/phosphomimicking at atToc33 and phosphorylation of psToc34 negatively influenced TOC complex integrity (Oreb et al., 2008). Hypotheses for the underlying molecular mechanisms have been put forward. Since GTPase activity may be required for G-domain-mediated association of Toc159 and Toc34 (Smith et al., 2002; Wallas et al., 2003), phosphorylation may indirectly prevent homo- as well as heterodimerization because of a negative effect on GTP binding. More directly, the bulky, negatively charged phosphate group could inhibit the binding to a preprotein or to Toc159. However, this latter hypothesis may be valid for *Arabidopsis* but not for pea, since the phosphorylation site in pea is distant from the dimerization interface (Oreb et al., 2008). In summary, the available data suggest that the phosphorylation of psToc34 and atToc33 has a dual function, regulating both TOC complex assembly and subsequent substrate binding.

The physiological relevance of this specific phosphorylation and the signals triggering it are still not clearly defined. Data obtained from *Arabidopsis* transgenic lines expressing phosphomimicking variants of atToc33 confirmed that phosphorylation at S181 can inhibit atToc33 activity in young *Arabidopsis* seedlings but not later during development (Aronsson et al., 2006; Oreb et al., 2007). Indeed, phosphomimicking variants resemble the *ppi1* mutant regarding a number of phenotypic traits in 5-day-old *Arabidopsis* seedlings (chlorophyll accumulation, chloroplast ultrastructure, and photosynthetic activity). However, since the non-phosphorylatable version behaved similarly to the WT, it was not possible to determine the conditions under which atToc33 is phosphorylated *in planta* (Aronsson et al., 2006; Oreb et al., 2007). We speculate that phosphorylation might represent a means to quickly down-regulate preprotein import *via* atToc33-containing TOC complexes, for example in mature plastids where protein demand is low. Moreover, since atToc33 but not atToc34 can be phosphorylated, this post-translational regulation may affect the selectivity aspect of preprotein import regulation.

One additional phosphorylation site has been experimentally identified in both atToc33 and -34 [data provided by PhosPhAt (Durek et al., 2010)]. It maps to a conserved tyrosine residue of the G-domain. Additional studies will be required to validate this specific phosphorylation and determine its regulatory effect.

Finally, the identity of the Toc33/Toc34 kinase(s) remains unknown. Some clues stemming from pea suggest that psToc34 is phosphorylated by an ATP-dependent, 98 kDa kinase residing at the outer envelope membrane (Fulgosi and Soll, 2002). However, the amino acid sequence information is not sufficient to molecularly identify the potential kinase in pea or its homolog in *Arabidopsis*.

The Toc159 receptors are also targets of phosphorylation. The first evidence of phosphorylation of Toc159 came from *in vitro* studies using outer envelopes isolated from pea chloroplasts, showing that both full-length Toc159 and its natural 86 kDa fragment could be phosphorylated (Fulgosi and Soll, 2002). Phosphorylation was demonstrated for the G-domain of psToc159, reminiscent of Toc33/34 regulation (Oreb et al., 2008); however, neither the precise site nor the regulatory function was further investigated. Large-scale phosphoproteomics projects revealed that Toc159 members in *Arabidopsis* are highly phosphorylated at the acidic A-domain (Agne et al., 2010; Durek et al., 2010). In total, 43 sites have been mapped in atToc159, while far fewer were detected in the other three members. These lower numbers may be due to the shorter length of the atToc132 and atToc120 A-domains, to the absence of such a domain in atToc90, or to lower protein accumulation levels when compared to atToc159, which limit detection by mass spectrometry. Nevertheless, the identified phosphorylation sites do not map to matching positions in the different homologs, which confers an additional degree of divergence to the A-domain.

The functional relevance of A-domain phosphorylation has been poorly documented so far. The dispensable nature of the A-domain suggests that phosphorylation either plays a minor role altogether, or possibly an important regulatory role under specific conditions (Hiltbrunner et al., 2001b; Agne et al., 2009; Inoue et al., 2010). The A-domain behaves as an intrinsically disordered protein, which is often linked to multiple and transient protein–protein interactions (Richardson et al., 2009). Therefore phosphorylation of this domain could modulate interactions of Toc159 with other Toc components but also with specific sets of client preproteins. In addition, a selective autoinhibitory function of the A-domain under specific conditions may be envisaged that may be alleviated by phosphorylation or proteolytic removal.

Recently, a link between ABA signaling and phosphorylation of Toc159 family members in *Arabidopsis* has been established (Wang et al., 2013). Upon ABA treatment, atToc159 was phosphorylated at Thr692. Accumulation of atToc120 and atToc132 phosphopeptides was also enhanced by ABA treatment. These data, together with the fact that a mutant deficient in ABA synthesis is affected in preprotein import and early plant development, suggest a close link between ABA signaling and chloroplast protein import regulation via Toc159 A-domain phosphorylation (Zhong et al., 2010). Whether ABA-dependent phosphorylation plays a role in preprotein recognition, impacts Toc159-containing TOC complex assembly, or acts at the level of the translocation process will be interesting questions to be addressed in the future.

Several classes of kinases may mediate phosphorylation of Toc159 homologs. Motif analysis suggests that a large fraction of atToc159 phosphorylation sites represent potential cytosolic casein kinase 2 (CK2) targets, and this was validated biochemically by *in vitro* phosphorylation experiments (Agne et al., 2010). Recently it has been shown that ABA-dependent phosphorylation of atToc159 at Thr692 was decreased in a triple mutant, *snrk2.2/2.3/2.6*, that is nearly insensitive to ABA treatment (Wang et al., 2013). In addition, SnRK2.6 phosphorylated recombinant atToc159 *in vitro*. Thus SnRK2.6 represents a potential kinase of atToc159 at Thr692. In contrast, atToc120 and atToc132 phosphorylation upon ABA treatment was detected only in the triple mutant *snrk2.2/2.3/2.6*, indicating the involvement of another ABA-regulated kinase. Indeed, ABA signaling is mediated by multiple kinases of the SnRK family but also of the MAP kinase family (Danquah et al., 2014). The phosphorylation status of Toc159 members could therefore be regulated antagonistically by ABA signaling via the action of different classes of kinases and could represent a way to switch between Toc132/Toc120- and Toc159-specific import depending on environmental as well as developmental conditions and the consequent plastid preprotein requirements. Finally, it has been proposed that psToc159 is a target of a 70 kDa kinase located at the outer envelope of the pea chloroplast (Fulgosi and Soll, 2002), but so far no study has reported the identification of a putative homolog in *Arabidopsis*.

In conclusion, phosphorylation of the Toc159 and Toc34 receptors potentially regulates protein import at different levels: it may impact the import rate by regulating the affinity toward client preproteins, or affect the composition of the TOC complex by modulating the interaction between Toc receptors and consequently change the selectivity of plastid protein import. The involvement of ABA signaling in this regulation indicates that phosphorylation of Toc components can modulate the import activity in response to developmental signals, for example during germination or subsequent post-germinative processes, or in response to abiotic stresses that require the tuning of the plastid proteome. Hormonal control of plastid development has been frequently reported, but the effects on import activity are still poorly documented.

Phosphorylation could also be part of a signaling cascade enabling subsequent additional post-translational modifications (PTM), since cross talk between different modifications is a common phenomenon in eukaryotic systems, and PTM other than phosphorylation have been described for the different Toc components (see below). The existence of numerous phosphorylation sites, especially in the Toc159 family, suggests the participation of multiple kinases and corresponding signaling pathways, probably acting in a network.

#### *Post-translational modifications other than phosphorylation*

Toc159 was first identified as an 86 kDa protein lacking the A-domain (Hirsch et al., 1994; Kessler et al., 1994; Schnell et al., 1994; Bolter et al., 1998). It is not clear whether proteolysis occurs only during chloroplast preparation or whether it is part of a regulatory system acting on Toc159. Nor is it clear whether other Toc159 homologs are also substrates of proteolytic cleavage, but the relative stability of the A-domain fragment of atToc159 favors controlled proteolysis (Agne et al., 2010). Therefore, an as yet unknown protease may process Toc159 conditionally, leading to the removal of the A-domain and consequently altering the import selectivity. Interplay between phosphorylation and cleavage has been demonstrated in other biological systems, for example in the context of apoptosis (Dix et al., 2012). Investigation of the cross talk between these two PTM will certainly be an interesting aspect for future research.

Abundance of the different Toc members varies developmentally. Currently an important question is to understand how the TOC machinery is remodeled upon plastid development and plastid inter-conversion. As discussed above, transcriptional regulation plays a role in modulating the expression of Toc components depending on plant tissues and environmental conditions, while PTM may participate in the regulation of TOC complex assembly and activity. Recently, a genetic study complemented by biochemical analyses revealed that Toc receptors as well as the Toc75 channel could be modified by ubiquitylation. Ubiquitylation required SP1, a chloroplast outer membrane-localized E3 ubiquitin ligase (Ling et al., 2012). Enhanced accumulation of TOC proteins in the *sp1* genetic background suggested that SP1 indeed participates in ubiquitin-proteasome system (UPS)-mediated degradation of Toc components. Phenotypic analyses indicated that this regulatory mechanism may play a role during plastid inter-conversion. However, how SP1 is regulated and how it functions selectively on the different Toc receptors has not been addressed so far. Again, a possible interplay with phosphorylation might be envisaged, as phosphorylation can serve as either a positive or a negative signal for ubiquitylation (Hunter, 2007).

#### **CONCLUDING REMARKS**

Acquisition of the capacity to target proteins to different compartments has enabled eukaryotic cells to maintain and control the development of organelles. In higher plants the evolution of the TOC–TIC machinery has been a key mechanism enabling developmental processes. The evolutionary diversification of Toc receptors and transit peptides likely led to the tissue- and plastid type-dependent preprotein selectivity of the import process. It is now well accepted that preprotein import into plastids plays a central role in the maintenance of cellular homeostasis, controlling the development and differentiation of this organelle. In a more indirect way, preprotein import also exerts control of nuclear gene expression via retrograde signaling to the nucleus. The composition and mode of action of the import machinery have been studied extensively in the past years, and now progress needs to be made toward the understanding of the regulatory mechanisms controlling the assembly and the activity of the complex. Regulation is important not only for correct sorting of preproteins, but also to limit the energy expenditure associated with this costly process. Multiple types of PTM of Toc receptors have been discovered; however, their functional significance largely remains in the dark. Identification of the regulatory factors and signaling pathways, as well as unraveling the biological relevance of the various PTM at the import machinery, will provide new insight into how plants control development and adapt to the environment.

#### **ACKNOWLEDGMENTS**

This work was supported by UniNE, SNF31003A\_127380 and SNF31003A\_144156, and Marie Heim-Vögtlin PMPDP3\_151301.

#### **REFERENCES**

Agne, B., Andres, C., Montandon, C., Christ, B., Ertan, A., Jung, F., et al. (2010). The acidic A-domain of *Arabidopsis* TOC159 occurs as a hyperphosphorylated protein. *Plant Physiol.* 153, 1016–1030. doi: 10.1104/pp.110.158048


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 July 2014; accepted: 01 September 2014; published online: 17 September 2014.*

*Citation: Demarsy E, Lakshmanan AM and Kessler F (2014) Border control: selectivity of chloroplast protein import and regulation at the TOC-complex. Front. Plant Sci. 5:483. doi: 10.3389/fpls.2014.00483*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Demarsy, Lakshmanan and Kessler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Targeting and assembly of components of the TOC protein import complex at the chloroplast outer envelope membrane

#### *Lynn G. L. Richardson<sup>1†</sup>, Yamuna D. Paila<sup>1†</sup>, Steven R. Siman<sup>2</sup>, Yi Chen<sup>2</sup>, Matthew D. Smith<sup>2</sup> and Danny J. Schnell<sup>1</sup>\**

*<sup>1</sup> Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, MA, USA*

*<sup>2</sup> Department of Biology, Wilfrid Laurier University, Waterloo, ON, Canada*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Inhwan Hwang, Pohang University of Science and Technology, South Korea*

*Steven Theg, UC-Davis, USA*

#### *\*Correspondence:*

*Danny J. Schnell, Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, Life Sciences Laboratories, Room N431, 240 Thatcher Way, Amherst, MA 01003-9364, USA e-mail: dschnell@biochem.umass.edu*

The translocon at the outer envelope membrane of chloroplasts (TOC) initiates the import of thousands of nuclear encoded preproteins required for chloroplast biogenesis and function. The multimeric TOC complex contains two GTP-regulated receptors, Toc34 and Toc159, which recognize the transit peptides of preproteins and initiate protein import through a β-barrel membrane channel, Toc75. Different isoforms of Toc34 and Toc159 assemble with Toc75 to form structurally and functionally diverse translocons, and the appropriate composition and levels of TOC translocons are required for the import of specific subsets of coordinately expressed proteins during plant growth and development. Consequently, the proper assembly of the TOC complexes is key to ensuring organelle homeostasis. This review will focus on our current knowledge of the targeting and assembly of TOC components to form functional translocons at the outer membrane. Our analyses reveal that the targeting of TOC components involves elements common to the targeting of other outer membrane proteins, but also includes unique features that appear to have evolved to specifically facilitate assembly of the import apparatus.

**Keywords: chloroplast, outer envelope membrane, protein targeting, translocon, TOC assembly, protein import**

*†These authors have contributed equally to this work.*

# **INTRODUCTION**

The plastids constitute a diverse array of organelles, which play central roles in plant growth, development, and defense by providing a remarkable range of metabolic and physiological capabilities in different cell and tissue types (Lopez-Juez and Pyke, 2005; Rolland et al., 2012; Jarvis and Lopez-Juez, 2013). The differentiation and maintenance of plastids rely on a complex interplay between plastid and nuclear genomes, requiring coordinate changes in the expression of nucleus- and plastid-encoded genes (Jarvis and Lopez-Juez, 2013). These events trigger modest changes in subsets of plastid proteins, such as the increase in chaperone expression in response to abiotic stress (Taylor et al., 2009), or result in near complete remodeling of the organelle proteome as occurs when non-green etioplasts undergo extensive biochemical and morphological changes to form chloroplasts during photomorphogenesis (Kami et al., 2010). Although these biogenetic events are initiated at the level of transcription, they are ultimately reliant upon the selective import of subsets of several thousand nucleus-encoded proteins into the organelles after synthesis in the cytoplasm (Li and Chiu, 2010; Jarvis and Lopez-Juez, 2013; Shi and Theg, 2013).

A multimeric complex in the plastid outer envelope membrane, referred to as TOC (translocon at the outer membrane of chloroplasts), recognizes the majority of plastid-destined proteins at the organelle surface (Kessler and Schnell, 2009; Chang et al., 2012). TOC components bind the N-terminal transit peptides of newly synthesized preproteins and function in coordination with a second complex at the inner envelope membrane, referred to as TIC (translocon at the inner membrane of chloroplasts), to provide direct transport of preproteins from the cytoplasm to the stroma (Li and Chiu, 2010; Jarvis and Lopez-Juez, 2013). Plants express multiple isoforms of differentially expressed TOC complexes, each of which appears to preferentially mediate the import of subsets of proteins (Jarvis et al., 1998; Bauer et al., 2000; Kubis et al., 2003; Ivanova et al., 2004; Kessler and Schnell, 2009; Inoue et al., 2010; Bischof et al., 2011; Infanger et al., 2011). Both the levels and variety of different TOC complexes appear to be critical for maintaining organelle homeostasis during developmental and physiological changes (Jarvis et al., 1998; Bauer et al., 2000; Kubis et al., 2003; Ivanova et al., 2004; Kessler and Schnell, 2009; Inoue et al., 2010).

The dynamic role of TOC complexes in the recognition and discrimination of plastid-destined preproteins places the TOC machinery at a key hub in protein import and plastid biogenesis. The proper targeting and dynamic assembly of translocon components are key to ensuring that the capacity for protein import and the complement of functionally distinct translocons adjust in coordination with changes in gene expression. This review will focus on examining our current knowledge of the targeting and assembly of TOC components to form functional translocons at the outer membrane. Specifically, we will address aspects of TOC component targeting that conform to general principles of outer membrane targeting, as well as those features that are unique to TOC components and might have evolved specifically to facilitate assembly of the translocons. Previous reviews have included discussions of the mechanism of targeting of individual TOC components to the outer membrane (Li and Chiu, 2010; Kim and Hwang, 2013; Shi and Theg, 2013). We will include these studies, but with the intent of developing models for how the targeting of individual TOC components is coupled to the assembly of functionally diverse translocons that are key contributors to plastid biogenesis.

#### **OVERVIEW OF TOC FUNCTION**

The TOC machinery consists of a core complex containing two related GTP-dependent preprotein receptors, Toc34 and Toc159, which stably interact with a membrane channel, Toc75 (**Figure 1A**). Toc75, Toc34, and Toc159 are integral membrane proteins that form complexes in the outer membrane with a minimal size of 800 kDa and a stoichiometry estimated at 4:4:1 or 3:3:1 (Toc75:Toc34:Toc159) (Schleiff et al., 2003; Kikuchi et al., 2006; Chen and Li, 2007). Toc34 and Toc159 bind to the transit peptides of newly synthesized preproteins at the chloroplast surface via their GTPase domains (G-domains) and initiate translocation across the outer membrane by transferring preproteins to Toc75 through a series of intermolecular events controlled by their intrinsic GTPase activities (**Figure 1B**) (Kessler and Schnell, 2002; Li et al., 2007; Chang et al., 2012; Lee et al., 2013). Genetic and biochemical data indicate that transit peptide binding at the receptors regulates both homo- and heterodimerization between their cytoplasmic GTPase-domains, which in turn controls nucleotide exchange, hydrolysis and the initiation of preprotein translocation (Bauer et al., 2002; Smith et al., 2002; Jelic et al., 2003; Weibel et al., 2003; Becker et al., 2004; Yeh et al., 2007; Koenig et al., 2008a,b; Lee et al., 2009; Rahim et al., 2009; Oreb et al., 2011). Biochemical studies using pea chloroplasts and genetic studies in Arabidopsis demonstrate an essential role for Toc75 in plastid protein import (Perry and Keegstra, 1994; Schnell et al., 1994; Tranel et al., 1995; Ma et al., 1996; Baldwin et al., 2005; Hust and Gutensohn, 2006). Furthermore, structural and electrophysiological studies on reconstituted Toc75 demonstrate that the protein forms a cation-selective β-barrel channel, which is regulated by specific interactions with nuclear encoded preproteins (Hinnah et al., 1997, 2002).

The TOC GTPase receptors are encoded by multi-gene families in vascular plants, and different Toc159 and Toc34 family members assemble in combination with Toc75 to form structurally and functionally distinct core complexes (Jarvis et al., 1998; Kubis et al., 2003, 2004; Ivanova et al., 2004; Kessler and Schnell, 2009; Inoue et al., 2010; Bischof et al., 2011; Infanger et al., 2011; Yan et al., 2014). In Arabidopsis, the Toc159 family includes atToc132, atToc120 and atToc90, in addition to atToc159; and the Toc34 family includes atToc34 and atToc33. Biochemical studies demonstrate that different Toc GTPase receptors confer distinct preprotein selectivities on the core TOC complex (Jelic et al., 2003; Becker et al., 2004; Smith et al., 2004; Inoue et al., 2010).

**Figure 1** (caption, partially recovered). **(A)** [...] Toc75, in a stoichiometry estimated at 4:4:1 or 3:3:1 (Toc75:Toc34:Toc159). Toc159 has a tripartite structure, consisting of cytoplasmically exposed acidic (A) and GTPase (G) domains and a C-terminal membrane anchor (M) domain. Toc34 contains a cytoplasmic GTPase (G) domain and a single α-helical transmembrane domain (TMD). **(B)** [...] preproteins to the G-domains of Toc159 and Toc34 (step 1). Binding of the transit peptide at the G-domains of the receptors triggers changes in receptor dimerization (step 2), which allow for GDP/GTP exchange (step 3). GTP hydrolysis results in the transfer of the preprotein into the Toc75 channel and the initiation of membrane translocation (step 4).

In addition, genetic studies coupled with transcriptomic and proteomic analyses suggest that the substrate preferences of different TOC complexes correspond to specific sets of chloroplast proteins whose expression is coordinately regulated in response to physiological or developmental changes (Jarvis et al., 1998; Kubis et al., 2003, 2004; Ivanova et al., 2004; Inoue et al., 2010; Bischof et al., 2011; Infanger et al., 2011). Collectively, these observations have led to the hypothesis that TOC complexes with different but overlapping substrate specificities are required to ensure balanced and efficient import of sets of coordinately expressed proteins during plastid biogenesis, differentiation and/or in response to stress.

# **TARGETING AND INTEGRATION OF Toc75**

Toc75 is a common element of all TOC complexes and it functions not only in general protein import, but also in the targeting and insertion of the TOC GTPases (**Table 1**). As such, it plays a central role in both the assembly and function of the translocon. Toc75 is encoded by a single gene in all plants examined (Inoue and Keegstra, 2003), and it belongs to the OMP85/TspB superfamily of β–barrel integral membrane proteins (Baldwin et al., 2005; Patel et al., 2008; Hsu and Inoue, 2009; Inoue, 2011; Schleiff et al., 2011). The members of this family are exclusively localized in the outer membranes of Gram-negative bacteria, mitochondria and plastids (Voulhoux and Tommassen, 2004; Schleiff et al., 2011), where they are proposed to play diverse roles in membrane transport and biogenesis. The most extensively studied examples include the bacterial BamA (β-barrel Assembly Machinery protein A) and mitochondrial Sam50 (Sorting and Assembly Machinery 50 kDa) proteins, which function in the integration of β-barrel proteins into the outer membrane in Gram-negative bacteria and mitochondria, respectively (Voulhoux et al., 2003; Gentle et al., 2004; Stroud et al., 2011). Mature Toc75 contains structural features characteristic of OMP85/TspB superfamily members, such as FhaC and BamA (Sanchez-Pulido et al., 2003; Gentle et al., 2005; Clantin et al., 2007; Noinaj et al., 2013), including an N-terminal ∼30 kDa region consisting of three repeats of POTRA (POlypeptide-TRansport Associated) domains that extend into the soluble space, and a ∼45 kDa C-terminal region constituting the membrane-integrated β-barrel.

**Figure 2** (caption, partially recovered). Pre-Toc75 is synthesized with a bipartite targeting signal containing an N-terminal transit peptide followed in tandem by a poly-glycine region. The transit peptide targets pre-Toc75 to the TOC complex and initiates membrane translocation through the TOC and TIC channels. The stromal processing peptidase cleaves the transit peptide to generate an intermediate form (iToc75), and the poly-glycine region halts translocation of iToc75 in the intermembrane space. Two hypotheses have been proposed for the subsequent insertion step. Pathway (1) proposes that insertion is mediated directly by the TOC complex. Pathway (2) proposes that iToc75 is engaged by a β-barrel translocase of unknown composition, which catalyzes membrane insertion from the intermembrane space. OEP80 has been proposed as a core constituent of the β-barrel translocase. iToc75 is processed by a type I signal peptidase (SPase I) to yield mature Toc75 during or shortly after insertion in the outer membrane.

Toc75 is translated as an ∼89 kDa precursor (pre-Toc75) and appears to be unique among chloroplast outer envelope membrane proteins (OEPs) in being targeted to the membrane *via* a cleavable N-terminal bipartite targeting signal (**Table 1** and **Figure 2**) (Tranel and Keegstra, 1996). The N-terminal region of pre-Toc75 can target chimeric fusion proteins to the chloroplast stroma, consistent with its function as a canonical transit peptide (Tranel and Keegstra, 1996). The targeting of Toc75 requires ATP and can be competed by the presence of other chloroplast preproteins during *in vitro* import, demonstrating that it employs the TOC translocon for membrane localization (Inoue et al., 2001). A glycine-rich region follows the transit peptide, and functions in the integration of Toc75 into the outer membrane (Tranel and Keegstra, 1996; Baldwin and Inoue, 2006). Pre-Toc75 is processed sequentially: once at amino acid 36, leading to an intermediate form (iToc75; ∼85.9 kDa), and again at amino acid 132, resulting in mature Toc75 (mToc75; 75 kDa). Pre-Toc75 is cleaved by the stromal processing peptidase (SPP), indicating that the N-terminus of the precursor reaches the stroma before being sorted to the outer membrane (Tranel and Keegstra, 1996). iToc75 does not reach the stroma and is arrested in the intermembrane space between the outer and inner envelope, where it is cleaved by a type I signal peptidase (SPase I), resulting in mature, functional Toc75 (Inoue and Keegstra, 2003; Inoue et al., 2005; Shipman and Inoue, 2009). Disruption of the gene encoding plastidic SPase I (Plsp1) results in the accumulation of immature forms of Toc75, a severe reduction of plastid internal membrane development, and a seedling lethal phenotype (Inoue et al., 2005; Shipman-Roston et al., 2010).
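The two sequential processing steps can be pictured as successive N-terminal truncations of a single polypeptide. The following is a minimal sketch under stated assumptions: the precursor is a toy string rather than a real sequence, its total length (800 residues) is hypothetical, and each reported position (36 and 132) is interpreted as the number of N-terminal residues removed, which the text does not specify. The function name and the residue labels are illustrative.

```python
def process_pre_toc75(precursor, spp_site=36, spase_site=132):
    """Return (intermediate, mature) forms of a toy precursor.

    ASSUMPTION: `spp_site` and `spase_site` are counts of N-terminal
    residues removed by SPP and the type I signal peptidase, respectively.
    """
    if not 0 < spp_site < spase_site < len(precursor):
        raise ValueError("expected 0 < spp_site < spase_site < len(precursor)")
    intermediate = precursor[spp_site:]   # first cleavage: pre-Toc75 -> iToc75
    mature = precursor[spase_site:]       # second cleavage: iToc75 -> mToc75
    return intermediate, mature


# Toy precursor: 'T' = transit peptide, 'I' = region retained only in iToc75
# (it includes the glycine-rich stretch), 'M' = mature protein.
PRECURSOR_LENGTH = 800  # hypothetical length, for illustration only
toy_precursor = "T" * 36 + "I" * 96 + "M" * (PRECURSOR_LENGTH - 132)

i_form, m_form = process_pre_toc75(toy_precursor)
print(len(toy_precursor), len(i_form), len(m_form))
```

With the toy input, the intermediate begins with the "I" region and the mature form consists only of "M" residues, mirroring the order pre-Toc75, iToc75, mToc75.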

The glycine-rich region in iToc75 appears to be critical for arresting the import of the protein in the intermembrane space and triggering integration into the outer membrane (Inoue and Keegstra, 2003). It is unclear how the glycine-rich stretch prevents iToc75 from completing translocation across the inner envelope membrane via the TIC translocon, but two hypotheses have been proposed. Either proteinaceous components in the intermembrane space or at the inner membrane bind to this region and hold the protein at the outer envelope membrane, or this region prevents iToc75 from interacting with components that normally direct preproteins through the TIC system (Inoue and Keegstra, 2003; Baldwin and Inoue, 2006). Poly-glycine regions are not bound by DnaK, the bacterial molecular chaperone of the Hsp70 family (Okamoto et al., 2002), and it has been proposed that the iToc75 poly-glycine domain could function similarly to avoid molecular chaperones (e.g., Tic22) in the chloroplast intermembrane space that facilitate translocation of other preproteins across the inner membrane (Inoue and Keegstra, 2003; Baldwin and Inoue, 2006).

#### **TARGETING OF Toc75 IS DISTINCT FROM OTHER PLASTID β-BARREL PROTEINS**

There are several other plastid-localized members of the OMP85/TspB family, but all lack cleavable transit peptides (Reumann et al., 2005; Hsu and Inoue, 2009). Studies with one of these proteins, OEP80 (Outer Envelope protein, 80 kDa), suggest that it uses a TOC-independent pathway for localization to the outer membrane (Inoue and Potter, 2004). Similar to Toc75, OEP80 is predicted to contain three N-terminal POTRA domains and a C-terminal β-barrel (Hsu and Inoue, 2009), and the gene encoding OEP80 (also referred to as AtToc75-V, At5g19620) is essential in *A. thaliana* (Baldwin et al., 2005; Patel et al., 2008; Hsu et al., 2012). However, OEP80 is not detected in isolated oligomeric TOC complexes, indicating that it is not directly involved in TOC translocon function (Eckart et al., 2002). Although targeting of both Toc75 and OEP80 requires ATP hydrolysis, targeting of OEP80 is not competed by the presence of preproteins utilizing the TOC translocon, and processing of OEP80 was not detected in *in vitro* targeting studies using isolated chloroplasts (Inoue and Potter, 2004). Furthermore, deletion analysis *in vivo* demonstrated that the N-terminal ∼52 residues of Arabidopsis OEP80 are not required for the targeting, insertion, or functionality of the protein (Patel et al., 2008; Hsu et al., 2012). These studies indicate that the mechanism of pre-Toc75 targeting and insertion involves steps distinct from those of other plastid β-barrel proteins, such as OEP80.

In Gram-negative bacteria and mitochondria, the targeting of β-barrel proteins to outer membranes requires two translocons, one for targeting proteins into the inter-membrane space between the inner and outer membranes and the second for integrating proteins into the outer membrane (Hagan et al., 2011; Ricci and Silhavy, 2012; Wenz et al., 2014). In both cases, the first step is accomplished by the major protein translocation systems that mediate protein export and import in bacteria and mitochondria, respectively. Translocation of the nascent β-barrel precursors across the cytoplasmic membrane in bacteria occurs through the Sec translocon (Hagan et al., 2011; Ricci and Silhavy, 2012). In mitochondria, this step is mediated by import of newly synthesized nucleus-encoded β-barrel proteins from the cytosol through the translocase of the outer mitochondrial membrane (TOM) (Endo and Yamano, 2010; Dukanovic and Rapaport, 2011). The second step of inserting β-barrel precursors into the outer membrane is mediated by the β-barrel assembly machine (BAM) in bacteria (Tommassen, 2010; Hagan et al., 2011) and the sorting and assembly machinery (SAM) in mitochondria (Endo and Yamano, 2010; Dukanovic and Rapaport, 2011). BAM and SAM are evolutionarily conserved systems containing β-barrel channels, BamA in bacteria and Sam50 in mitochondria, which associate with additional sorting factors to catalyze the targeting and insertion of β-barrel proteins into the lipid bilayer.

Although Toc75, OEP80, BamA, and Sam50 are all members of the OMP85/TspB family, chloroplasts do not appear to possess other components of the β-barrel assembly machinery that are conserved in BAM and SAM complexes of bacteria and mitochondria (Hsu and Inoue, 2009). Consequently, the pathway for β-barrel protein integration at the plastid outer membrane remains to be fully defined. It has been proposed that OEP80 might constitute the core of the β-barrel sorting machinery in chloroplasts (**Table 1** and **Figure 2**) (Schleiff and Soll, 2005; Hsu and Inoue, 2009; Huang et al., 2011). Interestingly, the reduction of OEP80 expression by RNAi in Arabidopsis resulted in the reduced accumulation of Toc75 (Huang et al., 2011). This suggests a role for OEP80 in Toc75 biogenesis, perhaps at a step downstream from initial targeting of pre-Toc75 to the TOC translocon. However, direct evidence for OEP80 participation in β-barrel precursor targeting or insertion is still lacking.

#### **MODELS FOR TARGETING PRE-Toc75 TO THE OUTER MEMBRANE**

Based on existing evidence, two possible mechanisms for pre-Toc75 targeting and integration at the outer membrane have been proposed (**Figure 2**) (Schleiff and Soll, 2005). Both models include a role for the TOC translocon in the initial stages of pre-Toc75 targeting. The first step involves recognition of the N-terminal transit peptide of pre-Toc75 by the TOC GTPase receptors, followed by translocation into the TOC channel. The observation that the N-terminal transit peptide is processed by the stromal processing peptidase suggests that pre-Toc75 also engages the TIC complex and partially translocates across the inner membrane (Inoue et al., 2001); the glycine-rich segment is proposed to arrest translocation of iToc75 in the intermembrane space (Tranel and Keegstra, 1996; Baldwin and Inoue, 2006). At this point, the models diverge. In the first model, the TOC complex directly mediates the insertion of iToc75 into the outer membrane, and membrane integration is coupled directly to protein import (**Figure 2**). In this scenario, a separate translocase for outer membrane insertion, comparable to the BAM and SAM translocases in bacteria and mitochondria, would not be required. This model is consistent with the observation that Toc75 appears to be unique amongst chloroplast β-barrel proteins in utilizing the TOC-TIC system for targeting (Inoue et al., 2001). Furthermore, protease sensitivity experiments indicate that a significant proportion of iToc75 remains exposed to the cytoplasm during targeting, consistent with an intermediate that remains engaged by the TOC translocon during the sorting and integration process (Inoue et al., 2005).

In the second model, iToc75 would be engaged in the intermembrane space by a second translocase with an activity comparable to BAM or SAM (**Figure 2**). This translocase would function specifically to integrate β-barrel proteins into the outer membrane. OEP80 is a good candidate for a key component of a chloroplast β-barrel protein translocase, based on its sequence similarity to BamA and Sam50, and the fact that the reduction in OEP80 expression results in reduced accumulation of Toc75 in the outer membrane (Huang et al., 2011). Recent studies in yeast mitochondria demonstrate a close physical association between the TOM import complex and the SAM translocase during β-barrel sorting (Qiu et al., 2013). In a similar scenario, the TOC translocon and an OEP80 β-barrel translocase could cooperate during the sorting of iToc75. A small amount of Toc75 has been shown to immunoprecipitate with OEP80, indicating a potential dynamic interaction between the two proteins (Hsu et al., 2012). If OEP80 does represent a distinct β-barrel protein translocase, it remains to be determined why chloroplasts lack proteins in the intermembrane space similar to the other components of the BAM and SAM translocases that are conserved between Gram-negative bacteria and mitochondria.

The exact role of the transit peptide in Toc75 targeting remains unclear. One possibility is that transit peptide-mediated targeting functions to couple Toc75 sorting with the assembly of new TOC complexes. The transit peptide could ensure that pre-Toc75 remains closely associated with TOC translocons during integration into the outer membrane, regardless of whether integration is catalyzed directly by the TOC translocon or by an associated β-barrel translocase. A complex system requiring two processing steps for correct insertion of Toc75 would ensure high fidelity targeting of Toc75 to the outer membrane and perhaps facilitate the formation of new translocons (Tranel and Keegstra, 1996). It has recently been proposed that the topology of Toc75 was reversed during evolution, resulting in orientation of its N-terminal POTRA domains in the cytoplasm (Sommer et al., 2011). This orientation is the reverse of other known β-barrel proteins in mitochondria and bacteria, and it is possible that the unique targeting pathway is important for determining the unique topology of the protein by orienting the polypeptide in an N-out topology during translocation and insertion in the outer membrane.

#### **TARGETING OF Toc34**

The Toc34 receptors are encoded by small gene families in many species. All Toc34 isoforms appear to be targeted to the outer membrane by the same mechanism, and therefore we will refer to them collectively as Toc34 and reference specific isoforms only when relevant. Toc34 is anchored in the outer envelope membrane via a C-terminal transmembrane domain (TMD), with its N-terminus (including the G-domain) exposed to the cytosol and a relatively short C-terminal sequence (CTS) oriented toward the intermembrane space (**Table 1** and **Figure 3**) (Li and Chen, 1996, 1997; Gutensohn et al., 2000; Dhanoa et al., 2010). Organellar proteins with this topology are collectively referred to as tail-anchored (TA) proteins (Kim and Hwang, 2013).

Toc34 and other outer membrane proteins with single transmembrane anchors lack a cleavable targeting signal, and the TMD and residues directly adjacent to the TMD are common features of their targeting signals (Lee et al., 2001, 2004, 2011; Hofmann and Theg, 2005; Dhanoa et al., 2010). In addition to the essential role of the TMD, the CTS following the TMD, and interactions with plastid-specific lipids are proposed to play a role in the specific targeting of TA proteins to plastids (Schleiff et al., 2001; Dhanoa et al., 2010). The selective targeting of TA proteins between chloroplasts and other organelles, including mitochondria, peroxisomes and the ER, also involves the degree of hydrophobicity of the TMD (Borgese et al., 2007; Lee et al., 2011), and *in vitro* targeting experiments with isolated organelles suggest that selectivity can occur at the surface of the organelle, independent of cytosolic targeting factors (Kriechbaumer and Abell, 2012).

#### **COMPONENTS AND ENERGETICS OF THE Toc34 TARGETING PATHWAY**

Toc34 and a second TA protein, OEP9, are recognized by the ankyrin repeat cytosolic factor, AKR2, in the cytosol (**Table 1** and **Figure 3**) (Dhanoa et al., 2010). AKR2 appears to participate in the targeting of a variety of plastid, peroxisomal and ER proteins with single, N- or C-terminal TMDs, suggesting that it acts as a general chaperone for TMD-containing proteins by preventing inappropriate interactions during transit from the cytoplasm to boundary membranes (Bae et al., 2008; Dhanoa et al., 2010; Shen et al., 2010; Zhang et al., 2010). Recently, a small heat shock protein, sHsp17.8, was identified that mediates specific association of AKR2 with chloroplasts and enhances targeting of another chloroplast outer membrane protein, OEP7/OEP14 (Kim et al., 2011). It remains to be shown whether sHsp17.8 also participates in Toc34 targeting. ARSA1, another cytosolic factor, was implicated in the targeting of Toc34 to chloroplasts in *Chlamydomonas reinhardtii* (Formighieri et al., 2013). ARSA1 is structurally related to the cytosolic targeting factor, GET3/TRC40, which facilitates targeting of TA proteins to the ER in yeast and mammals. *arsa1* mutations appear to selectively impair chloroplast biogenesis without significantly affecting the function of other cellular organelles. There are multiple isoforms of ARSA-like proteins in other plant species, suggesting that one or more ARSA homologs might function in Toc34 targeting in land plants.

The components at the outer membrane that mediate insertion of Toc34 remain to be fully defined. Insertion was initially proposed to be spontaneous (Schleiff and Klosgen, 2001; Jarvis and Robinson, 2004), based on the observation that Toc34 and OEP7/OEP14 were capable of associating with protein-free liposomes in the absence of nucleotide hydrolysis (Qbadou et al., 2003; Wallas et al., 2003; Dhanoa et al., 2010). However, the observations that the insertion of outer membrane proteins is promoted by nucleotide hydrolysis and inhibited by proteolytic treatments of chloroplasts argue for protein-mediated insertion (Hofmann and Theg, 2005). A proteinaceous receptor system specific for the targeting of outer membrane proteins has been proposed, but no components have been identified (Kim and Hwang, 2013). Considerable evidence suggests that Toc75 participates in the insertion of OEP14/OEP7 and similar proteins (**Figure 3**) (Tu et al., 2004; Hofmann and Theg, 2005). The similarities between Toc34 and OEP7/OEP14 targeting have led to the hypothesis that their targeting pathways share common components, including Toc75 (**Figure 3**) (Kim and Hwang, 2013).

**Figure 3** (caption, partially recovered). [...] and act as molecular chaperones to deliver the protein to the outer membrane. Three hypotheses exist for the recognition and insertion of Toc34 at the outer membrane. In pathway (1), the targeting complex is recognized [...] Toc75 without the assistance of outer membrane receptors. In pathway (3), binding and insertion of Toc34 via Toc75 is assisted by interactions of its G-domain with existing Toc34. In all three pathways, Toc75 mediates insertion of the TMD into the membrane and Toc34 stably associates with Toc75.

#### **ROLE OF THE G-DOMAIN IN Toc34 TARGETING TO TOC COMPLEXES**

Although similarities exist between the targeting of Toc34 family members and other outer membrane proteins, several aspects of Toc34 membrane integration are unique and likely represent events that facilitate or are required for TOC assembly. For example, the GTPase activity of Toc34 was shown to stimulate its insertion into the outer envelope of isolated chloroplasts (**Figure 3**) (Chen and Schnell, 1997; Qbadou et al., 2003). Although the mechanism by which GTP-hydrolysis at Toc34 facilitates insertion remains to be investigated, the known interactions of the G-domain with other components of the TOC complex suggest that it may play a role in targeting and/or assembly of Toc34 into TOC complexes. Toc34 dimers interact via their G-domains (Sun et al., 2002), and these homotypic interactions might be involved in targeting of Toc34 to sites of TOC complex assembly (**Figure 3**). This hypothesis is supported by the observation that insertion of atToc33 and atToc34 is reduced relative to OEP9 in chloroplasts isolated from the atToc33 and atToc34 null mutants, *ppi1* and *ppi3*, respectively (Dhanoa et al., 2010). In an alternative model, Toc34 could interact directly with Toc75 during targeting, and GTPase activity could facilitate insertion and/or stabilization of the interaction with the channel (**Figure 3**). This model is consistent with the highly stable association between Toc34 and Toc75, reflected in the observation that Toc34 is found exclusively in TOC complexes (Kouranov et al., 1998) in a 1:1 stoichiometry with Toc75. Phosphorylation has been proposed to regulate the association of Toc34 with the translocon and thereby facilitate exchange of components from the complex (Oreb et al., 2008).

#### **TARGETING AND MEMBRANE INTEGRATION OF Toc159**

Members of the Toc159 family function as primary chloroplast preprotein receptors, and play fundamental roles in determining preprotein substrate specificity (Bauer et al., 2000, 2002; Ivanova et al., 2004; Kubis et al., 2004; Smith et al., 2004; Inoue et al., 2010). All Toc159 family members have a unique tripartite structure, consisting of an N-terminal acidic domain (A-domain) and a central GTPase domain (G-domain), both of which are exposed to the cytosol; and a C-terminal membrane anchor domain (M-domain) that is protected from proteolysis and associates with the chloroplast outer envelope membrane through an unknown mechanism (Hirsch et al., 1994; Bauer et al., 2000; Ivanova et al., 2004; Lung and Chuong, 2012). A recent study using a yeast two-hybrid approach demonstrated that the G-domains of Toc159 receptors bind to a wide range of preproteins, and the A-domain alters the relative affinity of each receptor for different classes of preproteins (Dutta et al., 2014). Expression of the M-domain of Toc159 alone can partially complement the seedling lethal phenotype of atToc159 null mutants in Arabidopsis, indicating its central role in formation of the functional translocon (Lee et al., 2003).

The majority of information on the targeting and function of the Toc159 family has been obtained by studying the most abundant isoform in green tissue—atToc159 and psToc159 from Arabidopsis and pea, respectively. For simplicity, we will refer to the family members collectively as Toc159. Evidence to date indicates that the mechanism of Toc159 targeting and insertion involves its G- and M-domains as well as other TOC components, including Toc34 and Toc75 (Wallas et al., 2003).

#### **ROLE OF THE G- AND M-DOMAINS IN Toc159 TARGETING TO TOC COMPLEXES**

The M-domain encompasses the ∼400 most C-terminal residues of the Toc159 protein family (Hirsch et al., 1994; Ivanova et al., 2004; Lung and Chuong, 2012). It is the minimal structural unit to confer protein import capability in Arabidopsis plants lacking full-length Toc159 (i.e., in the atToc159 null mutant, *ppi2*) (Chen et al., 2000; Lee et al., 2003), indicating that integration of this domain is a critical step in TOC complex assembly. Interestingly, while it is known that the M-domain spans the outer membrane and anchors Toc159 in the outer membrane based on its insensitivity to protease treatment and resistance to extraction in isolated chloroplasts, it does not possess any predicted hydrophobic transmembrane domains (Hirsch et al., 1994; Kessler et al., 1994; Muckel and Soll, 1996; Chen et al., 2000). This suggests that the nature of Toc159 membrane association is unique relative to other chloroplast outer envelope proteins.

When fused to an unrelated soluble protein, the M-domain is able to target the fusion protein to the chloroplast surface *in vitro* (Muckel and Soll, 1996), and the M-domain on its own targets to chloroplasts in a transient protoplast expression system (Lee et al., 2003), albeit inefficiently compared to native Toc159. Furthermore, the M-domain binds to isolated chloroplasts and interacts with the Toc34 G-domain *in vitro* (Wallas et al., 2003). These data demonstrate that the M-domain contains intrinsic targeting information for sorting to the outer membrane.

Recently, two Toc159 family members were identified in *Bienertia sinuspersici*, a species that carries out single-cell C4 photosynthesis using dimorphic chloroplasts within a single chlorenchyma cell (Lung and Chuong, 2012). A bioinformatics analysis of the C-terminal ∼100 residues (CTs) of the *B. sinuspersici* receptors, BsToc159 and BsToc132, revealed that this region has chloroplast transit peptide-like properties that are generally conserved in Toc159 homologs from other species (Lung and Chuong, 2012). These features include an overrepresentation of hydroxylated residues, and regions of predicted random coil and amphipathic helical secondary structure (Lung and Chuong, 2012). Remarkably, this region functions as a transit peptide when fused to the small subunit of Rubisco in reverse orientation to maintain the topology of its interaction with the outer envelope (Lung and Chuong, 2012). These findings raise the intriguing possibility that the C-terminal region facilitates targeting of Toc159 to the outer membrane by mimicking a transit peptide to engage the TOC machinery.

The Toc159 G-domain also appears to contribute significantly to targeting of the receptor, and on its own binds to chloroplasts *in vitro* (Smith et al., 2002), indicating that it possesses intrinsic chloroplast targeting information. A Toc159 mutant deficient in both GTP binding and hydrolysis (Toc159mGTP) fails to complement a Toc159 null mutant (*ppi2*) in Arabidopsis (Bauer et al., 2002). Upon closer examination, it was found that the Toc159mGTP mutant has a reduced efficiency of binding and insertion into isolated chloroplasts (Bauer et al., 2002). The importance of GTPase activity for Toc159 targeting appears to be attributable to nucleotide binding and not hydrolysis itself. This conclusion is supported by the observation that expression of a Toc159 mutant that binds nucleotide but is defective in GTP hydrolysis complements the *ppi2* phenotype (Wang et al., 2008). Consistent with this premise, the GDP-bound form of Toc159 is inserted into isolated chloroplasts more efficiently than the GTP-bound form (Smith et al., 2002). This has led to the hypothesis that nucleotide binding, in particular GDP binding, induces a conformation that renders Toc159 competent for targeting and integration into the outer membrane (Smith et al., 2002).

Toc159 binds to proteoliposomes containing either Toc34 or Toc75; however, insertion into the membrane requires both Toc34 and Toc75 (Wallas et al., 2003). A direct role for Toc34 in Toc159 targeting is supported by the observation that Toc159 interacts with Toc34 via their respective G-domains (Hiltbrunner et al., 2001; Bauer et al., 2002; Smith et al., 2002; Wallas et al., 2003; Rahim et al., 2009), an interaction that is regulated by the GTPase activity of both Toc159 and Toc34 (Bauer et al., 2002; Wallas et al., 2003). In addition, when added to *in vitro* targeting assays, soluble Toc34 G-domain can compete for membrane insertion of Toc159 (Hiltbrunner et al., 2001). These data, in conjunction with the role of nucleotide binding in Toc159 targeting, suggest that interactions between the G-domains of the two receptors are important elements in efficient targeting of the receptor to the outer membrane.

Based on the demonstration that the A-domains of Toc159 family members are intrinsically disordered, it has been suggested that they might facilitate the assembly of TOC complexes (Richardson et al., 2009), a function that has been attributed to other intrinsically disordered proteins (Hegyi et al., 2007). The number of proteins with disordered regions positively correlates with the size of macromolecular complexes in yeast and *E. coli*, and it is hypothesized that large unstructured domains allow for simultaneous protein-protein interactions with multiple binding partners, or give flexibility to functional domains within complexes (Hegyi et al., 2007). In this manner, the A-domain of Toc159 could mediate transient interactions between multiple TOC components, either simultaneously or sequentially, during assembly of functional TOC complexes (Richardson et al., 2009). While the possibility is intriguing, definitive evidence for such a role has not yet been reported.

#### **WORKING MODEL OF Toc159 TARGETING AND INSERTION**

The existing data suggest that the intrinsic targeting information within the G- and M-domains of Toc159 acts cooperatively to target the receptor to the chloroplast outer envelope (**Figure 4**). The targeting and subsequent insertion of the Toc159 M-domain would be facilitated by a nucleotide-dependent interaction between the G-domains of Toc159 and Toc34. Consistent with this model, addition of the Toc159 G-domain *in trans* to isolated chloroplasts stimulated insertion of the M-domain into the outer envelope (Wallas et al., 2003). The M-domain may initially engage the TOC complex through an interaction with Toc34 and/or Toc75. It is intriguing to speculate that the C-terminal region of the Toc159 M-domain takes advantage of the intrinsic transit peptide binding capabilities of Toc34 and Toc75 to specifically target the newly synthesized receptor to nascent TOC complexes (**Figure 4**) (Lung and Chuong, 2012). In this scenario, the initial targeting of Toc159 to the outer membrane would share elements with the binding of transit peptides at the TOC complex during the import of nuclear encoded preproteins. While the physiological relevance of this transit-peptide-like region to Toc159 targeting remains to be explored in more detail, this unusual targeting mechanism might have evolved to facilitate establishment of the unique membrane association of the M-domain and the interactions of Toc159 with other components of TOC complexes.

It also has been reported that Toc159 exists as a soluble cytoplasmic receptor (Hiltbrunner et al., 2001; Becker et al., 2004; Lung and Chuong, 2012). While this observation may have interesting implications for Toc159 targeting, other data suggest that the soluble form of Toc159 represents a targeting intermediate en route to the chloroplast or a biochemical artifact generated during *in vitro* studies (Becker et al., 2004). Consequently, further studies need to be carried out to unravel the physiological significance of the soluble form and how it might relate to the reported Toc159-actin interaction (Jouhet and Gray, 2009a,b).

#### **THE SEQUENCE OF ASSEMBLY OF THE TOC COMPLEX**

On the basis of studies investigating the targeting and insertion of individual TOC components, we propose a model for the mechanism of TOC assembly. As described above, evidence supports a role for Toc75 in the targeting and integration of both Toc34 and Toc159 into the outer membrane (**Figures 3**, **4**). We propose that the integration of Toc75 into the outer membrane represents the first step in TOC formation (**Figure 2**). Pre-existing TOC complexes mediate the targeting of newly synthesized pre-Toc75 to the membrane, indicating that existing TOC translocons play a central role in the formation of new TOC complexes (Tranel and Keegstra, 1996). Full integration of Toc75 at the outer membrane could be mediated by the TOC translocon or in conjunction with a distinct β–barrel assembly machinery that could include OEP80.

Studies in pea demonstrate that the levels of Toc75 significantly exceed those of Toc34 and Toc159 at early stages in development when chloroplast biogenesis and division are maximal, and suggest that up to 50% of Toc75 exists in a form not associated with the GTPases (Kouranov et al., 1998). This would provide a sufficient pool of "free" Toc75 to nucleate the assembly of new TOC complexes. Based on the proposed role of Toc75 in the insertion of outer membrane proteins (Tu et al., 2004; Hofmann and Theg, 2005), we hypothesize that the channel functions in the integration of Toc34 (**Figure 3**). This hypothesis predicts that Toc75 would interact directly or indirectly with other components of the Toc34 targeting pathway (e.g., AKR2), and the channel formed by Toc75 could provide an interface to facilitate contact between the transmembrane helices of the proteins with the core of the lipid bilayer. Once integrated, Toc34 would remain tightly associated with Toc75.

Toc159 insertion at the outer membrane is dependent upon both Toc75 and Toc34 as demonstrated by reconstitution of the complex from individual components (**Figure 4**) (Wallas et al., 2003). Therefore, Toc159 is envisioned to be the final addition to the core TOC complex. As discussed above, the M-domain region could take advantage of the transit peptide binding properties of Toc34 and Toc75 to facilitate targeting of Toc159 to newly forming TOC complexes (Lung and Chuong, 2012). The interactions between the G-domains of Toc34 and Toc159 could provide a recognition component in addition to the M-domain to enhance the efficiency of targeting of Toc159 to the membrane. In this scenario, the interaction of the GTPase domains could also play a role in maintaining the association of Toc159 with the other two TOC components during insertion of the M-domain into the membrane, thereby facilitating the formation of the core complex. The M-domain of Toc159 appears to be the minimal functional component of the receptor (Lee et al., 2003), and therefore the GTPase domains would play a critical role in the efficient formation of functional TOC complexes.

The proposed model, in which Toc75 serves as the nucleating core for the sequential addition of Toc34 and Toc159, also provides a mechanism to couple the targeting and membrane integration of TOC components directly with complex assembly (**Figure 4**). The interactions that maintain the stability and stoichiometry of the core complex, including functional interactions between the G-domains of the receptors and receptor interactions with Toc75, would be established in concert with targeting of the receptors from the cytoplasm to the membrane. Addition of Toc159 as the final component would complete assembly and serve to "activate" the translocon by integrating the functionally critical Toc159 M-domain into the complex (**Figure 4**). The model also proposes that Toc34 and Toc159 are the limiting factors in the formation of TOC complexes because of the excess availability of Toc75 in the membrane. Therefore, the levels of expression of the GTPases could determine the rate of formation of new translocons.
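The order of addition proposed above can be restated as a small dependency graph. The following is only an illustrative sketch of the model's logic, not a simulation of assembly: the dictionary name `requires` and its layout are assumptions made for this example, and the dependencies encode nothing beyond what the text proposes (Toc34 integration depends on Toc75; Toc159 insertion depends on both Toc75 and Toc34).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each component maps to the components that, in the proposed model,
# must already be present in the membrane before it can be integrated.
# (Illustrative encoding only; names of variables are hypothetical.)
requires = {
    "Toc75": set(),                # nucleating core of the new complex
    "Toc34": {"Toc75"},            # integration proposed to depend on Toc75
    "Toc159": {"Toc75", "Toc34"},  # insertion depends on both other components
}

# A topological sort lists every component after all of its prerequisites.
assembly_order = list(TopologicalSorter(requires).static_order())
print(" -> ".join(assembly_order))
```

Because each step depends on all earlier ones, only one ordering satisfies these constraints: Toc75, then Toc34, then Toc159. Adding a hypothetical component with fewer prerequisites would make the ordering ambiguous, which is one way to see that the model as stated is strictly sequential.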

#### **TOC COMPLEX ASSEMBLY IN RELATION TO THE GENERATION OF FUNCTIONALLY DIVERSE TRANSLOCONS**

The existence of structurally distinct TOC translocons raises interesting questions regarding the mechanisms controlling their formation and abundance. Toc75 is the common element in TOC complexes that contain different combinations of Toc34 and Toc159 family members. The availability of excess Toc75 could provide a pool for rapid assembly of new TOC complexes with newly synthesized Toc34 and Toc159. Although the measurements of the size of TOC translocons are heterogeneous, estimates predict a minimal mass of ∼800 kDa, suggesting that each complex contains at least two molecules of Toc159 and six molecules each of Toc34 and Toc75 (Schleiff et al., 2003; Kikuchi et al., 2006; Chen and Li, 2007). Consequently, there is likely to be an active mechanism for the selective assembly of distinct receptors in specific combinations with Toc75. The analysis of TOC translocons in Arabidopsis has shown that two Toc159 family members, atToc159 and atToc132, preferentially assemble with two Toc34 family members, atToc33 and atToc34, respectively (Ivanova et al., 2004). The preferential interactions of the G-domains of the two receptors during targeting and assembly, both homotypic and heterotypic, could result in the assembly of translocons with specific compositions. It is also possible that the divergent A-domains contribute to the assembly of complexes with distinct compositions.

This model is unlikely to fully account for the assembly of distinct translocons. Expression of atToc33 or atToc34 complements the reciprocal Arabidopsis null mutants of either receptor (Jarvis et al., 1998), demonstrating that atToc159 and atToc132 can form distinct, functional translocons when assembled with either atToc33 or atToc34. Furthermore, some vascular plant species appear to have only a single Toc34 gene. For example, rice, a plant with a significantly more complex genome than Arabidopsis, is predicted to contain up to five Toc159 family members, but only one Toc34 ortholog. Therefore, the preferential interactions between Toc159 and Toc34 family members might play a role in the formation of distinct translocons in some, but not all, plant species. The specific Toc159 isoform present within the TOC complex appears to be the minimal distinguishing feature of different translocons, consistent with genetic analyses demonstrating that atToc159, atToc132/atToc120 and atToc90 are functionally distinct (Ivanova et al., 2004; Kubis et al., 2004; Infanger et al., 2011). This raises the possibility that each Toc159 family member plays a role in the recruitment of the same or functionally similar isoforms, or exclusion of distinct isoforms into newly forming complexes. Homotypic interactions between the G-domains of Toc159 family members have not been examined in detail, but these interactions could contribute to recruiting the same receptor isoform once an initial Toc159 protein assembles with Toc75 and Toc34. The A-domains of Toc159 receptors play key roles in defining the selectivity of complexes for different import substrates (Agne and Kessler, 2010; Inoue et al., 2010), and it is intriguing to speculate that this domain could also participate in recruiting compatible or excluding incompatible Toc159 isoforms during TOC assembly.

Recent studies provide evidence that the abundance of TOC complexes is controlled by regulated proteolysis via the cytoplasmic ubiquitin-proteasome system (UPS), further highlighting the importance of regulating protein import at the level of the TOC translocon (Ling et al., 2012; Jarvis and Lopez-Juez, 2013). The UPS pathway is proposed not only to participate in the housekeeping turnover of TOC complexes, but also be important in selective degradation of specific TOC isoforms, thereby regulating the proper proportions of different TOC complexes to facilitate changes in the plastid proteome during developmental transitions (Ling et al., 2012; Huang et al., 2013; Jarvis and Lopez-Juez, 2013; Ling and Jarvis, 2013). The UPS mechanism of selective TOC degradation coupled with regulation of TOC assembly would provide an integrated system of controlling the levels of specific translocons in response to physiological or developmental demands.

#### **CONCLUDING REMARKS**

The translocons mediating the import of nuclear encoded preproteins play central roles in the biogenesis and functional differentiation of plastids. Major attention has been devoted to uncovering the mechanism of preprotein recognition and membrane translocation at TOC and TIC, and it is increasingly clear that the assembly and regulation of these complexes play an important role in organelle function and homeostasis. To date, studies on individual TOC components suggest that their targeting involves elements in common with other outer membrane proteins, but also reveal features that are unique to TOC biogenesis. Detailed studies on the characteristics and relationships of targeting pathways for the TOC proteins, the identification of the components of each pathway, and the definition of the roles of known components are needed to provide a complete picture of the mechanism and regulation of TOC assembly. A more complete picture of targeting and assembly in conjunction with information on the structures and interactions of core TOC components will undoubtedly shed light on key regulatory or quality control checkpoints in the assembly and dynamics of this unique macromolecular assembly. Finally, studies integrating TOC assembly with the newly discovered mechanism of TOC control by regulated proteolysis provide an opportunity to understand how the levels and diversity of the translocons are controlled, and thereby contribute to the plasticity of organelle function in response to developmental and physiological events in the cell.

#### **ACKNOWLEDGMENT**

This work was supported by National Institutes of Health Grant 2RO1-GM061893 (to Danny J. Schnell).

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 March 2014; accepted: 24 May 2014; published online: 11 June 2014. Citation: Richardson LGL, Paila YD, Siman SR, Chen Y, Smith MD and Schnell DJ (2014) Targeting and assembly of components of the TOC protein import complex at the chloroplast outer envelope membrane. Front. Plant Sci. 5:269. doi: 10.3389/fpls.2014.00269*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Richardson, Paila, Siman, Chen, Smith and Schnell. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A new member of the psToc159 family contributes to distinct protein targeting pathways in pea chloroplasts

# *WaiLing Chang<sup>1,2,3</sup>, Jürgen Soll<sup>1,2</sup> and Bettina Bölter<sup>1,2</sup>\**

*<sup>1</sup> Department Biology I, Plant Sciences, LMU München, Martinsried, Germany*

*<sup>2</sup> Munich Center for Integrated Protein Science CiPS, München, Germany*

*<sup>3</sup> Lysando GmbH, Regensburg, Germany*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Kentaro Inoue, University of California at Davis, USA Simon Gilroy, University of Wisconsin - Madison, USA*

#### *\*Correspondence:*

*Bettina Bölter, Department Biology I, Plant Sciences, LMU München, Martinsried, Germany, Grosshadernerstr. 2-4, 82152 Martinsried, Germany e-mail: boelter@bio.lmu.de*

Protein import into chloroplasts relies on specific targeting of preproteins from the cytosol to the organelles and coordinated translocation processes across the double envelope membrane. Here, two complex machineries constitute the so-called general import pathway, which consists of the TOC and TIC complexes (translocon at the outer envelope of chloroplasts and translocon at the inner envelope of chloroplasts, respectively). The majority of canonical preproteins feature an N-terminal cleavable transit peptide, which is necessary for targeting and recognition at the chloroplast surface by receptors of TOC, where Toc159 acts as the primary contact site. We identified a non-canonical preprotein without the classical transit peptide, the superoxide dismutase (FSD1), which was then used in chemical crosslinking approaches to find new interaction partners at the outer envelope from pea chloroplasts. In this way we could link FSD1 to members of the Toc159 family in pea, namely psToc132 and psToc120. Using deletion mutants as well as a peptide scanning approach we defined regions of the preprotein that are involved in receptor binding. These are distributed across the entire sequence; however, the extreme N-terminus as well as a C-proximal domain turned out to be essential for targeting and import. En route into the plastid, FSD1 engages components of the general import pathway, implying that in spite of the non-canonical targeting information and recognition by a specific receptor this preprotein follows a similar way across the envelope as the majority of plastid preproteins.

#### **Keywords: chloroplast, import, receptor protein, targeting, pea**

# **INTRODUCTION**

Plastids represent a large set of organelles with distinct physiological functions and morphologies found within all plant cells (Lopez-Juez and Pyke, 2005). The best-studied plastid type is the chloroplast, a photosynthetic organelle in plants and green algae that converts energy from sunlight into carbohydrates and ATP. Chloroplasts are surrounded by a double membrane and possess their own genome, although the vast majority of genes encoding chloroplast proteins have been transferred to the nucleus in the course of evolution. Analysis of the chloroplast genome revealed that the origin of chloroplasts can be traced back to a cyanobacterial ancestor that was engulfed by an ancient eukaryotic cell and eventually integrated as an organelle during evolution. To ensure a return passage of the proteins encoded by the relocated genes back to their compartment of origin where they carry out their functions, the development of a post-translational protein trafficking system became necessary. This post-translational protein import is mainly achieved by two multimeric protein complexes (also known as the general import complexes) located at the outer (TOC, Translocon at the Outer envelope of Chloroplasts) and inner (TIC, Translocon at the Inner envelope of Chloroplasts) envelopes of chloroplasts, respectively (Seedorf et al., 1995; Jarvis, 2008). For many years, all proteins destined for the internal chloroplast compartments were believed to possess an N-terminal chloroplast targeting sequence (also known as the transit peptide), and to engage the TOC/TIC machinery. Toc75, Toc159, and Toc34 were among the first components of the chloroplast import machinery to be identified in pea chloroplasts (Schnell et al., 1994). Toc75 is a β-barrel protein constituting the protein translocation channel in the outer envelope membrane (Schnell et al., 1994; Hinnah et al., 1997; Keegstra and Cline, 1999). 
The receptor components Toc159 and Toc34, both associated with Toc75, are integral proteins at the outer membrane regulated by GTP binding and hydrolysis (Kessler et al., 1994; Hirsch and Soll, 1995; Seedorf et al., 1995). They belong to a superfamily of GTPases, characterized by a soluble GTPase domain exposed to the cytosol (G-domain) and a membrane anchoring domain (M-domain). The G-domains of the TOC GTPases exhibit the classical nucleotide binding and hydrolysis motifs characteristic of many GTPases. It is generally accepted that the TOC GTPase receptors control transit peptide recognition and the initial stages of membrane translocation by GTP binding and hydrolysis, which lead to molecular rearrangements, but the molecular details that link GTPase activity to receptor function remain to be investigated further. In addition to these two domains, Toc159 harbors an N-terminal region, which is highly acidic (A-domain) (Kessler et al., 1994). These GTPases are unique to plastids and are responsible for recognition of nuclear-encoded precursor proteins at the outer envelope. Together, Toc159, Toc34, and Toc75 form a stable core TOC complex sufficient for precursor protein translocation into artificial liposomes *in vitro* (Schleiff et al., 2003). Although most of the components of the import machinery were originally identified in pea chloroplasts, homologs are reported in moss (*Physcomitrella patens*) as well as in all higher plant species analyzed (Kalanon and McFadden, 2008). In some of them several components (particularly constituents of the TOC core complex) form multi-gene families. 
For instance, the Arabidopsis (*Arabidopsis thaliana*) genome encodes two paralogs of Toc34 (atToc33 and atToc34) (Jarvis et al., 1998; Gutensohn et al., 2000) and four paralogs each of Toc159 (atToc159, atToc132, atToc120, and atToc90) (Bauer et al., 2000; Hiltbrunner et al., 2004; Ivanova et al., 2004; Kubis et al., 2004) and Toc75 (atToc75-III, atToc75-IV, atToc75-I, and atToc75-V/atOEP80) (Baldwin et al., 2005). The presence of different isoforms of the core TOC complex members might allow remodeling of the import machinery in accordance with the biochemical requirements of the plastid, depending on the developmental stage and/or environmental conditions. psToc159 is most similar to atToc159 (with a sequence identity of 48%). Therefore, these two proteins are believed to be functional orthologs (Bauer et al., 2000). Characterization of the T-DNA insertion mutant of atToc159, *ppi2* (*plastid protein import 2*), showed that the differentiation of proplastids into chloroplasts was arrested, resulting in an albino phenotype (Bauer et al., 2000), i.e., the plants cannot develop photoautotrophically. The accumulation of photosynthesis-related proteins was found to be drastically decreased in the *ppi2* mutant, which was not the case for non-photosynthetic plastid proteins. This led to the proposal that import of non-photosynthetic proteins is mediated by other members of the atToc159 receptor family, namely atToc132 and atToc120 (Bauer et al., 2000; Kubis et al., 2004). These different receptors indeed assemble into structurally distinct translocation complexes that exhibit unique preprotein binding properties different from those of the atToc159-containing complex (Bauer et al., 2000; Ivanova et al., 2004; Kubis et al., 2004). The three paralogs share high sequence similarity in their M- and G-domains, but are more dissimilar in the A-domain. Upon swapping of the A-domains the selectivity of the different receptors is altered (Inoue et al., 2010). 
While the presence of the A-domain clearly plays a role in determining substrate specificity, it is however not essential for plant survival (Lee et al., 2003).

Although only a few proteins have been described to use alternative translocation pathways so far, these examples nevertheless indicate that the TOC/TIC machinery is not exclusively responsible for transport of proteins into plastids. Both atToc132 and atToc120 were found to form a single TOC complex together, distinct from atToc159 (Ivanova et al., 2004). atToc33 was found to co-immunoprecipitate predominantly with atToc159, whereas atToc34 forms a complex together with atToc132/atToc120 (Ivanova et al., 2004). This observation led to the notion that the core TOC complex in Arabidopsis comprises either atToc159/33/75 or atToc132/120/34/75, where specificities are reflected by their individual receptor diversities (Ivanova et al., 2004). In both complexes, only one functional Toc75 homolog (atToc75-III) was detected (Ivanova et al., 2004).

Though distinct TOC complexes have so far been reported only in Arabidopsis, the presence of multiple genes encoding distinct TOC components can also be observed in other species. Bioinformatic analysis demonstrates the diversity in TOC receptors in *Spinacia oleracea* (spinach) (Voigt et al., 2005), *Oryza sativa* (rice) (Kubis et al., 2004), *Physcomitrella patens* (Hofmann and Theg, 2004) and *Picea abies* (spruce) (Fulgosi and Soll, 2002). The existence of the structurally and functionally distinct TOC complexes might therefore function as an adaptation strategy during plastid differentiation in plants. In this context, multiple import pathways could allow efficient targeting and import of both 'housekeeping' and light-regulated or other proteins that are required for specialized metabolic functions in the dynamic life of plastids (Inaba et al., 2005).

Recent studies of the Arabidopsis chloroplast proteome revealed the existence of several "non-canonical" chloroplast proteins, suggesting that they enter chloroplasts in a TOC/TIC-independent manner via internal, non-cleavable targeting sequences (Miras et al., 2002; Nada and Soll, 2004; Zybailov et al., 2008). We re-assessed the published data and chose several proteins strongly predicted to lack a cleavable transit peptide. By analyzing the *in vitro* import behavior of this subset of putative non-canonical proteins we found that most of them do feature cleavable targeting sequences, in contrast to the *in silico* predictions. We could identify only a single protein with distinct import characteristics, which was then used as a bait to identify interacting components at the outer envelope membrane by chemical cross-linking. Thereby, we identified a new member of the Toc159 family in pea.

# **MATERIALS AND METHODS**

#### **PLANT MATERIAL**

Pea plants (*Pisum sativum* var. Arvica) were grown under a 16 h light/8 h dark regime at 21°C. For chloroplast isolation, plants were harvested after 9–11 days from the dark (in the morning before the lights came on). Arabidopsis was grown under long day conditions with 100 μE at 20°C.

#### **CONSTRUCTION OF PLASMIDS**

All genes for the import substrates were amplified from Arabidopsis cDNA with primers containing suitable restriction sites. The PCR products were purified via a NucleoSpin Extract II kit (Macherey-Nagel, Düren, Germany) and submitted to restriction digest. These were ligated into restricted pSP65 vector. For heterologous overexpression, the cDNAs encoding FSD1 and psToc120A were cloned into pET21b. For generation of deletion constructs the FSD1/pSP65 plasmid was used as template with appropriate oligonucleotides (see Supplemental Table 1).

#### **RNA EXTRACTION, RT- AND RACE-PCR**

Leaves from 5- to 7-day-old pea plants were ground in liquid nitrogen and total RNA was isolated using the RNA Easy Isolation kit (Qiagen, Hilden, Germany) according to the manufacturer's recommendation. cDNA was prepared from 1 μg DNase-treated RNA using the BD SMART™ RACE cDNA Amplification Kit (Clontech, Germany). RT-PCR and 5′-RACE PCR were performed according to the manufacturer's instructions (BD SMART™ RACE cDNA Amplification Kit, Clontech, Germany).

#### **TRANSCRIPTION AND TRANSLATION**

Transcription was performed in a final volume of 50 μl in the presence of SP6 RNA polymerase. The mixture included polymerase buffer, 100 U RNase inhibitor, 10 mM DTT, 25 μM BSA, 2.5 mM m<sup>7</sup>GpppG, 2.5 mM each ATP, CTP, UTP, and 1 μg linearized plasmid. After 15 min of incubation at 37°C, 1.2 mM GTP was added and RNA synthesis was continued for 2 h at 37°C. The synthesized mRNA was used for translation in a final volume of 100 μl. Flexi Rabbit Reticulocyte Lysate (Promega, Madison, USA) was supplemented with RNase inhibitor, amino acids without methionine, [<sup>35</sup>S]methionine/cysteine (5.4 MBq; PerkinElmer, Rodgau, Germany) and potassium acetate. The translation mixture was incubated for 1 h at 30°C and centrifuged at 50,000 × g for 20 min at 4°C. The supernatant was used for all import experiments. In some cases ATP was depleted from the translation product by using Micro Bio-Spin Chromatography Columns with Bio-Gel P-6 in Tris Buffer (Bio-Rad, Munich, Germany) according to the manufacturer's recommendations.

#### **ISOLATION OF INTACT CHLOROPLASTS AND PROTEIN IMPORT EXPERIMENTS**

Chloroplasts were isolated from 9- to 11-day-old pea plants according to Waegemann and Soll (1996). Concentration of chlorophyll was determined as described (Arnon, 1949). Intact chloroplasts were incubated for 30 min on ice in the dark to deplete ATP. A standard import reaction contained chloroplasts equivalent to 10 μg chlorophyll in 100 μl import buffer (330 mM sorbitol, 50 mM HEPES/KOH pH 7.6, 3 mM MgSO<sub>4</sub>, 10 mM methionine, 10 mM cysteine, 20 mM K-gluconate, 10 mM NaHCO<sub>3</sub>, 2% BSA (w/v), and 3 mM ATP unless indicated otherwise) and up to 10% of [<sup>35</sup>S]-labeled precursor protein. The import reaction was conducted for 15 min at 25°C unless indicated otherwise. Chloroplasts were re-isolated by centrifugation through a 40% Percoll cushion and washed twice in wash medium (330 mM sorbitol, 50 mM HEPES/KOH pH 7.6, 0.5 mM CaCl<sub>2</sub>). Imported proteins were separated by SDS-PAGE and analyzed by exposure to X-ray films (Kodak, Perkin Elmer, Rodgau, Germany). In some cases, chloroplasts were treated with thermolysin (1 mg/ml) for 20 min on ice prior to import. The protease treatment was terminated by adding 10 mM EDTA and chloroplasts were washed twice in EDTA-containing washing buffer. For import, they were resuspended in import buffer. After import, chloroplasts were treated with 100 μg/ml thermolysin for 20 min on ice, which was also quenched by adding 10 mM EDTA. For competition experiments, overexpressed proteins were added to the import mix in the indicated concentrations. The intensities of radioactive bands were analyzed with the ImageQuant programme (GE Healthcare).
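As a planning aid, the composition of a standard import reaction can be turned into pipetting volumes. The sketch below is a minimal illustration under stated assumptions: the chlorophyll concentration of the chloroplast suspension is a hypothetical input, "up to 10%" of precursor is read as a volume fraction of the 100 μl reaction, and the function name `import_reaction_volumes` is invented for this example. Only the 10 μg chlorophyll, the 100 μl volume and the 10% ceiling come from the protocol.

```python
def import_reaction_volumes(chl_stock_ug_per_ul: float,
                            chl_ug: float = 10.0,
                            total_ul: float = 100.0,
                            precursor_fraction: float = 0.10) -> dict:
    """Split a standard import reaction into pipetting volumes (in microlitres).

    chl_stock_ug_per_ul is the chlorophyll concentration of the isolated
    chloroplast suspension (hypothetical input, measured by the user).
    """
    if chl_stock_ug_per_ul <= 0:
        raise ValueError("chlorophyll stock concentration must be positive")
    if not 0 <= precursor_fraction <= 0.10:
        raise ValueError("precursor should not exceed 10% of the reaction")

    chloroplast_ul = chl_ug / chl_stock_ug_per_ul   # volume holding 10 ug chlorophyll
    precursor_ul = precursor_fraction * total_ul    # assumed to be a v/v fraction
    buffer_ul = total_ul - chloroplast_ul - precursor_ul
    if buffer_ul < 0:
        raise ValueError("chloroplast stock is too dilute for this reaction volume")

    return {
        "chloroplasts": chloroplast_ul,
        "precursor": precursor_ul,
        "import_buffer": buffer_ul,
    }


# Example with a hypothetical stock of 2 ug chlorophyll per microlitre.
volumes = import_reaction_volumes(chl_stock_ug_per_ul=2.0)
for component, microlitres in volumes.items():
    print(f"{component}: {microlitres:.1f} ul")
```

The check on the remaining buffer volume flags stocks that are too dilute to deliver 10 μg chlorophyll within 100 μl, in which case the suspension would need to be concentrated first.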

#### **PROTEIN EXPRESSION AND PURIFICATION**

All recombinant proteins were expressed in *E. coli* BL21 (DE3) pLysS or BL21 (DE3) cells. Cells were grown at 37°C in LB medium in the presence of 100 μg/ml ampicillin to an OD<sub>600</sub> of 0.6. Expression was induced by addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG), and cells were grown for either 3 h at 37°C or overnight at 12°C. All soluble proteins were purified via their C-terminal His-tag using Ni<sup>2+</sup>-NTA Sepharose (GE Healthcare, Munich, Germany) under native conditions and eluted by increasing the imidazole concentration. The proteins were concentrated and the buffer exchanged to 50 mM Tris/HCl, pH 8.0, 150 mM NaCl prior to use.

For purification of inclusion bodies (psToc120A), cells were lysed in lysis buffer (50 mM Tris/HCl pH 8.0, 150 mM NaCl, 5 mM β-mercaptoethanol) and centrifuged for 30 min at 14,000 rpm. The resulting pellet was resuspended in detergent buffer (20 mM Tris/HCl pH 7.5, 1% deoxycholic acid, 1% Nonidet P40, 200 mM NaCl, and 10 mM β-mercaptoethanol) and centrifuged for 10 min at 10,000 rpm. The pellet obtained was washed twice with Triton buffer (20 mM Tris/HCl pH 7.5, 0.5% Triton X-100, and 5 mM β-mercaptoethanol) and twice with Tris buffer (50 mM Tris/HCl pH 8.0, 10 mM DTT). The inclusion bodies were finally incubated in buffer A (8 M urea, 50 mM Tris/HCl pH 8.0, 100 mM NaCl, 2 mM β-mercaptoethanol) at RT for 1 h and centrifuged for 10 min at 10,000 rpm, and the supernatant was incubated with Ni-Sepharose Fast Flow (GE Healthcare) for 1 h at RT. The Sepharose was washed twice with washing buffer B (8 M urea, 50 mM Tris/HCl pH 8.0, 100 mM NaCl, 40 mM imidazole, 2 mM β-mercaptoethanol) and buffer C (8 M urea, 50 mM Tris/HCl pH 8.0, 1 M NaCl, 40 mM imidazole, 2 mM β-mercaptoethanol). Proteins were eluted by increasing the imidazole concentration to 400 mM. For further purification the protein was eluted from SDS-gels as described below for the generation of antiserum.

#### **GENERATION OF ANTISERUM AGAINST PsToc120A**

For immunization of rabbits, inclusion bodies containing psToc120A were purified and separated on 10% SDS-PAGE. The overexpressed protein was excised from the gel and electro-eluted. It was subsequently dialyzed against 50 mM Tris/HCl (pH 7.0), 150 mM NaCl and used to immunize a rabbit, which was sacrificed after five consecutive boosts to obtain polyclonal antiserum.

#### **ISOLATION AND TRANSIENT TRANSFORMATION OF ARABIDOPSIS PROTOPLASTS**

Mesophyll protoplasts were isolated from leaves of 3- to 4-week-old Arabidopsis plants grown on soil. Leaves were cut into small pieces and incubated in 10 ml enzyme buffer (1% Cellulase R10, 0.3% Macerozyme R10, 40 mM mannitol, 20 mM KCl, 20 mM MES pH 5.7, 10 mM CaCl<sub>2</sub>, 0.1% BSA) in the dark for 90 min at 40 rpm. Protoplasts were released by shaking for 1 min at 80 rpm, filtered through a 100 μm nylon membrane and centrifuged for 2 min at 100 × g. Protoplasts were resuspended in 500 μl MMg buffer (400 mM mannitol, 15 mM MgCl<sub>2</sub>, 4 mM MES pH 5.7) and separated on a gradient made of 9 ml MSC buffer (10 mM MES, 20 mM MgCl<sub>2</sub>, 1.2% sucrose, pH 5.8) and 2 ml MMg buffer by centrifugation for 10 min at 75 × g. Intact protoplasts were washed once in W5 buffer (150 mM NaCl, 125 mM CaCl<sub>2</sub>, 5 mM KCl, 2 mM MES pH 5.7) and resuspended in MMg buffer. A 100 μl aliquot of protoplasts (about 4 × 10<sup>4</sup> protoplasts) was mixed with 10–50 μg DNA (GFP-fusion constructs) and 110 μl PEG buffer (40% PEG 4000 [Fluka, Germany], 200 mM mannitol, 100 mM Ca(NO<sub>3</sub>)<sub>2</sub>) and incubated for 15 min in the dark. Protoplasts were diluted with 500 μl W5 buffer and collected by centrifugation for 2 min at 100 × g. Protoplasts were resuspended in 1 ml W5 buffer and incubated at 25°C overnight in the dark. GFP fluorescence was observed with a TCS-SP5 confocal laser scanning microscope (Leica, Wetzlar, Germany).

#### **CHEMICAL CROSS-LINKING**

Outer envelope membranes were prepared from pea chloroplasts according to Cline et al. (1981) and incubated with a synthetic peptide representing the N-terminal region of FSD1 (amino acids 1–25) carrying a C-terminal biotin-moiety (JPT Peptide Technologies GmbH, Berlin, Germany) in the presence of 0.1 mM ATP at 4°C for 5 min. After recovery of the membranes by centrifugation they were treated with 5 mM of N-(α-Maleimidoacetoxy) succinimide ester (AMAS, Thermo Fisher Scientific, Schwerte, Germany) for 30 min on ice to initiate cross-linking. Subsequently, the vesicles were solubilized with 1% dodecylmaltoside and incubated with a streptavidin-sepharose matrix. Bound proteins were eluted by addition of Laemmli buffer (60 mM Tris-Cl pH 6.8, 2% SDS, 10% glycerol, 5% β-mercaptoethanol, 0.01% bromophenol blue) and boiling for 3 min. The eluates were separated by 12.5% SDS-PAGE followed by immuno-detection of the biotinylated hybrid protein using the VECTASTAIN® ABC system (Biozol, Eching, Germany) or silver staining. The control contained everything except biotinylated peptides and was treated in exactly the same manner. Bands corresponding to specific cross-link products were excised from the gel and sent for analysis via LC-MS/MS. Briefly, the proteins were digested with trypsin in the gel and tryptic peptides were detected by LC-MS/MS. Protein identification was accomplished by a Mascot software assisted database search. Only hits displaying a threshold score of ≥60 were analyzed further.

#### **PEPTIDE ARRAY AFFINITY ASSAY**

Customized FSD1 peptide arrays were ordered from JPT Peptide Technologies. Peptides were synthesized at 5 nmol/spot with acetylated N-termini and covalently bound via their C-termini to the cellulose membrane through a polyethylene glycol linker. The recombinant A-domain of psToc120 (psToc120A) was analyzed in the affinity arrays. The peptide array was blocked with 0.3% skim milk in 1X TBS buffer for 1 h and subsequently incubated with 5 μg/ml psToc120A (in 50 mM Tris/HCl (pH 7.0), 150 mM NaCl) overnight at 4°C. The binding of the protein to the peptides was detected after 3 h incubation with rabbit anti-psToc120A (1:500, 0.3% skim milk in 1X TBS) primary antibody and 1 h incubation with HRP-conjugated anti-rabbit (1:20,000, 0.3% skim milk in 1X TBS) secondary antibody. The membrane was washed three times for 10 min with 0.3% skim milk in 1X TBS after primary and secondary incubation. A negative control was performed by excluding psToc120A from the incubation protocol. The detection was performed with ECL Plus detection reagents. The intensities of the spots were analyzed with ImageQuant TL 8.1 software (GE Healthcare, Munich).

# **RESULTS**

#### **SUB-CELLULAR ANALYSIS OF PUTATIVE "NON-CANONICAL" CHLOROPLAST PROTEINS**

As a first approach to characterize the molecular identity of the TOC/TIC-independent transport pathway, a subset of nine tentative non-canonical chloroplast proteins was selected as baits to "fish" for potential components of the TOC/TIC-independent transport machinery. The main selection criterion was a robust prediction that the protein lacks a chloroplast transit peptide (cTP). To maximize the accuracy of the cTP prediction, nine different prediction algorithms were employed. Only proteins that were predicted to lack a chloroplast transit peptide by at least six of the nine prediction algorithms were selected (see Supplemental Table 2).
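The selection rule amounts to a simple vote count across predictors. The sketch below is illustrative only: the protein names and the individual predictor calls are hypothetical placeholders, not the data of Supplemental Table 2, and the function name `select_candidates` is invented for this example. Only the threshold (at least six of nine algorithms predicting no cTP) is taken from the text.

```python
from typing import Dict, List

N_PREDICTORS = 9        # number of algorithms stated in the text
MIN_NO_CTP_VOTES = 6    # minimum number of "no cTP" calls stated in the text


def select_candidates(calls: Dict[str, List[bool]]) -> List[str]:
    """Return proteins that enough predictors classify as lacking a cTP.

    `calls` maps a protein name to one boolean per predictor, where
    True means "a chloroplast transit peptide is predicted".
    """
    selected = []
    for protein, has_ctp in calls.items():
        if len(has_ctp) != N_PREDICTORS:
            raise ValueError(f"{protein}: expected {N_PREDICTORS} predictor calls")
        no_ctp_votes = sum(1 for call in has_ctp if not call)
        if no_ctp_votes >= MIN_NO_CTP_VOTES:
            selected.append(protein)
    return selected


# Hypothetical predictor calls, for illustration only.
example_calls = {
    "protein_A": [False] * 7 + [True] * 2,  # 7 of 9 predict no cTP: selected
    "protein_B": [False] * 5 + [True] * 4,  # 5 of 9 predict no cTP: rejected
    "protein_C": [False] * 6 + [True] * 3,  # exactly at the threshold: selected
}
print(select_candidates(example_calls))
```

A majority-style threshold like this tolerates disagreement among a few predictors, but it cannot correct an error that most algorithms share.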

Though all the proteins in the test set have a reported chloroplast localization as well as a computationally predicted lack of transit peptide in previous proteomic studies (Ferro et al., 2003; Kleffmann et al., 2004; Zybailov et al., 2008), the chloroplast localization and processing of these proteins were re-assessed by *in vitro* import assays (**Figure 1**). In the *in vitro* import assays, detection of the protein prior to thermolysin treatment indicates that the protein has attached to the organelle (Thl −); if the signal persists after thermolysin treatment, it can be inferred that import has occurred (Thl +). Moreover, the presence of an additional smaller band is characteristic of post-import cleavage of the cTP. As controls for the TOC/TIC-dependent and TOC/TIC-independent pathways, respectively, the precursor of the small subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (pSSU) and the inner chloroplast envelope quinone oxidoreductase homolog (AtQORH) were used. **Figure 1** depicts the results of import assays for four chosen Arabidopsis proteins which were successfully translocated into pea chloroplasts and were protease resistant after import. Three proteins were found to be processed to a smaller mature protein that was protease resistant (RAP38, PTAC3, and TSP9). Thus, in deviation from the computational prognosis, these proteins do feature a cleavable transit peptide, indicating that the applied algorithms still have to be optimized. The exception was FSD1 (iron dependent superoxide dismutase, At4g25100); similar to the control protein AtQORH, FSD1 did not undergo proteolytic maturation during import. Since the band visible after FSD1 import ran a little lower in the gel compared to the translation product, we checked the proteomic data (Ferro et al., 2010) for N-terminal peptides. As shown in Supplemental Figure 1, the extreme N-terminus missing only the start methionine was found in the proteomic approach. These observations strongly indicated that FSD1 had been imported without detectable proteolytic cleavage, a result that is consistent with the proteomic studies and the *in silico* analyses. To exclude that FSD1 is intrinsically protease resistant and that the band persisting after thermolysin treatment merely represents surface-bound protein, we treated translation products of FSD1 as well as pSSU and AtQORH with thermolysin (Supplemental Figure S2). All proteins proved to be susceptible to thermolysin. Since FSD1 was the only protein in the set that was not processed after import, which indicates the use of a potential alternative translocation mechanism, we further analyzed its import characteristics and used it as a bait to detect potential novel interaction partners at the outer envelope.

#### **IMPORT CHARACTERISTICS OF FSD1 INDICATE THAT IT USES SOME GENERAL COMPONENTS BUT SHOWS DISTINCT PROPERTIES**

Successive translocation of precursor proteins across the chloroplast membranes is an energy-dependent process involving the hydrolysis of nucleoside triphosphates at the outer envelope, in the intermembrane space and in the stroma (Pain and Blobel, 1987; Kouranov and Schnell, 1997; Young et al., 1999). Generally, a concentration of nucleoside triphosphates above 100 μM is required for efficient translocation of a standard precursor protein across the envelopes and into the stroma (Theg et al., 1989). To screen for the energetic requirement of FSD1 translocation across the chloroplast envelope membranes, the endogenous nucleoside triphosphates were first depleted from the 35S-radiolabeled FSD1 translation products via gel filtration prior to the *in vitro* import assay. Likewise, to minimize the production of endogenous nucleoside triphosphates, chloroplasts were pre-incubated in the dark for 30 min on ice and the subsequent import reactions were carried out in the dark (Flügge and Hinz, 1986; Chen and Li, 1998). Consequently, only the influence of the externally added nucleoside triphosphates on the import characteristics was investigated. As a control for the energy state of stromal-localized proteins during the import reaction, the import of 35S-radiolabeled pSSU was also monitored.

As depicted in **Figure 2**, import of FSD1 is considerably diminished in the absence of ATP (lane 2), whereas the addition of exogenous ATP clearly resulted in a yield increase (lanes 3–6). Maximal import is reached at 1 mM ATP (100%) whereas no addition gives a yield of ∼30% ± 5%. A similar import behavior was observed for pSSU (initial import ∼16% ± 2.5%). In both cases a very slight yield decrease in the presence of 3 mM ATP compared to 1 mM ATP can be observed (lanes 5/6), but since this was not a consistent effect in all experiments it seems not to be specific (data not shown). The import of the inner envelope protein AtQORH, on the other hand, was not influenced by the absence of nucleoside triphosphates. Without external ATP the import yield is already higher than 70 ± 13%. Thus, it can be concluded that the import of FSD1 is energy dependent comparable to a general import pathway substrate.
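The yields quoted above are relative values, with the condition giving maximal import set to 100%. A minimal sketch of that normalization follows; the band intensities and the exact set of ATP concentrations are hypothetical numbers chosen for illustration and are not the measured data.

```python
from typing import Dict


def relative_import_yield(intensities: Dict[str, float], reference: str) -> Dict[str, float]:
    """Express each band intensity as a percentage of the reference condition."""
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference intensity must be positive")
    return {condition: 100.0 * value / ref for condition, value in intensities.items()}


# Hypothetical quantified band intensities (arbitrary units), one per lane.
fsd1_bands = {
    "no ATP": 300.0,
    "0.1 mM ATP": 550.0,
    "1 mM ATP": 1000.0,
    "3 mM ATP": 950.0,
}

yields = relative_import_yield(fsd1_bands, reference="1 mM ATP")
for condition, percent in yields.items():
    print(f"{condition}: {percent:.0f}%")
```

The reported error margins would then come from repeating this calculation for each independent experiment and summarizing the spread across replicates.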

Import of canonical precursor proteins into chloroplasts requires protease-sensitive components at the outer envelope membrane (Perry and Keegstra, 1994). To investigate whether this is the case for FSD1, import experiments were performed with purified pea chloroplasts after thermolysin pre-treatment (**Figure 3**). Import reactions of precursor proteins that require intact proteinaceous components should be inhibited after protease treatment. The thermolysin pre-treatment was assessed by immunoblotting, which showed that surface-exposed domains of the receptor proteins Toc159 and Toc34 were proteolytically removed whereas the inner envelope protein Tic110 remained intact (**Figure 3A**). Chloroplasts from the same batch as applied for the immunoblot analysis were used for the import assays. FSD1 and the control proteins AtQORH and pSSU were incubated with chloroplasts corresponding to 15 μg chlorophyll at 25°C for 10–12 min for FSD1 and AtQORH, and 5 min for pSSU (**Figure 3B**). Protease pre-treatment resulted in a noticeable decrease in binding and import of pSSU. Intriguingly, the translocation of AtQORH was also diminished, though it had been shown that it does not require the main TOC receptor proteins (Miras et al., 2002). It seems that although AtQORH is still imported into protease-treated chloroplasts, a protease-sensitive component enhances import efficiency. FSD1 import was completely abolished, demonstrating that FSD1 depends on protease-sensitive receptors on the chloroplast surface for the initial recognition step and its subsequent translocation into chloroplasts.

To explore if FSD1 makes use of some parts of the general import machinery and/or shares components with AtQORH on its way into the chloroplast, competition experiments with heterologously expressed pSSU and FSD1 were conducted (**Figure 4**). In a first set of experiments, *in vitro* import experiments were carried out in the presence or absence of different amounts of purified pSSU (**Figure 4A**). As internal controls for the general and the TOC/TIC-independent pathways, import of pSSU and AtQORH, respectively, was tested in the presence or absence of the unlabeled competitor. The data presented in **Figure 4A** clearly illustrate that the amount of imported 35S-labeled pSSU decreased with increasing concentrations of unlabeled pSSU in a concentration-dependent manner by 60–90% ± 6%. The import of 35S-radiolabeled AtQORH was only slightly affected in the presence of excess pSSU (5 and 10 μM of competitor led to less than 30 ± 3% inhibition; lanes 5, 6), when translocation of pSSU itself was already completely abolished (**Figure 4A**). These observations suggested that AtQORH did not engage the trimeric TOC159/75/34 complex for translocation into the chloroplasts, a result that is consistent with previous work (Miras et al., 2002, 2007). FSD1, on the other hand, shows a clear sensitivity to the presence of pSSU. At 1 μM competitor, import of FSD1 is notably diminished (∼40 ± 3% inhibition), but in contrast to pSSU the translocation of FSD1 is never completely blocked, even at the highest competitor concentration, as judged by the appearance of the mature FSD1 after protease treatment (∼50 ± 3%) (**Figure 4A**). Thus, some translocation components used by FSD1 seem to be blocked by the excess of pSSU, but import could still occur. This could be due to blockage of a single component that both pSSU and FSD1 engage upon import, e.g., Toc75, or of more than one part of the translocon(s) which might be used by both preproteins.

Evidence for the operation of distinct import pathways was previously reported for the import of another inner envelope protein lacking a cleavable transit peptide, psTic32 (Nada and Soll, 2004). Competition experiments were used to address the question of whether FSD1, AtQORH and psTic32 share components of the same import pathway (**Figure 4B**). For this purpose, heterologously expressed purified FSD1 was added to import reactions containing radiolabeled AtQORH or psTic32. As depicted in **Figure 4B**, strong competition only occurred in the case of FSD1 itself (∼15 ± 3% residual import). From the fact that the import of AtQORH (∼60 ± 20%) and psTic32 (75 ± 10%) still occurred in the presence of excess FSD1, it can be concluded that FSD1 translocation does not engage components involved in either psTic32 or AtQORH translocation. The classical precursor protein pSSU shows a slight import inhibition in the presence of recombinant FSD1, similar to that observed in the reciprocal experiment: addition of 1 μM FSD1 results in an inhibitory effect (80 ± 19% residual import) that is not intensified by higher amounts of competitor. This again implies that pSSU and FSD1 share at least one translocation component on their way into chloroplasts.

#### **THE N-TERMINUS OF FSD1 SPECIFICALLY INTERACTS WITH LARGE OUTER ENVELOPE PROTEINS**

For most internal chloroplast proteins the N-terminal sequence contains essential targeting information: classical precursor proteins comprise cleavable N-terminal transit peptides, and Tic32 also has its non-cleavable targeting signal in the N-terminus. The only known exception is AtQORH, which is guided by internal sequence information (Miras et al., 2002). Since FSD1 clearly revealed some import characteristics similar to canonical precursor proteins, we decided to address the targeting properties of the extreme N-terminus of FSD1. To this end, we designed a synthetic peptide consisting of amino acids 1–25 including a C-terminal biotin moiety and applied this to identify possible binding partners of FSD1 at the outer envelope. The peptide was used as bait in a chemical cross-linking approach utilizing right-side-out outer envelope vesicles (Waegemann et al., 1992) at the energy-independent 'binding' stage, reflecting the conditions upon preprotein recognition at the chloroplast surface. The N-terminal FSD1-biotin peptide was used in the presence of 0.1 mM ATP and incubated at 4°C for 5 min. Afterwards, membranes were treated with 5 mM N-(α-Maleimidoacetoxy) succinimide ester (AMAS) for 30 min on ice and then solubilized with 1% dodecylmaltoside. The suspension was incubated with a streptavidin matrix and bound protein complexes were eluted by adding Laemmli buffer. The eluates were separated by 12.5% SDS-PAGE followed by immuno-detection with an α-biotin antibody or silver staining. As depicted in **Figure 5**, several cross-linked products were observed in the silver-stained gel, whereas one prominent band was recognized by the biotin antibody. No cross-reactivity was observed with an empty streptavidin-sepharose matrix as shown in the control reactions (**Figure 5**, control). The indicated band (**Figure 5**, asterisk) was excised from the gel and used for analysis via automated nano-spray LC-MS/MS.

The identified peptides matched 100% with peptides deduced from partial cDNA sequences in the pea database (Franssen et al., 2011) (see **Table 1**), annotated as psToc120 (Ps200709\_Contig058064, Ps200709\_Contig051235, Ps200709\_Contig017379, Ps200709\_Contig017379 and Ps\_contig\_mira-and-tgicl-ass\_9502, Ps\_contig\_mira-and-tgicl-ass\_37108, Ps\_singlet\_mira-and-tgicl-ass\_6441, Ps\_singlet\_mira-and-tgicl-ass\_7910), whereas psToc132 corresponds to Ps200709\_Contig020540, Ps200709\_Contig020541, Ps200709\_Contig051331, Ps200709\_Contig060418, Ps200709\_Contig060420, Ps200709\_Contig062910 and Ps\_singlet\_mira-and-tgicl-ass\_6440, Ps\_singlet\_mira-and-tgicl-ass\_7911.

Though it has been shown in Arabidopsis that different paralogs of Toc159 assemble into distinct TOC complexes that accept different precursor proteins, no biochemical evidence has been brought forward thus far demonstrating the existence of such a gene family in pea. Therefore, the identification of peptides representing psToc132 and psToc120 from the MS data raised the possibility that a similar multigene family could also be present in pea.

**proteins.** A synthetic peptide representing the first 25 AA of FSD1 including a biotin moiety was incubated with outer envelope vesicles, chemically cross-linked and subjected to SDS-PAGE. The samples were analyzed by silver staining and western blot. A band not present in the control was analyzed by mass spectrometry (indicated by an asterisk).

Since the identified peptides were located in the middle of the putative psToc120/psToc132 homologs, we performed 5′-RACE PCRs with degenerate oligonucleotides matching the peptides to isolate N-terminally complete cDNAs from pea. The 5′-RACE PCR amplification for psToc120 yielded a product of 1465 bp, whereas the reaction for psToc132 resulted in a product of 1437 bp. The first cDNA clone contained an open reading frame of 1157 bp which encodes 391 amino acids, the putative A-domain of psToc120, with a calculated molecular mass of 43.4 kDa (Supplemental Figures 3, 4). Sequence comparison of the deduced amino acid sequence showed 35.5% sequence similarity to atToc120. The nucleotide sequence of the second cDNA clone showed 38.6% sequence similarity to the A-domain of atToc132. This clone, however, lacked a true start methionine and was therefore most likely N-terminally incomplete. In addition, the abundance of the acidic amino acids glutamic acid and aspartic acid as well as the hydroxyl-containing serine and threonine, which has been proposed as a characteristic feature of the A-domain in Arabidopsis Toc159 homologs (Agne et al., 2010; Chang et al., 2012), was also found in both putative A-domain sequences of psToc132 and psToc120. Therefore, both sequences represent genuine A-domains of psToc132 and psToc120 in pea, respectively.
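The compositional hallmark mentioned above, an abundance of acidic (Asp, Glu) and hydroxylated (Ser, Thr) residues, can be checked for any candidate A-domain by counting residues. The sketch below illustrates the calculation only; the toy sequence is made up and is not the psToc120 or psToc132 sequence, and no threshold for "abundant" is implied by the source.

```python
def residue_fraction(sequence: str, residues: str) -> float:
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(residue) for residue in residues) / len(seq)


# Made-up 20-residue toy sequence, for illustration only.
toy_sequence = "MDEESSTTEDAKLGDESTEE"

acidic = residue_fraction(toy_sequence, "DE")        # Asp + Glu
hydroxylated = residue_fraction(toy_sequence, "ST")  # Ser + Thr

print(f"acidic (D/E): {acidic:.2f}")
print(f"hydroxylated (S/T): {hydroxylated:.2f}")
```

Comparing such fractions between the pea sequences and the Arabidopsis A-domains would be one way to make the qualitative statement quantitative.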

#### **FSD1 IMPORT IS INHIBITED BY THE PRESENCE OF psToc120 A-DOMAIN**

The coding region for amino acids 1–391 of psToc120, representing the A-domain, was cloned into pET21d with a C-terminal His-tag and heterologously expressed in *E. coli* BL21 (DE3) pLysS. The same approach was initiated for the psToc132 A-domain but remained unsuccessful since no heterologous expression could be achieved under various conditions. Thus, we focused on the characterization of psToc120. The purified protein (psToc120A) was used to generate a specific antiserum and as a competitor in import assays. To the latter end, increasing amounts of recombinant psToc120A were added to *in vitro* import assays containing radiolabeled FSD1, AtQORH or pSSU, respectively. A concentration-dependent inhibition of FSD1 translocation was observed in the presence of psToc120A (**Figure 6**, lanes 3–5): at 1 μM psToc120A FSD1 import decreased by ∼10 ± 2%, at 2 μM psToc120A by ∼45 ± 2% and at 5 μM psToc120A inhibition was already ∼70 ± 2%. This most likely indicates that the A-domain of psToc120 binds to FSD1 and thus blocks binding to the intrinsic receptor at the chloroplast surface. A second possibility could be that the heterologously expressed A-domain associates, e.g., with the second receptor Toc34, thus preventing effective translocation of the precursor protein into the chloroplast. In contrast, pSSU import remained unchanged. Intriguingly, a similar inhibitory effect as for FSD1 was observed for AtQORH: at 1 μM psToc120A inhibition was ∼20 ± 8%, at 3 μM and 5 μM psToc120A ∼50 ± 8%. These data suggest that psToc120 might act as a common receptor for both FSD1 and AtQORH, though AtQORH is still imported to a certain extent after protease pre-treatment in contrast to FSD1 (**Figure 3B**). This could be due to AtQORH having further protease-resistant receptors at the chloroplast surface or a much higher affinity to its import channel, so that it could bypass the receptor.

#### **THE EXTREME N-TERMINUS OF FSD1 IS NECESSARY BUT NOT SUFFICIENT FOR IMPORT, WHEREAS THE EXTREME C-TERMINUS IS DISPENSABLE**

Since the N-terminal 25 amino acids of FSD1 were found in a complex with a TOC receptor protein and primary amino acid sequence comparison between FSD1 and its prokaryotic homologs revealed that the N-terminal region (eight amino acids in atFSD1, 40 amino acids in osFSD1 and 32 amino acids in the algal FSD1) is found exclusively in plant FSD1 proteins (Supplemental Figure 5), we strongly suspected that the N-terminus of the plant FSD1 proteins could be required for specific chloroplast targeting. For further examination of the domains or regions mandatory for plastid localization of the FSD1 protein, several truncated versions of the FSD1 protein in which 10–30 amino acids were progressively removed from the N-terminus were generated. Full-length FSD1 and the deletion mutants were subsequently synthesized *in vitro* and used for import studies. The shortened proteins showed a remarkable reduction in the extent of chloroplast binding. With the exception of the full-length FSD1, all truncated proteins remained protease accessible, demonstrating that no productive translocation into chloroplasts had occurred. Deletion of the first 10 amino acids already abolished import (**Figure 7**, FSD1-N10), indicating that the N-terminus of FSD1 constitutes an indispensable part in signal recognition and targeting to the chloroplast. The same observation was made for FSD1-N20 and FSD1-N30 (**Figure 7**). To investigate whether the first 10 amino acids at the N-terminus of FSD1 are not only necessary but also sufficient to drive translocation of FSD1 to plastids *in vivo*, the localization of partial or full-length FSD1-GFP fusion constructs was monitored via transient transformation of Arabidopsis mesophyll protoplasts. Intriguingly, while the 10 amino acids at the N-terminus of FSD1 are essential for plastid recognition, they were not sufficient to direct the GFP construct into plastids (**Figure 8**). For all chimeric constructs except the full-length FSD1-GFP, a cytosolic localization of the fluorescent signal was observed. This demonstrated that the N-terminal part of FSD1 (first 30 residues) is not sufficient for plastid localization of a reporter protein. To establish if the C-terminus also contains important targeting information, we generated C-terminal deletion constructs in which 10, 20, or 30 amino acids were missing, FSD1-C10, FSD1-C20, and FSD1-C30, respectively (**Figure 9**). None of these proteins were affected in translocation efficiency, indicating that the extreme C-terminus of FSD1 is not necessary for targeting or import into chloroplasts. Please note that upon import of the C-terminal deletion constructs two protease-resistant bands appeared (indicated by asterisks), which could be due to unspecific proteolytic cleavage of the non-native substrate.

**FIGURE 7 | The extreme N-terminus of FSD1 is indispensable for import.** *In vitro* import assays into pea chloroplasts are shown. Lane 1 represents 10% of translation product, Thl (+) indicates thermolysin digestion after the import reaction (lane 3). Please note the second band in the translation product of FSD1DN (indicated by an asterisk) which might derive from a downstream methionine recognized in the reticulocyte lysate as an alternative start. Representative results from *n* = 3 independent experiments are depicted.

full-length FSD1 or the indicated number of N-terminal amino acids.

both (merge). The used constructs are depicted on the left. *AA*, amino acids; bar, 5 μm.

#### **psToc120A HAS SPECIFIC RECOGNITION MOTIFS IN FSD1**

The results indicated that the N-terminal region of FSD1 is indispensable for targeting, but not sufficient to drive translocation of the fusion protein into chloroplasts (**Figures 7**, **8**), whereas the extreme C-terminus is not necessary for import (**Figure 9**). This implies that other regions of FSD1 contain additional sequence information that is vital for receptor recognition and/or for the process of protein import. In order to identify the regions in FSD1 that interact with psToc120, a PepSpot assay was designed. The FSD1 peptides spotted on the membrane were 15 amino acids in length, with each subsequent peptide on the array overlapping by 12 amino acids with the previous one. In total, 67 peptides covered the full length sequence of FSD1. This peptide array was incubated with recombinant psToc120A and the binding specificity of the receptor was detected with specific primary and HRP-conjugated secondary antibodies. As revealed in the peptide scan analysis, psToc120 demonstrated a high relative binding specificity across the array (**Figure 10**). A negative control that was performed in the absence of recombinant psToc120A showed no unspecific binding of the antibodies to the peptides. The psToc120 receptor interacted with several consecutive stretches of peptides that are distributed across the entire sequence of FSD1 (**Figure 10B**, motifs one to six). The amino acid sequences of the psToc120-binding peptides and the minimal binding motif present in each of the peptides are depicted in **Figure 10B**. Notably, the identified binding motif of 10-VTANYVLKPPPFALDALE-24 matches the sequence of the N-terminal FSD1-biotin hybrid protein used in the cross-linking assay (see above). 
This short peptide sequence is most probably indispensable in the initial targeting and recognition of FSD1 by the psToc120 receptor at the chloroplast surface while binding motifs at the C-terminus might be involved in conferring specificity to FSD1 targeting and its subsequent translocation into the chloroplast.
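The array layout described above (15-mer peptides, each overlapping its predecessor by 12 residues, i.e., a step of 3) can be sketched as a sliding window. The placeholder sequence below is not FSD1; its length of 213 residues is an assumption chosen only because 15 + 66 × 3 = 213 reproduces the 67 peptides stated in the text. The actual length of FSD1 and the handling of the final peptide on the commercial array are not given in the source.

```python
from typing import List, Tuple


def tile_peptides(sequence: str, length: int = 15, overlap: int = 12) -> List[Tuple[int, str]]:
    """Return (1-based start position, peptide) pairs for an overlapping peptide scan."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    last_start = len(sequence) - length
    return [
        (start + 1, sequence[start:start + length])
        for start in range(0, last_start + 1, step)
    ]


# Placeholder sequence of assumed length 213 (NOT the FSD1 sequence).
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 11)[:213]

peptides = tile_peptides(placeholder)
print(len(peptides))                  # number of spots on the array
print(peptides[0], peptides[-1])      # first and last peptide with start positions
```

Note that this simple scheme leaves trailing residues uncovered whenever the sequence length minus 15 is not a multiple of 3; a real array design would typically add a final peptide anchored at the C-terminus.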

To further confirm the results from the PepSpot assay, we generated several truncations of FSD1 for *in vitro* import experiments. These lacked either only recognition motif six (amino acids 1–178 were still present) or motifs five and six (amino acids 1–131 present), respectively (**Figure 11A**). Import assays of these proteins revealed that deletion of motif six does not diminish import, which is well in line with the results from the C-terminal deletions displayed in **Figure 9**. In contrast, after import of an FSD1 construct lacking motif five only weak protease-resistant bands could be detected (**Figure 11B**), demonstrating an obvious inhibition of translocation efficiency. This corresponds well to the intensity observed in the PepSpot assay, where binding was strongest to the region defined as motif five. These observations, together with the previous data from the *in vitro* interaction studies with the recombinant psToc120A receptor, strongly point to a *bona fide* interaction between psToc120 and FSD1 in specific regions of the substrate protein.

# **DISCUSSION**

The general import pathway is the route taken by the majority of chloroplast proteins. In recent years it has turned out, however, that several chloroplast residents travel a different path to reach their destination. Not much is known about these alternative import routes, and only very few substrates have been analyzed so far. To identify substrates for a hitherto elusive alternative import pathway, we made use of published proteomic studies (Zybailov et al., 2008) from which we chose several proteins with confirmed stromal localization and a strong prediction against the presence of a classical cleavable transit peptide. Nine of these proteins were analyzed with regard to their *in vitro* import behavior and four were found to be amenable to this assay (**Figure 1**, Supplemental Table 1). To our great surprise, three of the four proteins proved to be processed upon translocation into chloroplasts, which was in stark contrast to the computational predictions. Though we did not investigate the import characteristics of these precursor proteins any further, it seems quite likely that they employ the general import pathway. This clearly shows that the algorithms applied for transit peptide prediction are not yet completely reliable and need to be optimized, which will be a challenging task for the future.

We could, however, identify one protein which was not cleaved upon import and thus constituted a promising substrate for our endeavor. We proceeded to appraise the import characteristics of FSD1 with regard to targeting information contained within the mature sequence, energy dependence and the engagement of known translocation components. It turned out that FSD1 shares some features with substrates of the general import pathway, such as the necessity of the extreme N-terminus for targeting and import (**Figures 7**, **8**), the need for protease-sensitive receptors at the plastid surface (**Figure 3**) as well as the ATP-dependence of translocation (**Figure 2**). Though the exact minimal ATP concentration necessary for translocation varies between the different preproteins, import of both pSSU and FSD1 is clearly stimulated at concentrations higher than 0.1 mM ATP. Furthermore, import of FSD1 greatly decreased in the presence of recombinantly expressed pSSU, which travels via TOC/TIC into chloroplasts (**Figure 4A**). All this implies that FSD1 might engage at least one component of the general import pathway. When we tested for constituents shared with Tic32, another non-canonical chloroplast protein whose import mechanism is still elusive, we found that these two proteins evidently travel independent routes, since an excess of FSD1 did not hamper Tic32 translocation (**Figure 4B**). The control protein AtQORH, which served as a representative of a partially characterized non-canonical import pathway, behaved differently from FSD1 in almost all assays. This suggests that with FSD1 we discovered a third class of substrate protein, which uses Toc120 and most likely other Toc components such as Toc75 (in fact, we also identified peptides from Toc75 in the excised band from the cross-linking approach depicted in **Figure 5**), but represents the only preprotein without a cleavable transit peptide identified to date.

To pinpoint specific components of the translocon responsible for the passage of FSD1 into chloroplasts, we applied a chemical cross-linking approach using a synthetic peptide which represented the first 25 amino acids of FSD1 and carried a C-terminal biotin moiety. This enabled us to detect cross-linked products by using the biotin-streptavidin system. Since the peptide was too short to be fully translocated, we opted to focus on the binding events at the outer envelope membrane. To this end, the peptide was bound to isolated outer envelope vesicles, cross-linking was initiated and the complexes containing the bound peptide were finally pulled out by incubation with a streptavidin matrix (**Figure 5**). A band that appeared exclusively in the cross-linked sample was excised from the gel and sent for MS analysis. Thorough scrutiny of the resulting peptide masses revealed that the FSD1 peptide bound to two proteins with similarities to the A-domains of atToc132 and atToc120, respectively. These proteins belong to the family of atToc159 receptor GTPases and have been shown to build complexes different from atToc159/atToc33/atToc75. Originally, it was speculated that atToc159 is responsible for binding only precursor proteins involved in photosynthetic processes, whereas atToc132/Toc120 are more involved in accepting proteins fulfilling house-keeping functions (Kubis et al., 2004). Upon sequence alignment of the respective A-domains, it turned out that the highest sequence variability between Toc159 paralogs lies within these acidic regions, whereas the G- and M-domains are quite conserved (Ivanova et al., 2004). From swapping A-domains between the different atToc159 isoforms *in planta* it was concluded that receptor specificity is achieved by the respective A-domains (Inoue et al., 2010). This hypothesis was, however, questioned by a recent proteomic study which analyzed the proteome of the *ppi2* mutant plants lacking atToc159 (Bischof et al., 2011). Many proteins involved in photosynthesis have been found to be present in the mutant plastids, clearly implying that import of these preproteins does not exclusively rely on atToc159. At least in the mutant plants the other paralogs can compensate for the loss of atToc159. How the distinct receptor components distinguish their substrates *in vivo* remains to be elucidated.

FSD1 clearly represents a protein with a photosynthesis-related function, since the scavenging of reactive oxygen species is highly relevant during active photosynthesis. Thus, one might have expected to find it prominently bound to psToc159. The fact that we found it associated with the newly identified ortholog of atToc120, psToc120, implies that the import pathway which is engaged by a protein might depend on intrinsic sequence properties rather than on its final function within plastids. Concerning the composition of the translocon responsible for FSD1 import other than Toc120, we can only speculate. The fact that we also found peptides from Toc75 argues for FSD1 using the Toc75 import channel. In Arabidopsis, Toc120 and Toc132 associate with Toc75 and Toc33/34; this results in the existence of several distinct complexes with the one common element being the channel Toc75. Thus, we hypothesize that FSD1 is specifically recognized by Toc120 (and maybe Toc132) and then engages Toc75. This is illustrated in the model in **Figure 12**. At this point, however, it is just a hypothesis which awaits confirmation. Another scenario that could be envisioned is that distinct Toc complexes exist in pea, as has been shown in Arabidopsis, that consist of different combinations of psToc159, −132, and −120 with Toc34. These distinct Toc complexes with different Toc GTPase receptors could have different, but overlapping, substrate specificities, accounting for the partial competition of FSD1 for pSSU import. This would be in line with the hypothesized situation in other systems that have already been shown to have multiple Toc159 isoforms.

When we aimed to determine the regions in FSD1 responsible for binding to psToc120 by peptide array analysis, we found that indeed specific areas of the protein bind more strongly to the receptor than others (**Figure 10**). The reliability of the array was demonstrated by the fact that we detected the N-terminal peptide used for cross-linking among the most strongly bound regions. From that array, we defined six regions within FSD1 which seemed important for binding to the A-domain. They are evenly distributed across the protein including the N- and C-termini. To confirm these data we generated C-terminal truncations to be applied in import assays, which in fact corroborated the regions essential for binding and import of FSD1 (**Figure 11**). While the extreme C-terminus itself is not important (**Figure 9**), the C-proximal region five, which is most strongly labeled in the peptide array, proved to be indispensable. This is in line with data reported by Lee et al. (2006, 2009) that even in pSSU, the classical canonical precursor protein, sequence information within the mature part of the protein is required for efficient translocation.

# **AUTHOR CONTRIBUTIONS**

WaiLing Chang conducted experiments and was involved in drafting of the manuscript. Jürgen Soll and Bettina Bölter designed the work and wrote the manuscript. All authors approved the final version.

### **ACKNOWLEDGMENTS**

We acknowledge funding from the SFB 594. We thank Prof. Axel Imhof for performing the MS analyses.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00239/abstract

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 January 2014; accepted: 12 May 2014; published online: 28 May 2014. Citation: Chang W, Soll J and Bölter B (2014) A new member of the psToc159 family contributes to distinct protein targeting pathways in pea chloroplasts. Front. Plant Sci. 5:239. doi: 10.3389/fpls.2014.00239*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Chang, Soll and Bölter. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The RNA-binding protein RNP29 is an unusual Toc159 transport substrate

#### *Julia Grimmer<sup>1</sup>, Anja Rödiger<sup>1</sup>, Wolfgang Hoehenwarter<sup>2</sup>, Stefan Helm<sup>1</sup> and Sacha Baginsky<sup>1</sup>\**

*<sup>1</sup> Plant Biochemistry, Institute of Biochemistry and Biotechnology, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany <sup>2</sup> Proteomeanalytik, Leibniz Institute of Plant Biochemistry, Halle (Saale), Germany*

#### *Edited by:*

*Kentaro Inoue, University of California, Davis, USA*

#### *Reviewed by:*

*Matthew D. Smith, Wilfrid Laurier University, Canada Lan-Xin Shi, University of California, Davis, USA*

#### *\*Correspondence:*

*Sacha Baginsky, Plant Biochemistry, Institute of Biochemistry and Biotechnology, Martin-Luther-University Halle-Wittenberg, Weinbergweg 22, 06120 Halle (Saale), Germany e-mail: sacha.baginsky@biochemtech.uni-halle.de*

The precursors of RNP29 and Ferredoxin (Fd2) were previously identified in the cytosol of *ppi2* plant cells with their N-terminal amino acid acetylated. Here, we explore whether precursor accumulation in *ppi2* is characteristic for Toc159 client proteins, by characterizing the import properties of the RNP29 precursor in comparison to Fd2 and other Toc159-dependent or independent substrates. We find specific accumulation of the RNP29 precursor in *ppi2* but not in wild type or *ppi1* protoplasts. With the exception of Lhcb4, precursor accumulation is also detected with all other tested constructs in *ppi2*. However, RNP29 is clearly different from the other proteins because only precursor but almost no mature protein is detectable in protoplast extracts. Co-transformation of RNP29 with Toc159 complements its plastid import, supporting the hypothesis that RNP29 is a Toc159-dependent substrate. Exchange of the second amino acid in the RNP29 transit peptide to Glu or Asn prevents methionine excision but not N-terminal acetylation, suggesting that different N-acetyltransferases may act on chloroplast precursor proteins *in vivo*. All different RNP29 constructs are efficiently imported into wild type but not into *ppi2* plastids, arguing for a minor impact of the N-terminal amino acid on the import process.

**Keywords: plastid protein import, protoplast, RNP29, precursor accumulation, mass spectrometry, import specificity**

# **INTRODUCTION**

Most chloroplast-localized proteins are encoded in the nuclear genome and synthesized at cytosolic ribosomes as precursor proteins with N-terminal transit peptides. After import, transit peptides are cleaved by a stromal processing peptidase (SPP) and imported proteins are processed to their mature form (Richter and Lamppa, 1999). Based on training sets with known and established chloroplast proteins, software tools were developed that predict the subcellular localization of individual proteins. Although prediction is error-prone, overall prediction performance for *Arabidopsis thaliana* chloroplast proteins is relatively good, with a true positive prediction rate in the range of 75–85% (van Wijk and Baginsky, 2011). However, sequence features that mediate chloroplast protein import specificity are currently not known (Agne and Kessler, 2009). Recognition and selection of chloroplast-imported proteins are mediated by GTP-binding proteins that belong to two small families: Toc34/33 and Toc159/132/120/90. The Toc159 family members possess a GTP-binding domain (G domain) and a membrane anchoring domain (M domain). They differ by the length of an acidic domain (A domain) that is located N-terminal to the G- and M-domains. Depletion of the major Toc receptors usually results in a defect in photosynthetic growth as demonstrated by decreased accumulation of photosynthetic proteins in *ppi1* and *ppi2* (Jarvis, 2008).

A combination of reverse genetic studies and precursor binding assays suggested two different classes of receptors, one class comprising Toc159/90 and the other class comprising Toc132/120 (Ivanova et al., 2004; Kubis et al., 2004; Jarvis, 2008; Agne and Kessler, 2009; Strittmatter et al., 2010; Schleiff and Becker, 2011). It was proposed that Toc132 and Toc120 are specific for the import of non-photosynthetic proteins while Toc159 and Toc90 are involved in the import of photosynthetic proteins, although this simplified view has recently been challenged (Bischof et al., 2011; Dutta et al., 2014). It is unclear how the different Toc receptors recognize their target proteins, but it is conceivable that specificity is mediated by the interaction of Toc receptor family members with the transit peptide of precursor proteins (Agne and Kessler, 2009). Functionally relevant amino acid motifs were identified in the RbcS transit peptide but these are not conserved in other photosynthetic proteins (Lee et al., 2008). A recent report suggested that the A-domain of the Toc159 receptor family is involved in mediating precursor selectivity (Inoue et al., 2010). Loss of the A-domain resulted in import receptors with less selective preprotein recognition. This result could explain why over-expression of full length Toc132 or Toc120 failed to complement *ppi2* while constructs containing only the G- and M-domains of Toc132 were able to do so (Inoue et al., 2010).

In an attempt to characterize Toc159 import specificity, Bischof and colleagues performed a comprehensive proteome analysis with Toc159-depleted plant material. The authors identified many photosynthetic proteins that were imported into *ppi2* plastids while many non-photosynthetic functions were affected by the Toc159 mutation, arguing for higher client protein promiscuity than previously anticipated (Bischof et al., 2011). Many proteins whose accumulation was affected in *ppi2* were down-regulated at the transcript level, arguing for a complex effect on protein accumulation that does not necessarily indicate that their import depends on Toc159. This complex regulation makes it difficult to distinguish true substrates of the Toc159 import pathway from systemic regulation. In fact, a systematic survey for an albino plant-specific proteome phenotype provided evidence that many of the changes in the proteome of albino plants follow common systemic regulation, so that *ppi2* as an albino plant shows typical features of all albino plants including the downregulation of photosynthetic genes and proteins (Motohashi et al., 2012).

Interestingly, the study by Bischof and colleagues identified precursor proteins that accumulate outside of plastids in the *ppi2* mutant, but not in wild type. This observation was interpreted as a direct consequence of the import defect that would argue for a specificity of Toc159 for the accumulated proteins (Bischof et al., 2011). Usually, precursor proteins are degraded quite rapidly if they are not imported into plastids, most likely via the ubiquitin proteasome system (UPS) as demonstrated for Lhcb4 (Lee et al., 2009). Notably, Bischof and colleagues found most accumulated precursor proteins N-terminally acetylated. While N-terminal acetylation had been assumed since the early 1990s to prevent protein degradation, it was recently reported as a degradation signal for the proteasome in yeast (Hwang et al., 2010). This supports a model in which plastid precursor proteins are modified in the cytosol to decrease their half-life and thus avoid their accumulation in an unfolded state. Among these proteins are the known Toc159-dependent protein Ferredoxin and the currently unknown Toc159 target RNP29. RNP29 is a plastid RNA binding protein with two tandem repeat RNA-recognition motifs (RRM) and an N-terminal acidic domain (Lorkovic and Barta, 2002; Kupsch et al., 2012). Using Ferredoxin and RNP29 as a model, we assessed whether their accumulation in *ppi2* is indicative of Toc159 substrate specificity and whether the N-terminal amino acid and its acetylation play a role in protein import.

# **MATERIALS AND METHODS**

#### **PLANT MATERIAL**

After 2 days of stratification at 4°C, *Arabidopsis thaliana* (Columbia-0) and *ppi2* (Toc159, CS11072 introgressed into the Columbia-0 ecotype) (Kubis et al., 2004) were grown on half-strength Murashige and Skoog (M&S) medium supplemented with 0.8% (w/v) plant agar (Duchefa) and 3% (w/v) sucrose under short day conditions at 21°C for 5 weeks before harvesting the seedlings and directly preparing protoplasts. Cultivation of *ppi1* (ecotype Wassilewskija) was identical with the exception that M&S medium contained 0.8% (w/v) sucrose. To analyze proteins in wild type and *ppi2* by immunoblotting, plants were grown as described above for 1 week. Afterwards seedlings were transferred into liquid half-strength M&S medium supplemented with 0.8% (w/v) sucrose for a further 20 d of growth under short day conditions at 21°C.

#### **PROTOPLAST PREPARATION AND TRANSFORMATION**

Plants were harvested immediately after the end of the dark period, with root tissue excluded. The plant material was transferred into 400 mM sorbitol, 5 mM MES (pH 5.6), 8 mM CaCl2, cut into shreds and, after vacuum infiltration, incubated in enzyme solution [400 mM sorbitol, 5 mM MES (pH 5.6), 8 mM CaCl2, 1.5% (w/v) Cellulase Onozuka R-10 (Serva), 0.375% (w/v) Macerozyme R-10 (Serva)] for 4 h at room temperature in the dark. Protoplasts were released by gentle shaking. After filtration (100μm BD Falcon™ cell strainer) the number of protoplasts was estimated using a Neubauer chamber. The protoplasts were settled by centrifugation (100×*g*, 5 min) and adjusted to a concentration of 2 × 10<sup>6</sup> protoplasts per ml in 230 mM NaCl, 187 mM CaCl2, 7.5 mM KCl, 7.5 mM glucose, 2.3 mM MES (pH 5.6). After chilling on ice for 30 min the protoplasts were settled again by centrifugation (100×*g*, 5 min) and transferred into 0.4 M sorbitol, 15 mM MgCl2, 5 mM MES (pH 5.6), maintaining the concentration of 2 × 10<sup>6</sup> protoplasts per ml. For each transformation, one hundred microliters of protoplast solution were mixed with 10μg plasmid DNA and 110μl PEG solution [60% (w/v) PEG4000 (Fluka), 0.3 M sorbitol, 0.15 M Ca(NO3)2] and incubated for 20 min at room temperature. Protoplasts were washed twice with 230 mM NaCl, 187 mM CaCl2, 7.5 mM KCl, 7.5 mM glucose, 2.3 mM MES (pH 5.6) and once with protoplast culture medium (M&S medium, 350 mM sorbitol, 50 mM glucose, 3 mM CaCl2, pH 5.8) including 0.1 mg/ml Ampicillin. Transformed protoplasts were stored in protoplast culture medium in darkness.
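The counting and adjustment step above can be illustrated with a short calculation. The following is a minimal sketch, not part of the original protocol: it assumes a standard Neubauer chamber in which each large square (1 mm × 1 mm, 0.1 mm depth) holds 0.1 μl, and the counts and suspension volume are hypothetical values chosen only for illustration.

```python
# Sketch: estimate protoplast density from Neubauer chamber counts and
# compute the volume needed to reach 2 x 10^6 protoplasts per ml.
# Assumption: each large square holds 0.1 ul (1e-4 ml).

def protoplasts_per_ml(counts_per_large_square, dilution_factor=1.0):
    """Mean count per large square, scaled to 1 ml."""
    mean_count = sum(counts_per_large_square) / len(counts_per_large_square)
    return mean_count * 1e4 * dilution_factor

def resuspension_volume_ml(total_protoplasts, target_per_ml=2e6):
    """Volume in which the pellet must be resuspended to hit the target density."""
    return total_protoplasts / target_per_ml

counts = [52, 47, 55, 50]          # hypothetical counts from four large squares
suspension_ml = 10.0               # hypothetical volume of the filtered suspension

density = protoplasts_per_ml(counts)
total = density * suspension_ml
print(f"density: {density:.3g} protoplasts/ml")
print(f"resuspend pellet in {resuspension_volume_ml(total):.2f} ml")
```

With the target density of 2 × 10<sup>6</sup> protoplasts per ml, each 100 μl transformation aliquot then contains about 2 × 10<sup>5</sup> protoplasts.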

#### **PLASMID CONSTRUCTION**

The vector backbone of all plasmids was pRT100 -/Not/Asc (Uberlacker and Werr, 1996) containing the coding sequence of eGFP (Clontech). Wild type Arabidopsis cDNA was used as template to amplify the coding sequence for the first 100 amino acids of the proteins of interest by PCR (Supplementary Figure S1 and Table S1). The sequence was first ligated into the pCR2.1®-TOPO® vector (TA cloning®, Invitrogen) and subsequently cloned into the target vector in frame upstream of the eGFP sequence. The plasmids pRT100 -/Not/Asc\_eGFP as well as pRT100 -/Not/Asc\_FNR<sub>1−55</sub>:eGFP were provided by Ralf Bernd Klösgen. The plasmid used for complementation, Toc159 inserted in the binary vector pCHF7, was provided by Birgit Agne and Felix Kessler.

#### **FLUORESCENCE MICROSCOPY**

Confocal laser scanning microscopy (CLSM) was performed with an LSM 510 Meta confocal microscope (Carl Zeiss Microscopy, Jena, Germany) at the earliest 20 h after transformation. An argon laser (458, 488, 514 nm) was used, with excitation set to 488 nm to excite eGFP as well as chlorophyll. Two beam splitters were used (HFT 405/488 and NFT 545); to separate eGFP and chlorophyll fluorescence, a BP 505-530 filter and an LP 615 filter were set as well. Images were taken with a Plan-Apochromat 63x/1.40 Oil objective in the channel mode. Pictures were edited using the Zeiss LSM Image Browser.

#### **SDS-PAGE AND WESTERN ANALYSIS**

Protoplast proteins were extracted 23 h after transformation by adding SDS sample buffer [50 mM Tris/HCl (pH 6.8), 2% (w/v) SDS, 10% (v/v) Glycerol, 0.1 M DTT, 0.04-Bromphenol blue] and heating the extract for 5 min at 90°C. Every sample represents a doubled transformation reaction. Whole protein extract was separated by SDS-PAGE on 12% polyacrylamide gels and transferred onto polyvinylidene difluoride membranes by semidry blotting. To generate total protein extracts of untransformed wild type and *ppi2*, shock-frozen seedlings were ground and exposed to Rensink buffer [100 mM NaCl, 50 mM Tris/HCl (pH 7.5), 0.5% (v/v) Triton X-100, 2 mM DTT] including plant protease inhibitor cocktail (Sigma-Aldrich) while rotating for 20 min at 4°C. Bradford protein quantification (Bradford, 1976) was done before chloroform/methanol precipitation (Wessel and Flugge, 1984). One hundred fifty micrograms of protein extract were separated by SDS-PAGE on 12% polyacrylamide gels and transferred by tank blotting onto polyvinylidene difluoride membranes. Immunodetection of proteins was done using enhanced chemiluminescence, and images were obtained with the Fusion Fx7 image-acquisition system (Peqlab). The following antibodies were used: antiGFP (Sigma-Aldrich), antiLhcb4 (Agrisera), antiActin (Sigma-Aldrich) and antiRNP29 [Christian Schmitz-Linneweber (HU Berlin)].

#### **SAMPLE PREPARATION FOR MS ANALYSES**

SDS-polyacrylamide gels were stained with Coomassie Brilliant Blue. The gel sections corresponding to the apparent molecular weight of proteins of interest were cut. These gel slices were digested with trypsin as previously described (Rodiger et al., 2014). Dried peptides were stored at −20°C for further analyses.

#### **NANO-LC SEPARATION, HD-MS<sup>E</sup> DATA ACQUISITION AND PROTEIN IDENTIFICATION/QUANTIFICATION**

LC separation and HD-MS<sup>E</sup> data acquisition were performed as previously described (Helm et al., 2014) using 1μl from each of the in-gel digested samples, dissolved in 2% (v/v) ACN, 0.1% (v/v) FA, on an ACQUITY UPLC System coupled to a Synapt G2-S mass spectrometer (Waters, Eschborn, Germany). MS acquisition was set to 50–5000 Da. Data analysis was carried out by ProteinLynx Global Server (PLGS 3.0.1, Apex3D algorithm v. 2.128.5.0, 64 bit, Waters, Eschborn, Germany) with automated determination of chromatographic peak width as well as MS TOF resolution. Lock mass value for charge state 2 was defined as 785.8426 Da/e and the lock mass window was set to 0.25 Da. Low/high energy threshold was set to 180/15 counts, respectively. Elution start time was 5 min, intensity threshold was set to 750 counts. Databank search query (PLGS workflow) was carried out as follows: Peptide and fragment tolerances were set to automatic, two fragment ion matches per peptide, five fragment ions for protein identification, and two peptides per protein. Maximum protein mass was set to 250 kDa. Primary digest reagent was trypsin with one missed cleavage allowed. According to the digestion protocol, fixed (carbamidomethyl on Cys) as well as variable (acetylation at the N-terminus and oxidation on Met) modifications were set. The false discovery rate (FDR) was set to 4% at the protein level. MS<sup>E</sup> data were searched against the modified *A. thaliana* database (TAIR10, ftp://ftp.arabidopsis.org) containing common contaminants such as keratin (ftp://ftp.thegpm.org/fasta/cRAP/crap.fasta). Additionally the RNP29<sub>1−100</sub>:eGFP fusion protein as well as the variants A2E and A2N were included. Redundant entries as well as splice variants were removed for database searching. Quantification was performed based on the intensity of the three most abundant proteotypic peptides (Silva et al., 2006). The manual response factor was set to 20,000 counts/fmol.
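The quantification principle named above (three most abundant proteotypic peptides, fixed response factor) can be sketched in a few lines. This is an illustrative simplification and not the PLGS implementation: the function name and the peptide intensities are hypothetical, and it assumes that the protein amount is the mean intensity of the top three peptides divided by the response factor of 20,000 counts/fmol.

```python
# Sketch of a "top three peptides" (Hi3-style) estimate of protein amount.
# Assumption: amount [fmol] = mean(top-3 peptide intensities) / response factor.
# The software used in the study may apply additional filtering or normalization.

def top3_amount_fmol(peptide_intensities, response_factor=20000.0):
    """Estimate protein amount from the three most intense proteotypic peptides."""
    top3 = sorted(peptide_intensities, reverse=True)[:3]
    if len(top3) < 3:
        raise ValueError("at least three quantifiable peptides are required")
    return (sum(top3) / 3.0) / response_factor

# Hypothetical peptide intensities (counts) for a precursor and a mature form
precursor = [60000, 40000, 20000, 5000]
mature = [12000, 9000, 6000]

amount_pre = top3_amount_fmol(precursor)
amount_mat = top3_amount_fmol(mature)
print(f"precursor: {amount_pre:.2f} fmol, mature: {amount_mat:.2f} fmol")
print(f"precursor/mature ratio: {amount_pre / amount_mat:.2f}")
```

Because the same response factor is applied to both forms, the precursor-to-mature ratio does not depend on its exact value.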

#### **LIQUID CHROMATOGRAPHY AND MASS SPECTROMETRY LTQ ORBITRAP VELOS PRO AND PEPTIDE/PROTEIN IDENTIFICATION**

Total trypsin protein digest was injected into an EASY-nLCII nano liquid chromatography system (Thermo Fisher Scientific). The peptides were separated using C18 reverse phase chemistry with an EASY-column SC001 pre-column (length 2 cm, inner diameter 100μm, particle diameter 5μm) in-line with an EASY-column SC200 (length 10 cm, inner diameter 75μm, particle diameter 3μm), both from Thermo Fisher Scientific, using gradient elution with an organic content increasing linearly from 5 to 40% in 30 min. Peptides were electrosprayed on-line into an LTQ-Orbitrap Velos Pro mass spectrometer using a nano-bore stainless steel emitter in a Nanospray Flex ion source, all from Thermo Fisher Scientific.

The electrospray voltage was set to 1.9 kV, the capillary temperature to 275°C, the RF Lens level to 50% and the difference in multipole offset to −7 V to ensure a stable electrospray with a current around 1μA. Both ion trap (IT) and Orbitrap (FT) injection waveforms were enabled. FT mass spectra were internally calibrated on the fly with the lock mass function using the ambient mass 445.1200. A data dependent acquisition (DDA) method with an inclusion list was used to isolate, fragment and record MS/MS spectra of only ions on the inclusion list with the 20 most intense signals in a scan of the total ion population (MS full scan) in the Orbitrap mass analyzer using collision induced dissociation (CID) in the linear trap quadrupole (LTQ) mass analyzer. The precursor mass tolerance was ±10 ppm. One microscan was acquired for both MS full and MS/MS scans. The minimum precursor ion signal intensity threshold was set to 1000, the isolation width to 2 Da. The automatic gain control (AGC) was set to 1e+06 for the Orbitrap and 1e+04 for the LTQ mass analyzer; the maximum injection times were 500 and 200 ms, respectively.
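To make the ppm-based tolerance concrete, the following sketch converts a relative tolerance into an absolute m/z window and computes the ppm error of an observed mass. The function names are illustrative and not taken from any instrument software; the lock mass 445.1200 and the ±10 ppm tolerance are the values stated above.

```python
# Sketch: relation between a ppm tolerance and the absolute m/z window.

def ppm_window(mz, tol_ppm):
    """Return (lower, upper) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * tol_ppm * 1e-6
    return mz - delta, mz + delta

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

low, high = ppm_window(445.1200, 10)
print(f"accepted range: {low:.4f} to {high:.4f}")
print(f"error of 445.1215 vs 445.1200: {ppm_error(445.1215, 445.1200):.2f} ppm")
```

At m/z 445.12, a tolerance of ±10 ppm corresponds to roughly ±0.0045 m/z units; the absolute window widens proportionally for heavier ions.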

Alternatively, a targeted method employing an MS full scan followed by MS/MS acquisition of all ions on the global mass list irrespective of their signal intensity in the preceding full scan in the LTQ was used. The isolation width was set to 3 Da, the other parameters were as above.

Raw files from the mass spectrometer were imported into the Proteome Discoverer v.1.4 mass analysis environment (PD) from Thermo Fisher Scientific. Database search of the TAIR10 database with target proteins and common contaminants added (32,793 sequences, 14,486,974 residues) was conducted using the Mascot software v2.4.0 connected to PD to identify peptides and proteins. For peak list generation, a signal to noise threshold of 1.5 was used to filter peaks from MS full scans. For the database search, the precursor ion tolerance was set to 7 ppm, the fragment ion tolerance to 0.8 Da. The enzyme was set to trypsin, and 2 missed cleavages were tolerated. Protein N-terminal acetylation was set as a variable modification, phosphorylation of serine and oxidation of methionine were included in alternative searches; carbamidomethylation of cysteine was set as a fixed modification. The family wise PSM error rate was controlled with FDR/q-values using the reversed target/decoy database model of the null hypothesis for PSM with the target decoy PSM validator module in PD.
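The target/decoy approach to FDR control mentioned above can be illustrated with a generic estimator. This is a minimal sketch and not the algorithm of the PD validator module: it assumes that each peptide-spectrum match (PSM) has a score where higher is better, that the FDR at a score threshold is estimated as decoys divided by targets, and that the q-value is the lowest FDR at which a PSM would still be accepted. The scores are hypothetical.

```python
# Sketch: q-values from a target/decoy search (simple decoys/targets estimator).

def q_values(psms):
    """psms: iterable of (score, is_decoy). Returns (score, is_decoy, q) by descending score."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score
        fdrs.append(decoys / max(targets, 1))
    # q-value: minimum FDR over this and all lower-scoring thresholds
    qs = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()
    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]

# Hypothetical PSM scores; True marks a hit against the reversed (decoy) database
example = [(90, False), (80, False), (70, True), (60, False), (50, False)]
for score, is_decoy, q in q_values(example):
    print(score, "decoy" if is_decoy else "target", round(q, 3))
```

Accepting all target PSMs with a q-value below a chosen cutoff then bounds the estimated proportion of false identifications in the accepted set.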

# **RESULTS**

#### **ACETYLATION OF PLASTID PRECURSOR PROTEINS IN THE CYTOSOL OF THE** *ppi2* **MUTANT**

We have previously reported N-terminal acetylation of precursor proteins in the cytosol of the plastid protein import deficient mutant *ppi2* (Bischof et al., 2011). To analyze the sequence context of N-terminal precursor acetylation, we extracted from the previous dataset all identified acetylated precursor proteins (**Table 1**). In the list of 13 acetylated precursors, all carry an alanine in the second amino acid position following the initiator methionine. Furthermore, nine of the 13 precursors carry another alanine in the third position (**Table 1**), while the remaining four carry serine, leucine, valine, or glutamate in position three. We identified the N-terminal peptide exclusively with the initiator methionine removed and in all cases only the acetylated, but not the non-acetylated precursor was detected (Bischof et al., 2011). This observation is consistent with a co-translational acetylation process that operates on nascent precursor proteins that fulfill the sequence requirements for acetylation. In yeast, the alanine in position two typically triggers methionine excision (Sherman et al., 1985), which is a prerequisite for N-terminal acetylation by A-type N-acetyltransferases (NatA) (Polevoda and Sherman, 2003; Martinez et al., 2008). In analogy to the yeast system, our data suggest that a NatA-type enzyme may be responsible for precursor acetylation (Hollebeke et al., 2012). Using a chloroplast proteome reference table (Reiland et al., 2009; van Wijk and Baginsky, 2011) we analyzed chloroplast proteins for the occurrence of alanine at position two and three in the transit peptide. Out of 1524 nucleus-encoded chloroplast precursor proteins, 746 carry an alanine in position two (49%), of which 131 (17.5%) carry an alanine also in position three. Our dataset reported above is thus highly enriched (69.2%) for MAA-containing transit peptides, suggesting high substrate specificity for the acetylation reaction (**Table 1**).
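The enrichment argument can be reproduced from the counts given above. The sketch below recomputes the three proportions and, as an additional illustration that is not part of the original analysis, evaluates a one-sided hypergeometric tail probability. It assumes that the 13 acetylated precursors can be treated as a sample drawn from the 746 precursors with alanine in position two.

```python
# Sketch: proportions of MA- and MAA- N-termini and an illustrative
# hypergeometric test for enrichment of MAA- among acetylated precursors.
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n items without replacement from N, of which K are successes."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

all_precursors = 1524   # nucleus-encoded chloroplast precursors
ala_pos2 = 746          # alanine in position two
ala_pos2_and_3 = 131    # alanine in positions two and three (MAA-)
acetylated = 13         # acetylated precursors in Table 1
acetylated_maa = 9      # of these, MAA-

frac_ma = ala_pos2 / all_precursors
frac_maa = ala_pos2_and_3 / ala_pos2
frac_sample = acetylated_maa / acetylated
p_value = hypergeom_tail(acetylated_maa, acetylated, ala_pos2_and_3, ala_pos2)

print(f"MA- among all precursors: {frac_ma:.1%}")
print(f"MAA- among MA- precursors: {frac_maa:.1%}")
print(f"MAA- among acetylated precursors: {frac_sample:.1%}")
print(f"hypergeometric P(X >= {acetylated_maa}): {p_value:.2e}")
```

The test only quantifies how unlikely nine or more MAA- precursors would be under random sampling; it does not account for detection biases in the proteomics dataset.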

#### **RNP29—AN UNUSUAL SUBSTRATE OF Toc159**

MAA-containing precursor proteins may accumulate in the cytosol of *ppi2* plants because they are direct substrates for Toc159 and cannot be imported in its absence. To test this possibility, we selected identified precursors and characterized their import properties in a protoplast system in greater detail. We selected RNP29, because (i) it starts with MAA- and as such represents a typical acetylation substrate and (ii) it represents a new Toc159 substrate because, as an RNA-binding protein, it does not conform to the previous assumption that Toc159 is mainly involved in the import of photosynthetic proteins (Bauer et al., 2000). For characterization of RNP29 import properties, we fused the 100 most N-terminal amino acids including the transit peptide with an eGFP reporter protein and transformed wild type and *ppi2* protoplasts with these constructs. Using a combination of microscopy and western blotting, we established plastid protein import by co-localization of eGFP and chlorophyll fluorescence and by the mobility shift that is induced upon protein import by proteolytic processing. Since the specific cleavage of the transit peptide occurs exclusively in the plastid, a difference in size between preprotein and mature protein strongly suggests correct protein import (Zybailov et al., 2008; Bischof et al., 2011).

The protoplast protein import data are presented in **Figure 1**. We compared the RNP29 import characteristics with those of Fd-dependent NADP reductase (FNR), Ferredoxin, pyruvate dehydrogenase E1α and Lhcb4 eGFP fusion proteins. Ferredoxin was chosen as substrate because it was also found as acetylated precursor protein in *ppi2* (**Table 1**) and its Toc159-dependent import was previously reported (Smith et al., 2004). Since it carries a serine in position three, it does not conform to the MAA- N-terminus of acetylated precursor proteins (**Table 1**). The E1α transit peptide carries a threonine in position two and its import is thought to be Toc159-independent (Bischof et al., 2011). In contrast, Lhcb4 carries an MAA- N-terminus and its import was found to be Toc159-dependent (Lee et al., 2009). However, Lhcb4 was not identified as an acetylated precursor in *ppi2* by Bischof and colleagues, suggesting further diversification among putative Toc159 client proteins (Bischof et al., 2011).

**Table 1 | N-terminally acetylated peptides in chloroplast precursor proteins.**

*The list was generated from data obtained by Bischof et al. (2011). All putative plastid proteins with conflicting reports on subcellular localization (e.g., as reported in SUBA, http://suba.plantenergy.uwa.edu.au/), including those proteins that are dual targeted, were removed from the list.*

fusion protein using an anti-GFP antibody; arrows indicate positions of

Transiently expressed eGFP without any transit peptide is localized in the protoplast cytosol, whereas eGFP fused to the transit peptide of spinach FNR is efficiently transported into and processed in chloroplasts of wild type (**Figure 1A**). Cleavage of the transit peptide shifts the apparent size of the fusion protein FNR<sub>1−55</sub>:eGFP from 36 to 27 kDa (**Figure 1A**). A fusion of the first 100 amino acids of plastid localized proteins Fd2, E1α, and Lhcb4 with the eGFP reporter also results in a chloroplast localization of the fusion proteins and a shift of protein size in the western blot analysis. Thus, all precursor constructs are efficiently imported into wild type chloroplasts (**Figure 1A**).

bars = 10μm (further information in Supplementary Figure S2).

Surprisingly, a similar picture was obtained with *ppi2* protoplasts with the exception that precursor proteins accumulate to a much higher abundance compared to the mature protein (**Figures 1A,B**). This applies to FNR<sub>1−55</sub>:eGFP, Fd2<sub>1−100</sub>:eGFP and E1α<sub>1−100</sub>:eGFP. While the plastid localization of the eGFP constructs in *ppi2* protoplasts is hard to prove by microscopy (**Figure 1C**), the western blot analysis clearly shows successful import of all fusion proteins into plastids of *ppi2* (**Figure 1A**). In comparison to wild type, the ratio of unprocessed and processed FNR<sub>1−55</sub>:eGFP, Fd2<sub>1−100</sub>:eGFP and E1α<sub>1−100</sub>:eGFP is significantly shifted toward the preprotein. In *ppi2* protoplasts a small amount of mature Lhcb4 was detected by western blot analysis, while Lhcb4 preprotein was not identified. This can be explained either by efficient degradation of Lhcb4 precursor by the UPS as a result of inefficient import or by a low transformation rate particularly with this construct.

The data obtained with the RNP29 construct support its Toc159-dependent import into plastids. While the construct is efficiently imported into wild type chloroplasts, only the precursor but not the mature form of RNP29 is detectable in *ppi2* protoplasts (**Figure 1B**). None of the control constructs shown in **Figure 1A** revealed such a distinctive difference between wild type and *ppi2* import properties. The RNP29 precursor appears quite stable in *ppi2* protoplasts, similar to the FNR-, Fd2- and the E1α-precursor that are readily detectable in *ppi2* protoplasts but in contrast to the Lhcb4 precursor (**Figures 1A,B**). The FNR<sub>1−55</sub>:eGFP, Lhcb4<sub>1−100</sub>:eGFP and RNP29<sub>1−100</sub>:eGFP constructs were also transiently expressed in *ppi1* mutant protoplasts, which are devoid of Toc33. The Lhcb4<sub>1−100</sub>:eGFP construct shows a similar accumulation pattern as in the *ppi2* mutant, most likely for the same reasons as discussed above. In contrast to the *ppi2* mutant, the FNR<sub>1−55</sub>:eGFP and the RNP29<sub>1−100</sub>:eGFP constructs are efficiently imported into *ppi1* plastids and precursor proteins are not detectable (**Figure 2A**). This supports the conclusion that the defect in RNP29 import is specifically due to the lack of Toc159. Again, the low amount of detectable mature protein in *ppi1* could result from degradation of non-imported protein or low transformation rates. Nonetheless the fact that only mature protein but not precursor protein can be detected suggests that Toc159 can form import competent Toc complexes that are independent of Toc33, which explains the weaker phenotype of *ppi1* compared to *ppi2* (Kubis et al., 2004).

We analyzed the accumulation of RNP29 and Lhcb4 in wild type and in *ppi2* seedlings by western blotting. Consistent with our protoplast assays, Lhcb4 accumulates to much lower levels in *ppi2* compared to wild type and only the mature form but no precursor was found to accumulate (**Figure 2B**). The low Lhcb4 accumulation in seedlings may partially result from transcriptional regulation that we can exclude for the Lhcb4 construct in *ppi2* protoplasts, because we used the 35S promoter to trigger its expression. The low accumulation of mature Lhcb4 is therefore probably also partly due to posttranslational degradation as a consequence of inefficient import, e.g., by the UPS as suggested earlier (Lee et al., 2009). Surprisingly, mature RNP29 accumulates to similar levels in wild type and in *ppi2* plastids (**Figure 2B**). We cannot identify the precursor of RNP29 in this western blot because its molecular mass is almost identical to the molecular mass of RNP29A that is also recognized by this antibody (**Figure 2B**) (Schmitz-Linneweber, personal communication). A possible explanation for the accumulation of a mature Toc159-dependent substrate in *ppi2* plastids could be a developmental re-organization of the import machinery that allows RNP29 to enter the plastid during early stages of development via a Toc159-independent route (Teng et al., 2012) (see Discussion).

RNP29<sub>1−100</sub>:eGFP in *ppi2* protoplasts cotransformed with full length Toc159. The reporter protein was detected with an anti-GFP antibody; arrows indicate the positions of preproteins (light gray) and mature proteins (dark gray). The amido black stained membrane is shown as loading control.

The result of the RNP29 antibody blot creates a paradox that requires further attention. It is unlikely that the accumulation of RNP29 precursor as reported above is a protoplast artifact, because this protein is readily imported into wild type and *ppi1* plastids and because precursor accumulation was originally found in *ppi2* seedlings, suggesting that this precursor also accumulates under *in vivo* conditions (Bischof et al., 2011). In order to lend further support to or reject the hypothesis that Toc159 is responsible for RNP29 import, we co-transformed the RNP29<sub>1−100</sub>:eGFP with a Toc159 construct into *ppi2* protoplasts. In the co-transformed protoplasts, import of RNP29 is complemented and mature RNP29 accumulates (**Figure 2C**), albeit to a much lower extent compared to wild type (**Figure 1B**). Thus we conclude that Toc159 is required for the import of the RNP29 preprotein. This supports the hypothesis that precursor selectivity of the Toc receptors depends on physico-chemical properties of the protein and/or the transit peptide that are not restricted to photosynthetic proteins (Bischof et al., 2011; Dutta et al., 2014). It is not clear which sequence properties mediate the Toc159-dependent import of RNP29 but as the list of Toc159 client proteins grows, a statistical assessment to pinpoint such characteristics will become feasible.

#### **THE N-TERMINAL AMINO ACID DOES NOT AFFECT PROTEIN IMPORT EFFICIENCY OR PRECURSOR ACCUMULATION**

To reveal the effect of the N-terminal amino acid and its acetylation on RNP29 import, we compared the import efficiency between different RNP29 precursor constructs. Since precursor acetylation shows all characteristics of the combined action of methionine aminopeptidase and NatA-type enzymes, we reasoned that an exchange of the second amino acid should sufficiently alter the properties of the nascent precursor to decrease the efficiency of methionine cleavage and N-acetylation by A-type N-acetyltransferases. To this end, we exchanged Ala in position two (in RNP29<sub>1−100</sub>:eGFP) to Glu (in RNP29<sub>1−100</sub>A2E:eGFP) or Asn (in RNP29<sub>1−100</sub>A2N:eGFP). In wild type protoplasts, mature RNP29 was identified with all three constructs, while in *ppi2* protoplasts only the precursor proteins are detectable (**Figures 3A,B**). This accumulation characteristic is consistent between the three constructs and no obvious alterations in response to the amino acid exchange at position two are visible. The eGFP reporter is clearly located in the chloroplast of wild type protoplasts with no obvious difference between RNP29<sub>1−100</sub>:eGFP,

**FIGURE 3 | Import efficiency of different RNP29 constructs.** The second amino acid of RNP29<sub>1−100</sub>:eGFP (RNP29) Ala was replaced by Glu in A2E or Asn in A2N. Import efficiency of the substrates was analyzed in transiently transformed wild type (WT) and *ppi2* protoplasts with different methods, showing the import efficiency of the A2E and the A2N construct. **(A)** CLSM of transiently transformed protoplasts, scale bars = 10 μm. **(B)** Western blot analysis using an anti-GFP antibody. The amido black stained membrane is shown as loading control. **(C)** and **(D)** Quantification of preprotein and mature protein using MSE. Diagrams showing the absolute quantification with standard deviation **(C)** and the relation between preprotein and mature protein **(D)** for each construct in wild type (WT) and *ppi2*.

RNP29<sub>1−100</sub>A2E:eGFP and RNP29<sub>1−100</sub>A2N:eGFP (**Figure 3A**). The accumulation of the three constructs differed in different biological replicates, but the differences were not consistent between the replicates.

It is likely that differences in the abundance of RNP29 result from differences in the protoplast transformation and protein expression rates. Thus, a comparison of import efficiency between the different constructs requires a relative measure independent of the transformation rate. We therefore determined the ratio between mature RNP29 and precursor protein by quantitative mass spectrometry via HD-MSE (Helm et al., 2014). The measured amount of total transiently expressed protein varies between 6.9 and 11.7 fmol in wild type and between 1.4 and 4.6 fmol in *ppi2* protoplasts (**Figure 3C**). The ratio between preprotein and mature RNP29 differs dramatically between wild type and *ppi2*, but no consistent differences between the constructs within one genotype were observed. For the presented wild type replicate, we determined that a stable proportion of between 85 and 88% of total RNP29 was correctly imported, while only between 0 and 11% mature protein was identified in *ppi2* (**Figure 3D**). The MS-based quantification is consistent with the western blot analyses and allows us to conclude that RNP29 import efficiency is not affected by the single amino acid exchange at position two (**Figures 3B–D**).
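
The reasoning behind a relative measure can be illustrated with a minimal sketch. The function name and the fmol values are illustrative assumptions, not the measured data; the only point taken from the text is that the share of mature protein in the total (precursor plus mature) does not change when overall expression is scaled up or down.

```python
def mature_fraction(precursor_fmol: float, mature_fmol: float) -> float:
    """Fraction of the total reporter protein found in the mature (imported) form."""
    total = precursor_fmol + mature_fmol
    if total <= 0:
        raise ValueError("total protein amount must be positive")
    return mature_fmol / total


# Hypothetical replicates: the second one expresses twice as much protein,
# but the imported share is the same.
low_expression = mature_fraction(precursor_fmol=1.0, mature_fmol=6.0)
high_expression = mature_fraction(precursor_fmol=2.0, mature_fmol=12.0)
print(f"{low_expression:.3f} {high_expression:.3f}")
```

Because the absolute amounts cancel out, such a ratio can be compared across constructs even when transformation rates differ between samples.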

For the interpretation of the results it is important to know whether the amino acid exchange prevented methionine cleavage and/or N-terminal acetylation as expected (see above). To address this question, we digested the gel band containing the precursor protein with trypsin and searched for the N-terminal peptide by inclusion mass scanning on an Orbitrap-Velos. We allowed different combinations of methionine removal, with or without methionine oxidation, and with or without acetylation. Surprisingly, we identified the three N-terminal precursor peptides in their acetylated form (**Table 2** and Supplemental Figure S3). The wild type construct had its methionine removed and its alanine acetylated, as expected. Here too, no nonacetylated peptides were identified. The A2E and A2N constructs retained their N-terminal methionines and were still found to be acetylated. Thus, although we managed to prevent methionine cleavage with the amino acid exchange, we did not prevent N-terminal acetylation, suggesting that either different types of N-acetyltransferases act on plastid precursor proteins or that the Arabidopsis enzymes have broader substrate specificity compared to the yeast system.
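
The combinations of N-terminal states considered in such a search can be enumerated as in the following sketch. It is not the inclusion list actually used: the function name and output layout are assumptions, only the initiator methionine is considered for oxidation, and the mass shifts are the standard monoisotopic values for a methionine residue, oxidation, and acetylation.

```python
from itertools import product

# Standard monoisotopic mass shifts in Da.
MET_RESIDUE = 131.040485   # lost when the initiator methionine is excised
OXIDATION = 15.994915      # methionine sulfoxide
ACETYLATION = 42.010565    # N-terminal acetyl group


def n_terminal_variants():
    """List every allowed N-terminal state with its mass shift.

    Shifts are relative to the unmodified peptide that still carries
    its initiator methionine.
    """
    variants = []
    for met_removed, met_oxidized, acetylated in product([False, True], repeat=3):
        if met_removed and met_oxidized:
            continue  # an excised methionine cannot also be oxidized
        delta = 0.0
        if met_removed:
            delta -= MET_RESIDUE
        if met_oxidized:
            delta += OXIDATION
        if acetylated:
            delta += ACETYLATION
        variants.append({
            "met_removed": met_removed,
            "met_oxidized": met_oxidized,
            "acetylated": acetylated,
            "delta_da": round(delta, 6),
        })
    return variants


for variant in n_terminal_variants():
    print(variant)
```

Adding each shift to the calculated mass of the unmodified N-terminal tryptic peptide would give the candidate precursor masses to monitor.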

#### **Table 2 | N-terminal modifications of RNP29 constructs identified in** *ppi2* **protoplasts.**


*A targeted MS approach resulted in the identification of acetylation and methionine excision of the authentic RNP29 N-terminal peptide. The modified peptides A2E and A2N were acetylated at the start methionine. The peptide spectral matches (PSMs) and the ion score were generated by the Mascot software.*

# **DISCUSSION**

Here we establish RNP29 as a Toc159-dependent client protein, which is in line with the original interpretation of RNP29 precursor accumulation in *ppi2* plastids (Bischof et al., 2011). RNP29 reveals characteristics that are similar to the already known Toc159 client protein Fd2 (**Table 1**) because the import of both proteins is facilitated by co-transformation with Toc159 (**Figure 3C**). In contrast to Ferredoxin, the accumulation of RNP29 as a precursor is much more pronounced and almost no mature protein is detectable in *ppi2* protoplasts, while imported Ferredoxin is readily detectable (**Figures 1A,B**). This is surprising in light of the western blot results that showed endogenous RNP29 protein accumulating to comparable levels in wild type and in *ppi2* plants (**Figure 2B**). The specificity of the import machinery changes during development and it is conceivable that its re-organization adjusts import efficiency and specificity to prevailing requirements (Li and Teng, 2013). It is possible that RNP29 enters the plastid via Toc159-independent routes in *ppi2* that are in place at a certain developmental stage. This possibility is supported by the observation that RNP29 expression peaks at around 96 h after germination, suggesting that the mass import of RNP29 occurs early in development (Wang et al., 2006). It is furthermore possible that low efficiency import of RNP29 occurs throughout plastid development via other Toc receptors. Provided that sufficient time is available for import, proteins such as RNP29 may enter the plastid in an unspecific manner and accumulate to near wild type amounts. Such a scenario is supported by the observation that around 11% of total expressed RNP29 fusion protein is imported into *ppi2* plastids (**Figure 3D**).

A stable cytosolic precursor as well as a high expression level support a long residence time of RNP29 in the cytosol. We know that RNP29 is not down-regulated at the transcriptional level in *ppi2*, and the fact that we identified it as one of very few precursors in the cytosol argues for its relatively high stability (Bischof et al., 2011). In the artificial protoplast system, we find most precursor proteins to accumulate in *ppi2*, with the exception of Lhcb4 (**Figure 1**). Thus, these precursors are relatively stable in the cytosol and their lack of detection in *ppi2* seedlings is probably a result of fine-tuned transcriptional regulation (Bischof et al., 2011). This is clearly different for Lhcb4. While most photosynthetic proteins are down-regulated at the transcriptional level in *ppi2* and other albino plants, we can exclude transcriptional regulation for Lhcb4 in our artificial protoplast system. The lack of Lhcb4 precursor detection here (**Figure 1**) therefore argues for a tight quality control system that affects certain types of precursor proteins, degrading them efficiently before they can accumulate. Since such degradation occurs for Lhcb4, but not for other precursors in our protoplast system, we conclude that different degradation- or stabilization-mechanisms exist for precursor proteins in the cytosol.

N-acetylation is a common co-translational modification in higher eukaryotes that is mediated by N-acetyltransferases (NATs). The second amino acid is crucial for substrate specificity of NATs in yeast and mammalian systems, and it is conceivable that a similar mechanism exists in plants (Hollebeke et al., 2012). More than 50% of the plastid precursor proteins carry an alanine in position two, making them typical NatA substrates. NatA activity depends on N-terminal methionine excision, suggesting that an exchange of Ala to Glu or Asn at position two would abolish not only methionine cleavage but also N-terminal acetylation by an A-type NAT. For RNP29 we find that the exchange of the second amino acid indeed eradicates methionine cleavage but not N-terminal acetylation (**Table 2**). This is surprising and suggests that not only NatA-type enzymes act on plastid precursor proteins, but also NatB-type enzymes. Very little is known about the substrate specificity and the function of N-acetyltransferases in plant systems. Recently, a loss of function mutant in a non-catalytic component of a NatB-type enzyme complex was characterized. The analysis revealed pleiotropic phenotypes including changes in flowering time regulation and leaf, inflorescence, flower, fruit and embryonic development (Ferrandez-Ayela et al., 2013). So far, the only cytosolic N-acetyltransferase with a known effect on chloroplast development is AtMak3, which resembles NatC-type enzymes. Its defect results in delayed chloroplast development; however, its target protein spectrum and its functions are currently unknown (Pesaresi et al., 2003).
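
The dependence on the second amino acid can be sketched as a simple lookup. The rules below are a deliberately simplified summary of the specificities described for yeast and mammalian enzymes and are an assumption here, not a validated predictor for plant proteins; the function name and residue sets are illustrative.

```python
# Simplified, yeast-derived rules (assumption): residue sets keyed by position two.
MET_EXCISION = set("ACGPSTV")  # small residues that permit initiator Met removal
NATA_TARGETS = set("ACGSTV")   # acetylated by NatA once Met is removed
NATB_TARGETS = set("DENQ")     # Met retained, Met-Asp/Glu/Asn/Gln acetylated by NatB
NATC_TARGETS = set("FILW")     # Met retained, hydrophobic second residue (NatC)


def predict_n_terminus(second_residue: str):
    """Return (methionine_removed, candidate_NAT or None) for a one-letter residue."""
    aa = second_residue.upper()
    if aa in MET_EXCISION:
        return True, ("NatA" if aa in NATA_TARGETS else None)
    if aa in NATB_TARGETS:
        return False, "NatB"
    if aa in NATC_TARGETS:
        return False, "NatC"
    return False, None


for name, residue in [("wild type", "A"), ("A2E", "E"), ("A2N", "N")]:
    print(name, predict_n_terminus(residue))
```

Under these assumed rules, the wild type Ala at position two maps to methionine removal plus a NatA-type enzyme, whereas Glu and Asn map to a retained methionine and a NatB-type enzyme, which is the pattern discussed above.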

At present, the role of precursor acetylation in the import process remains elusive. Similar to the yeast system, precursor acetylation may be a *degron* that ensures low residence time of non-imported plastid proteins in the cytosol (Hwang et al., 2010). While this would be an elegant possibility, there is currently no indication that N-terminal acetylation serves as a *degron* in plants. In contrast, acetylation of proteins in the chloroplast stroma in Chlamydomonas even increases their half-lives (Bienvenut et al., 2011). Furthermore, N-terminal acetylation could affect the interaction of the N-terminal transit peptide region with heat shock proteins. One of the few identified functional regions in transit peptides is a short, uncharged N-terminal segment that seems capable of functioning as an Hsp70-binding domain. This interaction is important for the formation of translocation intermediates, thus any change in the interaction properties could affect the import of precursor proteins (Chotewutmontri et al., 2012). Acetylation in this region could strengthen the interaction of the precursor with heat shock proteins and thus determine the efficiency of precursor interaction and/or even the specificity of import. We are currently investigating these different possibilities.

Recent attempts to identify functionally relevant domains in transit peptides were not successful, even after grouping proteins into relevant categories such as "Toc159-dependent" and "Toc159-independent" (Bischof et al., 2011). This is probably because transit peptides contain different modules that can be arranged in a diverse order (Li and Teng, 2013). Thus, any assembly of proteins at a larger scale will most likely average out an otherwise significant enrichment of amino acids in a functional domain. Comparing the transit peptides of Toc159-dependent client proteins such as Fd2 and RNP29 with the Toc159-independent E1α and FNR transit peptides, we find for the latter a smaller uncharged N-terminal region that is interrupted by lysine, an occurrence of unusual amino acids such as glutamic acid in E1α and aspartic acids in FNR as well as histidine proximal to the N-terminus, followed by a more pronounced stretch of hydrophobic amino acids in the E1α transit peptide sequence. In the RNP29 transit peptide, a putative degenerated FGLK motif is found in two positions, one around amino acid 36 that is consistent with its position in other transit peptides (Chotewutmontri et al., 2012). This one is lacking glycine as the "helix-breaking" amino acid in its closer surrounding. A complete motif is found at position 20. Whether or not this has any relevance for the observations we made here remains unclear. Further experiments are necessary to understand the design of transit peptides. Sufficient hypotheses are available and await further testing (Li and Teng, 2013).

# **ACKNOWLEDGMENTS**

We would like to thank Paul Jarvis (University of Leicester) for *ppi2* ecotype Col-0, Christian Schmitz-Linneweber (HU Berlin) for anti-RNP29 antibodies, Ralf Bernd Klösgen (Martin-Luther-University Halle-Wittenberg) for the plasmids pRT100 -/Not/Asc\_eGFP as well as pRT100 -/Not/Asc\_FNR<sub>1−55</sub>:eGFP, Birgit Agne and Felix Kessler (University of Neuchâtel) for the plasmids pCHF7\_Toc159 and Birgit Agne (Martin-Luther-University Halle-Wittenberg) for valuable discussion and support. This work was supported by DFG grant "Ba 1902/3-1" and by the European Regional Development Fund of the European Commission via grant W21004490 "Landesförderschwerpunkt Molekulare Biowissenschaften," Land Sachsen-Anhalt to Sacha Baginsky. Sacha Baginsky gratefully acknowledges DFG support for the acquisition of a Synapt G2-S mass spectrometer (INST 271/283-1 FUGG).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00258/abstract

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 April 2014; accepted: 20 May 2014; published online: 16 June 2014. Citation: Grimmer J, Rödiger A, Hoehenwarter W, Helm S and Baginsky S (2014) The RNA-binding protein RNP29 is an unusual Toc159 transport substrate. Front. Plant Sci. 5:258. doi: 10.3389/fpls.2014.00258*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Grimmer, Rödiger, Hoehenwarter, Helm and Baginsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The C-terminus of *Bienertia sinuspersici* Toc159 contains essential elements for its targeting and anchorage to the chloroplast outer membrane

#### *Shiu-Cheung Lung<sup>1</sup>, Matthew D. Smith<sup>2</sup>, J. Kyle Weston<sup>2</sup>, William Gwynne<sup>3</sup>, Nathan Secord<sup>3</sup> and Simon D. X. Chuong<sup>3</sup> \**

*<sup>1</sup> School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China*

*<sup>2</sup> Department of Biology, Wilfrid Laurier University, Waterloo, ON, Canada*

*<sup>3</sup> Department of Biology, University of Waterloo, Waterloo, ON, Canada*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Patrick H. Masson, University of Wisconsin-Madison, USA; Mitsuru Akita, Ehime University, Japan*

#### *\*Correspondence:*

*Simon D. X. Chuong, Department of Biology, University of Waterloo, Room B1-268, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada e-mail: schuong@uwaterloo.ca*

Most nucleus-encoded chloroplast proteins rely on an N-terminal transit peptide (TP) as a post-translational sorting signal for directing them to the organelle. Although Toc159 is known to be a receptor for specific preprotein TPs at the chloroplast surface, the mechanism for its own targeting and integration into the chloroplast outer membrane is not completely understood. In a previous study, we identified a novel TP-like sorting signal at the C-terminus (CT) of a Toc159 homolog from the single-cell C4 species, *Bienertia sinuspersici*. In the current study, we have extended our understanding of the sorting signal using transient expression of fluorescently-tagged fusion proteins of variable-length, and with truncated and swapped versions of the CT. As was shown in the earlier study, the 56 residues of the CT contain crucial sorting information for reversible interaction of the receptor with the chloroplast envelope. Extension of this region to 100 residues in the current study stabilized the interaction via membrane integration, as demonstrated by more prominent plastid-associated signals and resistance of the fusion protein to alkaline extraction. Despite a high degree of sequence similarity, the plastid localization signals of the equivalent CT regions of *Arabidopsis thaliana* Toc159 homologs were not as strong as that of the *B. sinuspersici* counterparts. Together with computational and circular dichroism analyses of the CT domain structures, our data provide insights into the critical elements of the CT for the efficient targeting and anchorage of Toc159 receptors to the dimorphic chloroplasts in the single-cell C4 species.

**Keywords:** *Bienertia sinuspersici***, Toc159, outer envelope protein, transit peptide, plastid, dimorphic chloroplast, translocon, protein targeting**

# **INTRODUCTION**

In plant cells, chloroplasts are one of the many types of plastids, which play crucial roles in photosynthesis and other metabolic pathways including amino acid and lipid synthesis, and nitrogen and sulfur assimilation (Keeling, 2004). Therefore, assembly of the correct plastid proteome is crucial for proper functioning of plants and their responses to developmental and external cues. In spite of the presence of a plastid genome, the vast majority of plastid proteins are encoded by the nuclear genome, synthesized in the cytosol as precursor proteins (preproteins), and post-translationally imported into the organelle. The targeting and translocation processes are facilitated by information embedded within the N-terminal sequences of preproteins, known as transit peptides (TPs). In the cytosol, preproteins associate with chaperones (i.e., HSP70 and HSP90; Zhang and Glaser, 2002; Qbadou et al., 2006; Ruprecht et al., 2010) and cochaperones (e.g., HOP and FKBP; Fellerer et al., 2011), and it has been reported that some TPs can be phosphorylated serving as putative binding motifs for 14-3-3 dimers (Waegemann and Soll, 1996; May and Soll, 2000; Martin et al., 2006). At the chloroplast surface, preprotein translocation across the envelope is mediated by the coordinate action of two multiprotein complexes, commonly known as the Translocon at the outer envelope membrane of chloroplasts (Toc) and the Translocon at the inner envelope membrane of chloroplasts (Tic).

The core Toc complex is composed of two GTPases (i.e., Toc159 and Toc34) and a β-barrel protein channel (i.e., Toc75). The two GTPases are also known as the Toc receptors for their cooperative role in controlling the recognition of preproteins, and regulation of preprotein transfer to the translocation channel. TP binding at the GTPase domains (i.e., G-domains) of the Toc GTPases triggers changes in receptor dimerization, GDP/GTP exchange and GTP hydrolysis, ultimately resulting in precursor protein transfer to Toc75 (see Richardson et al., 2014 for review). Despite the homology of the GTPase domains of Toc159 and Toc34, the former has an additional N-terminal acidic domain (i.e., A-domain), which is intrinsically unstructured and highly divergent among isoforms, implicating its ability to distinguish between a wide variety of substrates (Richardson et al., 2009; Dutta et al., 2014). In *Arabidopsis thaliana*, the major (i.e., atToc159) and minor (i.e., atToc90, atToc120, and atToc132) isoforms have been hypothesized to be responsible for the recognition of photosynthetic and housekeeping preproteins, respectively (Ivanova et al., 2004; Kubis et al., 2004; Smith et al., 2004; Infanger et al., 2011). Recently, swapping and yeast-two hybrid studies confirmed that the A-domain of Toc159 is an important determinant of substrate selectivity of the Toc complex (Inoue et al., 2010; Dutta et al., 2014), although, selectivity appears to be conferred by information intrinsic to each preprotein TP, rather than by the function of the protein in chloroplasts (Dutta et al., 2014; Grimmer et al., 2014). On the other hand, the G-domains might constitute a molecular switch as elucidated in many other intracellular protein sorting and translocation processes. The crystal structure of Toc34 G-domain has led to the unraveling of its dimerization properties and functions (Sun et al., 2002; Koenig et al., 2008). 
While recent biochemical analyses have revealed the relevance of preprotein binding to homodimer dissociation and nucleotide exchange of Toc34 (Oreb et al., 2011), the heterodimeric interaction is crucial for the insertion of Toc159 into the Toc complex with Toc34 serving as a docking site (Bauer et al., 2002; Smith et al., 2002a). In contrast to a single transmembrane α-helix which anchors Toc34 to the chloroplast surface (Kessler et al., 1994; Seedorf et al., 1995), the absence of any hydrophobic cluster raises a question regarding how Toc159 integrates into the chloroplast outer membrane (Bölter et al., 1998; Chen et al., 2000). Conventionally, the entire C-terminal domain (e.g., ∼52 kDa in *Pisum sativum*) of the tripartite Toc159 has been referred to as the "membrane domain" (i.e., M-domain) solely for its resistance to proteolysis in intact chloroplasts, which implies that it is embedded in a hydrophobic environment (Waegemann et al., 1992; Hirsch et al., 1994; Bauer et al., 2000; Chen et al., 2000). Previously, Lee et al. (2003) demonstrated that the minimal functional unit of Toc159 is constituted by the M-domain, overexpression of which could partially rescue the albino phenotype of the *atToc159* knockout mutant of *A. thaliana* (i.e., *ppi2*). Despite its importance, the study of the M-domain is still in its infancy.

Whilst most of the current knowledge of chloroplast protein import is based on the observations in *P. sativum* and *A. thaliana*, we have recently identified homologs of Toc receptors from the single-cell C4 species, *Bienertia sinuspersici* (Lung and Chuong, 2012). This species from the family Chenopodiaceae is of particular interest due to its novel mechanism of C4 photosynthesis through subcellular compartmentation of organelles and enzymes within single chlorenchyma cells (Akhani et al., 2005; Chuong et al., 2006). The differential partitioning of nucleus-encoded enzymes between dimorphic chloroplasts implicates the existence of multiple sorting pathways, which could be mediated by the preferential assembly of distinct substrate-specific Toc complexes at distinct subcellular locations (Offermann et al., 2011; Lung et al., 2012). Recently, we showed that the *B. sinuspersici* genome encodes multiple isoforms of Toc159, which are targeted to the dimorphic chloroplasts by a novel C-terminal TP-like sorting signal (Lung and Chuong, 2012). In the current study, we have extended our investigation into the elements of the BsToc159 C-terminus (CT) that are involved in chloroplast targeting and envelope association. We used a number of enhanced green fluorescent protein (EGFP) fusion constructs to differentiate the regions that are required for targeting from those that are important for anchoring the receptor to the chloroplast outer membrane. EGFP fusion proteins with the equivalent regions of the *A. thaliana* homologs and swapping experiments revealed some variation in plastid-associated signals of Toc159 CTs from different species. Overall, our data extend the understanding of the chloroplast targeting information contained within the CT region of Toc159, and reinforce the role it may play in controlling differential subcellular localization to the dimorphic chloroplasts in *B. sinuspersici*.

# **MATERIALS AND METHODS**

#### **PLANT MATERIALS AND GROWTH CONDITIONS**

Seeds from wild-type *A. thaliana* (ecotype Columbia-0) were stratified at 4°C in the dark for 48 h and sown on 5-cm-tall cell packs containing a 1:1 soil mixture of Sunshine LC1 Mix and Sunshine LG3 Germination Mix (SunGro Horticultural Inc., Bellevue, WA, USA). The plants were maintained in a controlled environment chamber with a day/night photoperiod of 16/8 h at 22°C with a photon flux density of ca. 150 μmol m<sup>−2</sup> s<sup>−1</sup> and were watered and fertilized regularly with 20:20:20 (N:P:K) fertilizer (Plant Products Co. Ltd., Brampton, ON, Canada). True leaves from 2- to 3-week-old plants were used for protoplast preparation.

#### **FLUORESCENT PROTEIN FUSION CONSTRUCTS**

The construction of the AtOEP7-EGFP construct has been described previously (Lung and Chuong, 2012). The other constructs were made by subcloning specific DNA fragments of interest into the pSAT6-35S:DsRed2-N1 or pSAT6-35S:EGFP-C1 vectors (Chung et al., 2005). The transit sequence of ferredoxin was excised from a previous construct (Lung et al., 2011) and subcloned at the 5′ end of the DsRed2-encoding sequence. The C-terminal sequences of Toc159 were obtained by PCR amplification from cDNA clones of the respective isoforms, the sequences of which can be found in GenBank under the following accession numbers: *B. sinuspersici* Toc159 (JQ739199), *B. sinuspersici* Toc132 (JQ739200), *A. thaliana* Toc159 (AC002330), and *A. thaliana* Toc132 (AC005825). Details of the primers and restriction sites used for generation of the EGFP fusion constructs are listed in Supplementary Table S1. All constructs have been verified by DNA sequencing.

#### **BIOLISTIC BOMBARDMENT OF ONION EPIDERMAL CELLS**

Onion (*Allium cepa*) bulbs were purchased from local grocery stores. Briefly, one milligram of tungsten particles (∼1.1 μm in diameter; Bio-Rad) was coated with plasmid DNA (EGFP and DsRed2 fusion constructs, 5 μg each) in a suspension containing 16 mM spermidine and 0.1 M CaCl2. The DNA-coated tungsten particles were loaded onto the macrocarrier discs and bombarded into the adaxial surface of onion bulb sections (1 cm<sup>2</sup>) from a distance of 12 cm at a pressure of 1350 p.s.i. using a Biolistic PDS-1000/He particle delivery system (Bio-Rad). The bombarded samples were incubated in Petri dishes on moist filter paper at room temperature in the dark for 16 h, and observed under epifluorescence microscopy.

#### **PROTOPLAST ISOLATION AND TRANSFECTION**

The procedures for isolation and transfection of mesophyll protoplasts from *A. thaliana* were modified from Yoo et al. (2007). Briefly, leaves of 3-week-old seedlings were cut into 0.5- to 1-mm strips and incubated in enzyme solution [0.4 M mannitol, 20 mM MES-KOH (pH 5.7), 20 mM KCl, 10 mM CaCl2, 0.1% (w/v) bovine serum albumin, 1.5% (w/v) cellulase Onozuka R10 and 0.4% (w/v) macerozyme R10 (Yakult Pharmaceutical, Tokyo, Japan)] at room temperature in the dark for 3 h. The isolated protoplasts were pelleted with an equal volume of W5 solution [2 mM MES-KOH (pH 5.7), 154 mM NaCl, 125 mM CaCl2 and 5 mM KCl] at 100 g for 2 min, resuspended in W5 solution, and allowed to settle on ice for 30 min. The settled protoplasts were resuspended in MES/Mg<sup>2+</sup> buffer [0.4 M mannitol, 4 mM MES-KOH (pH 5.7), 15 mM MgCl2] at a density of ca. 200,000 protoplasts mL<sup>−1</sup>. Approximately 160,000 protoplasts were mixed with 40 μg of plasmid DNA and 880 μL of polyethylene glycol solution [40% (w/v) PEG4000 (Sigma-Aldrich), 0.4 M sucrose, 0.1 M CaCl2]. After incubation at room temperature for 15 min, the transfected protoplasts were mixed with 3.5 mL of W5 solution, pelleted at 100 g for 2 min, resuspended in 4 mL of WI solution [0.5 M mannitol, 4 mM MES-KOH (pH 5.7), 20 mM KCl], and cultured overnight at 23°C with a light intensity of ca. 30 μmol m<sup>−2</sup> s<sup>−1</sup>. The protoplasts were examined in flat-bottomed depression slides under epifluorescence microscopy.
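
The quantities stated for the transfection step imply a suspension volume and a DNA load per protoplast that are not spelled out in the text. The following sketch derives them from the stated density, protoplast number, and DNA amount; the function names are illustrative, and the results are approximate because the counts are given as "ca." values.

```python
def suspension_volume_ul(n_protoplasts: int, density_per_ml: float) -> float:
    """Volume of suspension (in microliters) that contains n_protoplasts."""
    return n_protoplasts * 1000.0 / density_per_ml


def dna_per_protoplast_pg(dna_ug: float, n_protoplasts: int) -> float:
    """Plasmid DNA per protoplast in picograms (1 ug = 1e6 pg)."""
    return dna_ug * 1e6 / n_protoplasts


# Values taken from the protocol text above.
volume = suspension_volume_ul(n_protoplasts=160_000, density_per_ml=200_000)
dna = dna_per_protoplast_pg(dna_ug=40, n_protoplasts=160_000)
print(f"{volume:.0f} uL of suspension, {dna:.0f} pg DNA per protoplast")
```

With the stated numbers this corresponds to roughly 800 μL of protoplast suspension per transfection and about 250 pg of plasmid DNA per protoplast.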

#### **EPIFLUORESCENCE MICROSCOPY**

Epifluorescence micrographs were acquired using a Zeiss Axio Imager D1 microscope equipped with a Zeiss AxioCam MRm camera (Carl Zeiss Inc., Germany). All images were processed and composed using Adobe Photoshop CS (Adobe Systems Inc.). Representative images were selected from at least three independent experiments. The dual-channel images of transfected onion epidermal cells were analyzed, and the corresponding scatterplots, Pearson's correlation coefficients and Manders' coefficients were generated using the open-source Fiji "Colocalization Threshold" plug-in (Schindelin et al., 2012) of ImageJ software v.1.46 (National Institutes of Health, USA).
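
For readers unfamiliar with the two colocalization measures, the sketch below computes them on flattened pixel intensities of two channels. It is an illustration only and not the plug-in's implementation: the function names are assumptions, the thresholds are fixed by the caller rather than determined automatically, and the pixel values are made up.

```python
from math import sqrt


def pearson(ch1, ch2):
    """Pearson's correlation coefficient between two equally sized pixel lists."""
    n = len(ch1)
    if n == 0 or n != len(ch2):
        raise ValueError("channels must be non-empty and of equal size")
    m1, m2 = sum(ch1) / n, sum(ch2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(ch1, ch2))
    var1 = sum((a - m1) ** 2 for a in ch1)
    var2 = sum((b - m2) ** 2 for b in ch2)
    if var1 == 0 or var2 == 0:
        raise ValueError("a channel with constant intensity has no correlation")
    return cov / sqrt(var1 * var2)


def manders(ch1, ch2, thr1=0.0, thr2=0.0):
    """Manders' coefficients (M1, M2).

    M1 is the share of channel-1 intensity in pixels where channel 2 exceeds
    thr2; M2 is the share of channel-2 intensity where channel 1 exceeds thr1.
    """
    total1, total2 = sum(ch1), sum(ch2)
    if total1 == 0 or total2 == 0:
        raise ValueError("each channel needs some signal")
    m1 = sum(a for a, b in zip(ch1, ch2) if b > thr2) / total1
    m2 = sum(b for a, b in zip(ch1, ch2) if a > thr1) / total2
    return m1, m2


# Made-up pixel intensities for a green and a red channel.
green = [10, 0, 5, 0]
red = [0, 10, 5, 0]
print(pearson(green, red), manders(green, red))
```

Pearson's coefficient reports how intensities co-vary across all pixels, whereas Manders' coefficients report how much of each channel's signal lies in pixels that also contain the other channel, so the two measures can diverge for partially overlapping signals.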

#### **CHLOROPLAST ISOLATION FROM TRANSFECTED PROTOPLASTS**

The procedures for isolating chloroplasts from the transfected protoplasts were modified from Smith et al. (2002b). Briefly, the transfected protoplasts were pelleted with an equal volume of W5 solution at 100 g for 2 min, and resuspended in 300 μL of HS buffer [330 mM sorbitol, 50 mM HEPES-KOH (pH 7.3)]. To assemble a protoplast-rupturing device, the needle-fitting end of a 1-mL syringe barrel and the top part of a 500-μL microfuge tube were cut off to form a hollow tube and a slightly wider adaptor ring, respectively. A piece of 10-μm nylon mesh filter (Spectrum Lab Inc.) was fitted against the cut end of the hollow tube and held in place using the adaptor ring. All subsequent steps were carried out at 4°C. The resuspended protoplasts were lysed by passage through the nylon mesh using the protoplast-rupturing device, and the intact chloroplasts were purified on a Percoll step gradient consisting of an upper 500-μL Percoll solution [40% (v/v) Percoll, 50 mM HEPES-KOH (pH 7.3), 330 mM sorbitol, 1 mM MgCl2, 1 mM MnCl2 and 2 mM EDTA] and a lower 500-μL Percoll solution [85% (v/v) Percoll, 50 mM HEPES-KOH (pH 7.3) and 330 mM sorbitol]. The gradient was centrifuged at 2500 g for 10 min in a swinging-bucket rotor, and the intact chloroplasts at the 40%/85% interface of Percoll were aspirated and diluted with 6 volumes of HS buffer. The isolated chloroplasts were concentrated by centrifugation at 750 g for 5 min and resuspended in 50 μL of HS buffer.

#### **SUBFRACTIONATION OF ISOLATED CHLOROPLASTS INTO MEMBRANE AND SOLUBLE FRACTIONS**

The isolated chloroplasts were subfractionated into the membrane and soluble stromal fractions as described previously (Smith et al., 2002b). Briefly, 40 μL of isolated chloroplasts were hypo-osmotically lysed by incubation with 213 μL of 2 mM EDTA on ice for 10 min. To facilitate membrane precipitation, the lysed chloroplasts were mixed with 13.3 μL of 4 M NaCl. After centrifugation at 20,000 g, 4°C for 30 min, the membrane pellet was resuspended in 25 μL of solubilization buffer [50 mM Tris-HCl (pH 8), 5 mM EDTA, 0.2% (w/v) SDS], and the soluble stromal proteins in the supernatant were precipitated with 4 volumes of acetone at −20°C for >1 h and resuspended in 25 μL of solubilization buffer. Similarly, the total protoplast lysates were fractionated into insoluble and soluble fractions using the same procedures.
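
The final salt concentration after adding the NaCl stock is not stated explicitly, but it follows from the volumes given. A minimal dilution calculation, with an illustrative function name and assuming the volumes are simply additive, is:

```python
def final_concentration(stock_conc: float, stock_volume: float, total_volume: float) -> float:
    """Concentration after dilution (C1 * V1 = C2 * V2); volumes in the same unit."""
    return stock_conc * stock_volume / total_volume


# Volumes from the protocol text, in microliters.
chloroplasts_ul = 40.0
edta_ul = 213.0
nacl_ul = 13.3
total_ul = chloroplasts_ul + edta_ul + nacl_ul

nacl_final_m = final_concentration(4.0, nacl_ul, total_ul)   # 4 M NaCl stock
edta_final_mm = final_concentration(2.0, edta_ul, total_ul)  # 2 mM EDTA stock
print(f"NaCl: {nacl_final_m:.3f} M, EDTA: {edta_final_mm:.2f} mM")
```

With these volumes the NaCl ends up at approximately 0.2 M in a total volume of about 266 μL.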

#### **IMMUNOBLOT ANALYSIS**

The protein concentrations of all samples were quantified using a Bicinchoninic Acid Protein Assay Kit (Pierce) against standard solutions of bovine serum albumin. The proteins (2.5 μg) were resolved on SDS-PAGE and electroblotted onto polyvinylidene difluoride membranes. The blots were probed with primary antibodies against the large subunit of Rubisco (1:10,000; Agrisera, cat. no. AS03 037), Toc34 (1:16,000; Agrisera, cat. no. AS03 238) or EGFP (1:4000; Lung and Chuong, 2012), followed by a horseradish peroxidase-conjugated anti-rabbit secondary antibody (1:800,000; Sigma-Aldrich, cat. no. 6154). The chemiluminescence signals were produced using Amersham ECL-Advance solution (GE Healthcare) and captured by exposing the blots to Amersham Hyperfilm ECL films (GE Healthcare), which were developed using a CP1000 Agfa photodeveloper (AGFA). The films were scanned and processed using Adobe Photoshop CS (Adobe Systems). The intensities of immunoreactive bands were densitometrically quantified using the gel analyzer function of ImageJ software v.1.46 (National Institutes of Health, USA).

#### **EXPRESSION AND PURIFICATION OF RECOMBINANT AtToc159MHis**

The M-domain of Toc159 was obtained by PCR amplification from the Arabidopsis cDNA (AC002330) and subcloned into the pET28a(+) expression vector (Novagen) for production of a hexahistidine-tagged recombinant protein (AtToc159MHis). The recombinant protein was purified by immobilized metal ion affinity chromatography (IMAC) under denaturing conditions using Profinity™ IMAC Ni<sup>2+</sup>-charged resin (Bio-Rad). The purified sample of AtToc159MHis was dialyzed against CD buffer [10 mM Tris-HCl (pH 7.5), 5 mM MgCl<sub>2</sub>, 50 mM NaCl, 1 mM DTT].

#### **CIRCULAR DICHROISM**

Circular dichroism spectra were recorded for the recombinant AtToc159MHis protein in the range 190–260 nm using an Aviv 215 spectrometer (Aviv Associates Inc.) and a quartz cuvette of 0.005 cm path length. Two independent samples of AtToc159MHis at 9 μM were tested; for each sample, 4 scans at 0.5 nm/s were made at 0.5 nm intervals, and the spectra were averaged. The percentage of secondary structure was calculated by deconvoluting the averaged circular dichroism spectra using the online DICHROWEB CD secondary structure server (Whitmore and Wallace, 2004).
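
Averaging the scans and converting the raw signal into mean residue ellipticity can be sketched as follows. The conversion uses the standard relation MRE = θ / (10 · c · l · n), with θ in millidegrees, c in mol/L, l in cm and n the number of residues, giving units of deg cm<sup>2</sup> dmol<sup>−1</sup>. The residue count and the raw ellipticity value below are placeholders, and whether the authors normalized by residues or by peptide bonds is not stated, so this is an illustration rather than the procedure actually used.

```python
def average_scans(scans):
    """Point-by-point average of repeated scans recorded at the same wavelengths."""
    return [sum(points) / len(points) for points in zip(*scans)]

def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)

# Concentration (9 uM) and path length (0.005 cm) are from the text;
# theta and n_residues are hypothetical placeholders.
mre = mean_residue_ellipticity(theta_mdeg=-0.9, conc_molar=9e-6,
                               path_cm=0.005, n_residues=200)
print(f"MRE = {mre:.0f} deg cm^2 dmol^-1")
```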

# **RESULTS**

#### **THE CT OF Toc159 CONTAINS CHLOROPLAST-TARGETING AND CHLOROPLAST MEMBRANE-ASSOCIATING INFORMATION**

Our previous bioinformatics analyses predicted that the CT of BsToc159 shares similar physicochemical and structural properties with chloroplast TPs (Lung and Chuong, 2012). Specifically, a putative TP-like chloroplast-sorting signal of 51 amino acids together with a putative stromal processing peptidase cleavage site was identified using the neural network-based ChloroP predictor (Emanuelsson et al., 1999). Accordingly, our previous EGFP-fusion experiments were based on the ChloroP-predicted TP-like region at the CT of BsToc159 plus five additional residues, which successfully directed the reversible association of the passenger proteins with the outer envelope of chloroplasts (Lung and Chuong, 2012). It is a common practice to include some residues from the mature protein downstream of the predicted TP cleavage site when studying targeting of typical chloroplast preproteins (e.g., Lee et al., 2003, 2006, 2009). While the exact number of residues that should be included is not known, the original decision to include 5 additional residues beyond the predicted 51 amino acid TP-like chloroplast sorting signal of BsToc159 (Lung and Chuong, 2012) was based on other studies where approximately 5 amino acids were included (e.g., Ivanova et al., 2004; Smith et al., 2004; Inoue et al., 2010; Okawa et al., 2014). In the present study, we sought to further elucidate the functional region of the novel TP-like sorting signal used by BsToc159 and identify the essential region which mediates the successful integration of the receptor into the outer envelope membrane of chloroplasts. First, we produced a number of transient expression constructs by fusing various lengths of BsToc159 CT ranging from 50 to 100 residues (i.e., C50 to C100) to the CT of EGFP (**Figure 1**). 
To evaluate the efficiency of the variable lengths of BsToc159 CT as a plastid-sorting signal, the EGFP-fusion constructs were subjected to colocalization studies in onion epidermal cells, which were cotransformed with a DsRed2 construct fused with the ferredoxin TP to direct it to plastids (**Figure 2**). Interestingly, although we previously showed that the 56 most C-terminal residues (i.e., the C56 construct) could direct ca. 60% of EGFP signal to the chloroplast envelope using *A. thaliana* mesophyll protoplasts (Lung and Chuong, 2012), the diffuse fluorescent signals of C50, C56 and C60 fusion proteins indicated a cytoplasmic localization in onion epidermal cells (**Figures 2A–C**). Thus, the contrasting subcellular localization patterns of the C56 construct in the two cell types suggested some species-specific preferential targeting of the BsToc159 CT. When the BsToc159 CT was extended to include additional upstream residues (i.e., C70, C80 and C90), the fusion proteins appeared as punctate fluorescent spots, some of which colocalized with the DsRed2-decorated plastids (**Figures 2D–F**). The non-plastid punctate structures that did not colocalize with the DsRed2 signals appeared irregular in size and shape, most likely representing insoluble aggregates due to protein misfolding (**Figures 2D–F**). Nevertheless, a general trend was apparent in that the proportion of plastid-localized EGFP signal increased with the length of the BsToc159 CT fusion from C70 to C90 (**Figures 2D–F**). As the length increased to C100, the vast majority of the EGFP signals colocalized with the DsRed2-decorated plastids, as is evident from the yellow punctate signals in the merge of the two channels and the diagonal scattering pattern of pixels from both channels in the scatter plot (**Figure 2G**).
Occasionally, the C100 fusion proteins also labeled tubular protrusions extending from the DsRed2-decorated plastids, reminiscent of stroma-filled extensions called stromules (**Figure 2G** inset; Köhler and Hanson, 2000). On the other hand, deletion of the C-terminal 56 residues of BsToc159 from the C100 construct completely abolished plastid targeting, leading to diffuse EGFP signals and further confirming that this region contains key plastid-targeting information (**Figure 2H**). Quantitatively, the Pearson's correlation coefficients and the Manders' coefficients (Manders et al., 1993) confirmed that the C100 fusion protein was among the best colocalized with the plastids (**Figures 2I,J**).
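
For readers unfamiliar with the two colocalization measures, the sketch below computes them for a pair of flattened pixel-intensity lists. It follows the usual definitions: Pearson's coefficient is the correlation of the two channels, and Manders' M1 and M2 are the fractions of green and red intensity, respectively, found in pixels where the other channel is above a threshold. The threshold of zero, the absence of background subtraction, and the toy pixel values are assumptions made for illustration; the software and settings used for the published analysis are not described in this section.

```python
from math import sqrt

def pearson(green, red):
    """Pearson's correlation coefficient between two channels."""
    n = len(green)
    mean_g = sum(green) / n
    mean_r = sum(red) / n
    cov = sum((g - mean_g) * (r - mean_r) for g, r in zip(green, red))
    var_g = sum((g - mean_g) ** 2 for g in green)
    var_r = sum((r - mean_r) ** 2 for r in red)
    return cov / sqrt(var_g * var_r)

def manders(green, red, threshold=0):
    """Manders' M1 (green overlapping red) and M2 (red overlapping green)."""
    m1 = sum(g for g, r in zip(green, red) if r > threshold) / sum(green)
    m2 = sum(r for g, r in zip(green, red) if g > threshold) / sum(red)
    return m1, m2

# Toy example: the two channels rise together, so all three scores are maximal
green = [0, 10, 20, 30]
red = [0, 20, 40, 60]
print(pearson(green, red), manders(green, red))
```

Both functions assume non-constant channels with non-zero total intensity; a production implementation would need to guard against division by zero.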

To evaluate the chloroplast-targeting efficiency of the different regions of the BsToc159 CT, the same EGFP-fusion constructs were transiently expressed in *A. thaliana* mesophyll protoplasts. The transfected protoplasts were observed using fluorescence microscopy and the EGFP signals were also densitometrically measured following immunoblot analysis of transfected protoplasts fractionated into soluble and insoluble fractions (**Figure 3**). Under the microscope, the C50 fusion proteins were predominantly observed as diffuse signals with approximately a quarter of the signal detected in the chloroplast membrane-associated fraction (**Figure 3A**). Increasing the length of the BsToc159 CT by 6 residues (i.e., C56) effectively directed 60% of EGFP protein to the chloroplast surface, resulting in the ring-like appearance of fluorescent signals surrounding the chloroplasts (**Figure 3B**). Further increase of the BsToc159 CT by 4 residues (i.e., C60) did not alter the subcellular distribution of EGFP signals qualitatively or quantitatively as compared to the C56 construct, suggesting that the required targeting information of the CT is contained within the C-terminal 56 residues of BsToc159 (**Figure 3C**). The presence of a strong signal in the soluble fraction for the C56 and C60 constructs could be attributed to the absence of a chloroplast membrane anchor, rendering their association with the chloroplast envelope transient/reversible. Alternatively, the elevated levels of these constructs in the soluble fraction could be due to overexpression and therefore slow targeting of the proteins to the chloroplasts leading to cytosolic accumulation. This observation is in agreement with our previous findings that the envelope-associated C56 fusion proteins were susceptible to alkaline extraction, and that the addition of the Toc34 G-domain to the fusion protein effectively boosted the chloroplast membrane-associated signals to over 80% (Lung and Chuong, 2012). 
Similar to our observations in onion epidermal cells, the C70, C80 and C90 constructs produced irregular punctate aggregates in addition to the ring-like signals encircling the chloroplasts (**Figures 3D–F**). Among the three constructs, the fluorescent signals at the chloroplast exterior were most prominent with C90 (**Figure 3F**). The fluorescent signals of the C100 construct were exclusively localized to the chloroplast envelope, whereas removal of the predicted chloroplast-targeting signal (i.e., C100-56) from this construct completely abolished chloroplast targeting, as expected (**Figures 3G,H**). The significantly higher abundance of chloroplast membrane-associated C100 signals compared to that of the C56 could be attributed to the presence of a membrane-anchoring region stabilizing the association between the C100 fusion proteins and the chloroplast envelope. In fact, alkaline extraction of the chloroplasts isolated from transfected protoplasts prior to immunoblot analysis revealed a drastic difference in the relative resistance of the C56 (i.e., 20%) as compared to the C100 (i.e., 80%) fusion protein (**Figure 4**). Taken together, we believe that the essential and sufficient chloroplast-sorting information for BsToc159 is embedded within the C-terminal 56 residues, whereas the immediate upstream sequence is important for anchoring the protein to the chloroplast surface, potentially by an as yet undetermined membrane-associating structure(s), which may not be complete or folded properly in the truncation constructs C70, C80 and C90, leading to the formation of insoluble aggregates (**Figures 2D–F**, **3D–F**).

#### **THE M-DOMAIN OF Toc159 FORMS AN UNCONVENTIONAL MEMBRANE-ANCHOR**

To complement our findings from the truncation experiments, we further investigated the structure of the Toc159 M-domain (**Figure 5**). While the A-domain has been characterized as an intrinsically unstructured domain (Richardson et al., 2009) and the G-domain structure can be deduced from the crystal structure of its GTPase homolog Toc34 (Sun et al., 2002; Reddick et al., 2007; Yeh et al., 2007; Koenig et al., 2008), the structure of the M-domain has not been studied previously. First, to gain insight into its structural organization, the amino acid sequence of BsToc159 was analyzed using IUPred (Dosztanyi et al., 2005) and FoldIndex (Prilusky et al., 2005) to predict intrinsically disordered and structured regions (**Figure 5A**). Combining these predictions with secondary structure prediction by the neural network-based PSIPRED algorithm, we further divided the M-domain of BsToc159 into three subdomains designated as M1, M2 and M3 (**Figure 5B**). The N-terminal region of the M-domain is linked to the central G-domain via a 150-residue M1 region, which is moderately unstructured except for a putative α-helical motif arranged in a predicted coiled-coil structure (**Figures 5A,B**), whereas the C-terminal 56-residue region containing the chloroplast-targeting signal, designated as M3, contains a predicted amphipathic α-helix, which is also a structural feature of TPs (**Figures 5A,B**; Lung and Chuong, 2012). The core region of the M-domain, designated as the M2 subdomain, is predicted to be a β-strand-rich region (**Figure 5B**).
Since the resistance of the C100 fusion protein to alkaline extraction also implied that some structural features within part of this subdomain might be involved in chloroplast outer membrane association (**Figure 4**), we asked if the M2 subdomain had a tendency to fold into a β-barrel, which is a common conformation comprised of multiple amphipathic β-strands that span the outer membranes of gram-negative bacteria and endosymbiotic organelles (Walther et al., 2009). However, the M-domains of BsToc159 and *A. thaliana* homologs have a negligible probability (*P* = 0.05) of adopting a transmembrane β-barrel conformation, according to the PROFtmb prediction program (http://www.predictprotein.org; Bigelow et al., 2004). On the other hand, a BLAST search of the structural database deposited in the Protein Data Bank (PDB) using the *A. thaliana* Toc159 M-domain as query sequence revealed considerable homology (i.e., 47%) of the M2 subdomain with the lipid-binding domain of UDP-3-*O*-acyl-glucosamine *N*-acyltransferase (LpxD), which is predominantly composed of β-strands (**Figure 5C**; Buetow et al., 2007), consistent with our prediction that M2 is a β-strand-rich region (**Figure 5B**). Interestingly, LpxD belongs to a rare family of left-handed β-helical proteins in which each coil is formed by three hexapeptide repeats of a consensus sequence (Buetow et al., 2007).

scales were biolistically co-transformed with an EGFP fusion construct containing the **(A)** BsToc159-C50, **(B)** BsToc159-C56, **(C)** BsToc159-C60, **(D)** BsToc159-C70, **(E)** BsToc159-C80, **(F)** BsToc159-C90, **(G)** BsToc159-C100, or **(H)** BsToc159-C100-56 and a DsRed2 fusion construct with the ferredoxin transit peptide for transient protein expression driven by the constitutive 35S promoter. For each construct, representative images of EGFP (green), DsRed2 (red) and a merge of the two channels are shown. Colocalization of the green and red signals produced yellow signals. Scatter plots show the distribution of the green and red pixels over the range of pixel gray values from 0 to 255. The clustering of pixels from both channels along a diagonal line represents colocalization. **(I)** The Pearson's correlation coefficients (R*<sup>r</sup>*) of the two fluorescent channels. The maximum theoretical R*<sup>r</sup>* score is 1. The values represent the mean of four replicates (± SE). **(J)** Manders' coefficients M1 and M2. M1 indicates the fraction of green pixels which colocalized with red pixels, and M2 indicates the fraction of red pixels which colocalized with green pixels. The values represent the mean of four replicates (± SE). Scale bars = 50 and 5 μm (inset).

**FIGURE 3 | Transient expression of EGFP fusion proteins with BsToc159 C-terminal regions in** *A. thaliana* **protoplasts.** Isolated protoplasts were transfected with various EGFP fusion constructs containing the **(A)** BsToc159-C50, **(B)** BsToc159-C56, **(C)** BsToc159-C60, **(D)** BsToc159-C70, **(E)** BsToc159-C80, **(F)** BsToc159-C90, **(G)** BsToc159-C100, or **(H)** BsToc159-C100-56 for transient protein expression driven by the constitutive 35S promoter. For each construct, representative images of EGFP (green) and chlorophyll fluorescence (red) and a merge of the two channels are shown in the left panel. The subcellular localization was confirmed by immunoblot analysis with an anti-EGFP antibody after subfractionation of the total protoplasts or purified chloroplasts into pellet (P) and soluble (S) fractions (middle panels). Detection with an antibody against Rubisco large subunit (RbcL) served as loading controls for the soluble fractions. The immunoreactive bands of the total protoplast subfractions were densitometrically quantified (right panels). Each value represents the mean of three replicates (± SE). Scale bar = 10 μm.

The secondary structure of purified recombinant AtToc159M refolded in 1% LDAO was then examined using far-UV CD spectroscopy. Qualitatively, the spectrum is indicative of the presence of α-helical elements, depicted by the double minima in ellipticity at 208 nm and 222 nm, as well as β-strand elements, based on the minimum ellipticity at approximately 215 nm (**Figure 5D**; Park et al., 1992). Overall, the spectrum indicates an ordered, but complex, conformation comprised of a combination of helical, sheet and turn elements. The relatively lower minimum ellipticity at 208 nm, in comparison with the minimum at 222 nm, could indicate a tight interaction between the secondary structures and/or the possibility of self-association of the protein (Hoang et al., 2012, 2013). Deconvolution of the spectra using two different reference sets of known CD spectra suggests that AtToc159M is comprised of approximately 39.5% α-helical, 20% β-strand, 21% unfolded and 17.5% turn elements (Supplementary Table S2). This is in agreement with sequence-based structure (PSIPRED) and disorder predictions implicating an α-helical segment within subdomain M1, a smaller segment in the middle of M2, and another large predicted α-helix near the CT of M3 (**Figure 5B**). While these predicted α-helical domains do not appear to amount to 40% of the entire M-domain, the N-terminal helical segment is predicted to form a coiled-coil structure. The qualitative shape of the curve is similar to that of previously characterized coiled-coil proteins, and it is possible that such a structure accounts for the high apparent α-helical content suggested by the deconvolution (Greenfield and Hitchcock-DeGregori, 1993). The prediction that the M-domain contains significant regions of disorder (i.e., segments flanking the coiled-coil region of M1) and β-strand (i.e., a large proportion of the M2 subdomain) is also supported by the deconvolution of the CD spectrum (**Figures 5A,B**).
Taken together, these findings led us to hypothesize that an independent region upstream of the CT TP-like sorting signal adopts a conformation that is involved in the interaction of Toc159 with the chloroplast outer membrane.

#### **THE CT OF BsToc159 DISPLAYS SPECIES-SPECIFIC TARGETING**

Consequently, we further investigated if the membrane-association motif and the TP-like sorting signal identified within the C-terminal 100 residues of BsToc159 are present in other Toc159 homologs. Amino acid sequence alignment of multiple Toc159 homologs from *B. sinuspersici* and *A. thaliana* illustrated high homology of the C-terminal regions among members of the same Toc159 subtype (**Figure 6A**). For instance, the C-terminal 100 residues of BsToc159 exhibit 86% similarity with the aligned region of AtToc159, whereas the equivalent regions of BsToc132 and AtToc132 share 83.5% similarity (**Figure 6A**). In both cases, pairwise comparison revealed that sequence variation is primarily found at the CT ends (**Figure 6A**). On the other hand, the primary sequences are more divergent when comparing the two subtypes (i.e., Toc159 vs. Toc132), which share 58.8% overall similarity and only 19.3% identity (**Figure 6A**). From the sequence alignment, we defined the equivalent regions corresponding to BsToc159-C100 from the other *A. thaliana* (i.e., AtToc159-C101 and AtToc132-C97) and *B. sinuspersici* (i.e., BsToc159-C100 and BsToc132-C96) homologs for EGFP fusion studies in onion epidermal cells (**Figure 6B**) and Arabidopsis mesophyll protoplasts (**Figure 6C**). The fluorescence micrographs showed that the subcellular localization patterns of BsToc132-C96 (**Figures 6B,C**) were qualitatively and quantitatively indistinguishable from those of BsToc159-C100 (**Figures 2G**, **3G**), with strong association of the fusion proteins with the etioplasts of onion epidermal cells and the chloroplast envelopes of mesophyll protoplasts. Surprisingly, AtToc159-C101 and AtToc132-C97 did not produce strong plastid-associated signals in spite of their high primary sequence consensus with BsToc159-C100 and BsToc132-C96, respectively (**Figures 6B,C**).
Neither the AtToc159-C101 fusion protein nor the AtToc132-C97 equivalent colocalized with the DsRed2-decorated etioplasts in onion epidermal cells (**Figure 6B**). In mesophyll protoplasts transfected with AtToc159-C101 or AtToc132-C97 constructs, although fluorescent signals were observable surrounding the chloroplasts, considerable signals were also detected in the soluble fractions as well as associated with non-plastid punctate structures (**Figure 6C**). Taken together, we conclude that the CTs of both *B. sinuspersici* Toc159 homologs (i.e., BsToc159 and BsToc132) could mediate the targeting and stable association of EGFP fusion proteins with etioplasts and chloroplast envelopes, whilst the homologous counterparts from *A. thaliana* produced less conclusive results.

Due to the dissimilar plastid-targeting results when using the CTs of Toc159 homologs from different species, we finally asked if the highly divergent 10 to 15 residues at the ends of the CT tails constitute an important part of the plastid-targeting signal. Chimeric constructs were made by swapping the CT tails between BsToc159-C100 and AtToc159-C101, as well as between BsToc132-C96 and AtToc132-C97 (**Figure 7A**). In onion epidermal cells, both BsToc159-C100 and BsToc132-C96 efficiently directed EGFP to the etioplasts regardless of the swapped CT tails from *A. thaliana* homologs, whilst AtToc159-C101 and AtToc132-C97 could not guide EGFP to plastids despite the presence of *B. sinuspersici* CTs (**Figure 7B**). In mesophyll protoplasts, the targeting of BsToc159-C100 to the chloroplast outer membrane was only slightly diminished by swapping the CT domain with that of AtToc159 (compare **Figure 7C** with **Figure 3G**). In addition, replacing the CT domain of BsToc132 with that of AtToc132 also did not produce any observable effect on chloroplast targeting of BsToc132-C96 (compare **Figure 7C** with **Figure 6C**). On the other hand, replacing the CTs of AtToc159-C101 and AtToc132-C97 with those of the corresponding *B. sinuspersici* homologs did not improve chloroplast targeting of the Arabidopsis proteins (compare **Figure 6C** with **Figure 7C**). The stronger plastid-associated signals obtained using the *B. sinuspersici* constructs compared to those of *A. thaliana* might be attributed to the species-specific sequence differences within the upstream region which stabilize chloroplast envelope association, independently of the highly divergent CT sequence which constitutes the sorting information. In fact, our previous data confirmed that the CTs of AtToc159 and AtToc132 could effectively re-target a Toc34 mutant protein to the chloroplast envelope, suggesting the presence of sufficient chloroplast-sorting information within their sequences (Lung and Chuong, 2012).
In *B. sinuspersici*, the more stable chloroplast envelope association as mediated by the putative single-site variants could have some implications on the insertion of Toc159 receptors into the outer membrane of the dimorphic chloroplasts for differential preprotein targeting.

#### **FIGURE 5 | Continued**

server v3.0 (Jones, 1999). The height of the blue bar for each residue represents the confidence level. Cylinders, arrows and lines symbolize α-helices, β-strands and coils, respectively. Based on the structural prediction, the M-domain is subdivided into the M1, M2 and M3 subdomains: M1 represents a moderately disordered region with a putative α-helical region, which was predicted to fold into a coiled-coil structure by COILS (Lupas et al., 1991); M2 represents a β-strand-rich region; M3 represents the C-terminal 56-residue sorting signal. **(C)** Amino acid sequence alignment of M2 region with the lipid-binding domain of UDP-3-*O*-acyl-glucosamine *N*-acyltransferase (LpxD). BLAST search was performed using the AtToc159 M-domain as query sequence against 3D structures deposited in the RCSB Protein Data Bank (http://www.pdb.org/pdb/search/searchSequence.do). The β-strands which constitute the left-handed β-helix of LpxD (Buetow et al., 2007) are annotated by red (if homologous to M2) and black arrows (otherwise). **(D)** Far-UV CD spectrum of purified recombinant AtToc159M*His*. The protein concentration was 9 μM. MRE is the mean residue ellipticity in degrees cm<sup>2</sup> dmol<sup>−1</sup>. Temperature was 25°C.

# **DISCUSSION**

#### **Toc159 CT REPRESENTS A NEW CLASS OF SORTING SIGNAL TO THE CHLOROPLAST OUTER MEMBRANE**

Our recent discovery of the chloroplast-targeting information embedded within the BsToc159 CT using sequence-based bioinformatics predictions (Lung and Chuong, 2012) raises a number of fundamental questions about the nature of this novel sorting signal. For instance, what is the length of the signal that is essential for chloroplast sorting? Is there a membrane-integration element associated with the signal that is responsible for anchoring the receptor to the chloroplast envelope? Is this signal unique to the Toc159 isoform of the single-cell C4 species? In the current study, we have addressed these questions using transiently expressed fluorescent proteins fused to variable-length truncation and domain-swapped constructs, and have complemented this approach with structural analyses of the M-domain. Collectively, our data point to a novel class of sorting signals present in the Toc159 family of chloroplast protein import receptors for targeting to the chloroplast outer membrane. According to the Plant Proteome Database, approximately 47 different proteins are annotated to reside on the chloroplast outer membrane (http://www.plantsciences.ucdavis.edu/kinoue/OM.htm; http://ppdb.tc.cornell.edu; Inoue, 2007; Sun et al., 2009; Breuers et al., 2011; Inoue, 2011). Although the mechanisms for targeting of these outer envelope proteins (OEPs) have not been completely elucidated, multiple pathways are apparent and, in many cases, the membrane-spanning domains constitute the protein sorting information (for reviews, see Hofmann and Theg, 2005; Bölter and Soll, 2011; Lee et al., 2013). With the exception of Toc75, which relies on an N-terminal TP for chloroplast targeting (Tranel et al., 1995), the other identified integral β-barrel proteins, including OEP21, OEP24 and OEP37, appear to self-insert into the chloroplast outer membrane (Pohlmeyer et al., 1998; Bölter et al., 1999; Goetze et al., 2006).
The majority of α-helical OEPs commonly contain a single hydrophobic α-helix which functions as a transmembrane anchor as well as a sorting signal (Hofmann and Theg, 2005; Bölter and Soll, 2011). Depending on whether the transmembrane domain is located at the N- or CT, these OEPs are broadly classified into the families of signal-anchored proteins (e.g., OEP7, OEP14, HKI, Toc64, CHUP1) (Li et al., 1991; Wiese et al., 1999; Sohrt and Soll, 2000; Lee et al., 2001; Oikawa et al., 2008) and tail-anchored proteins (e.g., OMP24, HPL, Toc34, OEP9) (Fischer et al., 1994; Chen and Schnell, 1997; Froehlich et al., 2001; Dhanoa et al., 2010), respectively. Although it had been originally proposed that these proteins are spontaneously integrated into the destination membrane without any energy requirement or proteinaceous factor (Schleiff and Klösgen, 2001), Bae et al. (2008) discovered a chaperone-like ankyrin repeat protein (i.e., AKR2A) in *Arabidopsis* which binds to the transmembrane domains and the CT regions of tail-anchored proteins, and thereby functions as a cytosolic mediator for their specific sorting to the chloroplast envelope. To meet the criteria of tail-anchored proteins, a protein must exhibit three structural features: (i) the exposure of the majority of the protein to the cytosolic side; (ii) the presence of a single transmembrane domain at or near the CT; and (iii) the protrusion of a short CT tail into the organelle interior (Kutay et al., 1993; Abell and Mullen, 2011). Despite the fact that Toc159 shares some structural resemblance to a tail-anchored protein and its GTPase homolog, Toc34, is a tail-anchored protein (Dhanoa et al., 2010), our studies revealed some important differences. First, we showed that the BsToc159 CT contains some chloroplast-sorting information, but this region does not appear to constitute a hydrophobic transmembrane α-helix (Lung and Chuong, 2012), as demonstrated by the susceptibility of BsToc159-C56 fusion proteins to alkaline extraction (**Figure 4**). In this study, we identified a discrete region within the 60–100 residues from the CT of BsToc159 that constitutes a membrane association domain, as demonstrated by the resistance of BsToc159-C100 proteins to alkaline extraction (**Figure 4**). In contrast to tail-anchored proteins, this membrane-associating region does not contain an insertion signal for the outer membrane of the plastid envelope since the C-terminally truncated construct (i.e., BsToc159-C100-56) produced cytosolic localization of the EGFP proteins (**Figures 2H**, **3H**).
Previous truncation studies indicated that the hydrophilic CT immediately flanking the transmembrane domain of a tail-anchored protein constitutes part of the sorting information but the CT tail itself could not direct fusion proteins to the chloroplast envelope (Lee et al., 2001, 2004; Dhanoa et al., 2010). On the other hand, the 56 residues of BsToc159 CT make up a hydrophilic tail, which could mediate the targeting of ca. 60% of EGFP to the chloroplast surface independently of the putative membrane anchor (**Figure 3B**; Lung and Chuong, 2012). While the ChloroP predictor suggested a 51-residue length of the TP-like sorting signal at the CT end of BsToc159 (Lung and Chuong, 2012), we observed that BsToc159-C56 outperformed BsToc159-C50 in the targeting of EGFP to the chloroplast envelope of *Arabidopsis* protoplasts (**Figures 3A,B**). The higher efficiency of targeting by the C56 construct could indicate that a longer sorting signal is required in the non-native context of a protein fusion. Similarly, it has been shown that a TP length of more than ca. 60 amino acids is required for efficient translocation of a passenger protein into chloroplasts (Bionda et al., 2010). It remains to be determined if the BsToc159 CT and typical chloroplast TPs employ similar targeting and translocation machineries. Previously, the TPs of 208 plastid proteins were grouped into seven subgroups with distinct sequence motifs by hierarchical clustering (Lee et al., 2008). The publicly available algorithm produced by Lee et al. (2008) did not identify any of the consensus motifs from the BsToc159 sequence (data not shown). The critical chloroplast-sorting motifs of the BsToc159 CT may be unraveled in the future by the equivalent alanine substitution approach used by Lee et al. (2008). In conclusion, we have multiple lines of evidence to support the notion that Toc159 is not a tail-anchored protein but is targeted to the chloroplast surface via a novel pathway under the guidance of a non-canonical sorting signal at the CT.

**FIGURE 6 | Transient expression of EGFP fusion proteins with the C-terminal regions of other Toc159 homologs. (A)** Amino acid sequence alignment of the Toc159 homologs from *B. sinuspersici* and *A. thaliana*. Sequence homologies among the Toc159 and Toc132 isoforms are shown in red and blue boxes, respectively. Alignment was performed using the AlignX module of Vector NTI Advance™ 10.3.0 (Invitrogen) and is displayed using the default color scheme: a red foreground on a yellow background denotes a 100% conserved residue; a dark green foreground on a white background denotes a residue with weak similarity to the consensus residue at a given position; a black foreground on a light green background denotes a consensus residue in a block of similar residues at a given position; a blue foreground on a cyan background denotes a conserved residue with 50% or higher identity at a given position; a black foreground on a white background denotes a non-similar residue. **(B)** Colocalization analysis of EGFP fusion proteins in onion epidermal cells. EGFP was fused to the C-terminal regions of other Toc159 homologs equivalent to the EGFP-BsToc159-C100 construct based on the protein alignment as shown in **(A)**. Details are the same as in **Figure 2**. **(C)** Transient expression of EGFP fusion proteins in isolated *A. thaliana* protoplasts. Details are the same as in **Figure 3**.

**FIGURE 7 | Swapping of the short C-terminal tails between Toc159 homologs. (A)** Schematic representation of the C-terminal swapping constructs. Based on the amino acid sequence alignment (see details in **Figure 6A**), the highly variable C-terminal ends, as depicted by double arrows, were swapped between the *B. sinuspersici* and *A. thaliana* isoforms. The Toc159 and Toc132 isoforms are depicted in red and blue, respectively. **(B)** Colocalization analysis of EGFP fusion proteins in onion epidermal cells. Details are the same as in **Figure 2**. **(C)** Transient expression of EGFP fusion proteins in isolated *A. thaliana* protoplasts. Details are the same as in **Figure 3**.

#### **Toc159 IS UNCONVENTIONALLY ANCHORED TO THE CHLOROPLAST OUTER MEMBRANE**

In addition to shedding light on the nature of a novel chloroplast-sorting signal within the BsToc159 CT, the present study has provided the first insight into a long-standing question regarding how the Toc159 receptor is associated with the chloroplast outer membrane. At the time of its discovery, independent researchers consistently observed a 52-kDa protease-protected product of Toc159 (formerly known as OEP86) in pea and Arabidopsis after "shaving" the cytosolically exposed proteins/protein domains from the surface of isolated chloroplasts by treatment with thermolysin, an outer membrane-impermeable protease (Waegemann et al., 1992; Hirsch et al., 1994; Kessler et al., 1994; Bölter et al., 1998; Bauer et al., 2000; Chen et al., 2000). The M-domain has been defined based on this biochemical evidence, but it has never been clear how, exactly, the entire C-terminal 52-kDa portion of Toc159 is associated with the chloroplast envelope. Because no study has yet addressed this issue, it is of particular interest to us to examine the nature of the membrane anchor of Toc159. Proteins traversing the envelope membranes of endosymbiotic organelles are structurally classified into two groups: α-helical transmembrane proteins and β-barrel proteins (Lee et al., 2013, 2014). Previous hydrophilicity analyses ruled out the possibility of Toc159 belonging to the former family due to the absence of a transmembrane α-helix (Kessler et al., 1994; Lung and Chuong, 2012). Although secondary structure prediction using the M-domain sequence of BsToc159 as query identified 16 consecutive β-strands in the central region (designated as the "M2 region" in this study; **Figure 5B**), they are too short (2–8 residues per strand; mean = 5.1 residues per strand) to represent the membrane-spanning regions (6–25 residues per strand) of a β-barrel protein (Taylor et al., 2006), which is in agreement with the negative result from the PROFtmb β-barrel predictor program (data not shown).
Based on sequence analyses, we believe that the CTs of the Toc159 isoforms form a noncanonical anchor to the chloroplast outer membrane. The high homology of the central region of the AtToc159 M-domain (the M2 region) with the lipid-binding domain of LpxD, a left-handed β-helical protein, is consistent with our secondary structure prediction suggesting the presence of a short β-strand-rich region in the M-domain of Toc159, and also fosters the idea that Toc159 is anchored to the chloroplast outer membrane in a non-α-helical and non-β-barrel-dependent manner. Furthermore, the ratio of mean ellipticity at 208 nm and 220 nm of recombinant AtToc159 M-domain suggests the presence of associated forms of the protein (Hoang et al., 2013). The associations could be intramolecular interactions such as those that occur in the coiled-coil or β-coil structures, and/or intermolecular interactions between M-domain monomers. The precise nature of the associations cannot be elucidated from the current circular dichroism data, but the presence of associated forms is consistent with the predicted secondary structure elements of the M-domain (**Figure 5**). In the absence of additional structural data, it is premature to conclude that Toc159 adopts a lipophilic β-helix for associating with the chloroplast outer membrane.
However, multiple lines of evidence support the notion that the M2 region of Toc159 constitutes a non-canonical membrane anchor: (i) a portion of the M2 sequence was sufficient to confer resistance of BsToc159-C100, but not BsToc159-C56, to alkaline extraction (**Figure 4**); (ii) the 180 residues from the CT end of PsToc159 also anchored a fusion protein to the chloroplast envelope with resistance to alkaline extraction (Muckel and Soll, 1996); (iii) truncation of BsToc159-C100 in blocks of 10 residues (i.e., C90, C80 and C70) progressively abolished the plastid-associated signals of EGFP fusion proteins (**Figure 2**); and (iv) the formation of irregular punctate structures of BsToc159-C70, C80 and C90 signals could be attributed to the disruption of an ordered structure essential for chloroplast association. In the future, a more in-depth structural analysis of the Toc159 M-domain will provide additional information about the nature of the unconventional membrane anchor, which will lead to insights into the function and mechanism of action of the Toc159 receptor. For instance, a number of reports have documented the partitioning of Toc159 between the cytosol and the chloroplast envelope, which suggests the possibility that Toc159 is a cycling receptor for preprotein recognition (Hiltbrunner et al., 2001; Bauer et al., 2002; Lung and Chuong, 2012). Thus, an unconventional membrane anchor (e.g., β-helix), in contrast with a transmembrane α-helix or a β-barrel, may account for the reversible association of Toc159 with the chloroplast outer membrane in support of the "cycling" hypothesis for Toc159-facilitated targeting of chloroplast preproteins (Hiltbrunner et al., 2001; Ivanova et al., 2004; Smith, 2006).

#### **SPECIFICITY FOR THE TARGETING OF Toc159 TO THE PLASTID ENVELOPE**

Our discovery of a novel sorting signal at the Toc159 CT raised another interesting question regarding factors that interact with the sorting signal to mediate the specific targeting of Toc159 to the chloroplast envelope. In this study, we showed that the TP-like sorting signal at the CT of BsToc159 (i.e., BsToc159-C56) could guide EGFP to chloroplasts of *A. thaliana* mesophyll protoplasts (**Figure 3B**) but not to plastids of onion epidermal cells (**Figure 2B**). Similarly, it has been shown that some TPs guided protein import preferentially into one plastid type over others (Wan et al., 1996; Yan et al., 2006). Elkehal et al. (2012) demonstrated that the different composition of lipids in chloroplast membranes could influence the Toc-mediated binding and import of preproteins into outer envelope vesicles, and more recently, Kim et al. (2014) have shown that lipids of the outer membrane serve as the receptor for AKR2A. In addition to the effect on protein import, the lipids are known to be determinants of the topology, folding and integration of membrane proteins (Schleiff et al., 2001; Dowhan and Bogdanov, 2009). However, no significant difference has been found in the glycerolipid composition of envelope membranes from chloroplasts and non-green plastids (Douce and Joyard, 1979), and it has been shown previously that the lipid composition of plastids does not change during the greening of wheat leaves (Bahl et al., 1976). Thus, we hypothesize that the distinctive subcellular localization patterns of BsToc159-C56 proteins in onion epidermal cells and *Arabidopsis* mesophyll protoplasts are unlikely to be attributable to the lipid composition of the plastid envelope membranes. Alternatively, species-specific properties of the *Bienertia* CT may contribute to this differential subcellular localization.
Any negative correlation between the targeting efficiency of BsToc159-C56 to proteoliposomes and their lipid compositions would reinforce the idea that some unknown proteinaceous factors interact with the sorting signal of Toc159 and mediate a specific subcellular sorting pathway. Due to the resemblance of the CT sorting signal of Toc159 to a typical chloroplast preprotein TP (Lung and Chuong, 2012), it is plausible that the Toc machinery plays a similar role in the recognition of Toc159 CTs at the chloroplast surface. In fact, Wallas et al. (2003) reported that both binding and insertion of AtToc159 proteins into proteoliposomes required Toc34 and Toc75. In this regard, the fact that different Toc complexes are assembled in green and non-green cell types with dissimilar substrate specificities (Bauer et al., 2000; Ivanova et al., 2004; Kubis et al., 2004; Smith et al., 2004; Dutta et al., 2014) is consistent with our observation of differential targeting of BsToc159-C56 proteins to etioplasts and chloroplasts (**Figures 2B**, **3B**). In addition to the Toc machinery at the chloroplast surface, the general import pathway of chloroplast preproteins involves chaperones, co-chaperones and other cytosolic factors (for review, see Lee et al., 2013). Although tail-anchored membrane proteins could be sorted efficiently to the chloroplast envelope with high fidelity in the absence of any cytosolic factor, the efficiency was higher with the supplementation of complete cytosol, Hsp70 or Hsp90 (Kriechbaumer and Abell, 2012). Recently, it has been demonstrated that AKR2A functions as a cytosolic mediator for targeting of outer envelope membrane proteins such as OEP7, Toc34, and OEP9 to the chloroplast (Bae et al., 2008; Dhanoa et al., 2010; Richardson et al., 2014).
Although a number of other cytosolic receptors and chaperones for the targeting of chloroplast outer envelope proteins have also been identified (for review, see Lee et al., 2013), no cytosolic factors that specifically interact with the sorting signal of Toc159 have yet been reported. Given the observation that the CT sequences of Toc159 isoforms from *B. sinuspersici* outperformed that of *A. thaliana* in the sorting of fusion proteins to the plastid envelope (**Figures 6**, **7**), a thorough interactome study of the Toc159 CT would not only reveal additional details about the chloroplast-sorting pathway of Toc159 but also further our understanding of the mechanism of selective protein import into dimorphic chloroplasts in the single-cell C4 system.

# **ACKNOWLEDGMENTS**

This research was supported by Discovery Grants from the Natural Sciences and Engineering Research Council of Canada (NSERC) to Simon D. X. Chuong and Matthew D. Smith (Grant number 312143-2010), an NSERC Discovery Accelerator Supplement to Matthew D. Smith (Grant number 396033-2010), and the University of Waterloo Start-Up Fund to Simon D. X. Chuong. The authors gratefully acknowledge Dr. Tzvi Tzfira (University of Michigan) for providing the pSAT6 vectors, and Tuan Hoang and Dr. Masoud Jelokhani-Niaraki (Wilfrid Laurier University) for assistance with interpretation of the CD data.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00722/abstract

### **REFERENCES**


estimated energy content. *Bioinformatics* 21, 3433–3434. doi: 10.1093/bioinformatics/bti541


Inoue, K. (2011). Emerging roles of the chloroplast outer envelope membrane. *Trends Plant Sci.* 16, 550–557. doi: 10.1016/j.tplants.2011.06.005


Toc159 to the chloroplast outer membrane. *Plant Cell* 24, 1560–1578. doi: 10.1105/tpc.112.096248


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 13 July 2014; accepted: 30 November 2014; published online: 23 December 2014.*

*Citation: Lung S-C, Smith MD, Weston JK, Gwynne W, Secord N and Chuong SDX (2014) The C-terminus of Bienertia sinuspersici Toc159 contains essential elements for its targeting and anchorage to the chloroplast outer membrane. Front. Plant Sci. 5:722. doi: 10.3389/fpls.2014.00722*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Lung, Smith, Weston, Gwynne, Secord and Chuong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A functional TOC complex contributes to gravity signal transduction in Arabidopsis

#### *Allison K. Strohm<sup>1</sup>, Greg A. Barrett-Wilt<sup>2</sup> and Patrick H. Masson<sup>1</sup>\**

*<sup>1</sup> Graduate Program in Cellular and Molecular Biology, Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, USA <sup>2</sup> Mass Spectrometry/Proteomics Facility, University of Wisconsin—Madison, Madison, WI, USA*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Abidur Rahman, Iwate University, Japan Sarah Evelyn Wyatt, Ohio University, USA*

#### *\*Correspondence:*

*Patrick H. Masson, Laboratory of Genetics, University of Wisconsin-Madison, 425G Henry Mall, Madison, WI 53706, USA e-mail: phmasson@wisc.edu*

Although plastid sedimentation has long been recognized as important for a plant's perception of gravity, it was recently shown that plastids play an additional function in gravitropism. The Translocon at the Outer envelope membrane of Chloroplasts (TOC) complex transports nuclear-encoded proteins into plastids, and a receptor of this complex, Toc132, was previously hypothesized to contribute to gravitropism either by directly functioning as a gravity signal transducer or by indirectly mediating the plastid localization of a gravity signal transducer. Here we show that mutations in multiple genes encoding TOC complex components affect gravitropism in a genetically sensitized background and that the cytoplasmic acidic domain of Toc132 is not required for its involvement in this process. Furthermore, mutations in *TOC132* enhance the gravitropic defect of a mutant whose amyloplasts lack starch. Finally, we show that the levels of several nuclear-encoded root proteins are altered in *toc132* mutants. These data suggest that the TOC complex indirectly mediates gravity signal transduction in Arabidopsis and support the idea that plastids are involved in gravitropism not only through their ability to sediment but also as part of the signal transduction mechanism.

**Keywords: gravitropism, roots, TOC complex, plastid, signal transduction, Arabidopsis**

# **INTRODUCTION**

Root gravitropism allows plants to anchor themselves while exploring their environments to gain access to water and nutrients and to avoid obstacles and toxins. In roots, gravity sensing occurs primarily in the columella region of the cap where the cells contain dense, starch-filled amyloplasts that sediment in response to reorientation within the gravity field. Amyloplast sedimentation triggers changes in the localization of plasma membrane-associated auxin efflux facilitators, leading to the accumulation of auxin on the lower side of the root. Upon transmission to the elongation zones, the resulting auxin gradient promotes differential cellular elongation between the upper and lower flanks, resulting in downward curvature. Possible second messengers in this process include Ca<sup>2+</sup>, inositol 1,4,5-trisphosphate, and protons. It is still unknown how amyloplast sedimentation leads to an auxin gradient (reviewed in Strohm et al., 2012).

Previously, we showed that ALTERED RESPONSE TO GRAVITY 1 (ARG1) is a peripheral membrane protein that is necessary for a full gravitropic response (Sedbrook et al., 1999; Boonsirichai et al., 2003). *arg1* mutants do not display the characteristic cytoplasmic alkalinization or generate an auxin gradient across the root cap upon reorientation. Expressing *ARG1* in the gravity-sensing cells (statocytes) of only the root or the shoot restores gravitropism only in that organ. Together, these data suggest that ARG1 functions in the statocytes in the early phases of gravity signal transduction. ARG1 localizes to some of the same components of the vesicle trafficking pathway as the PIN auxin efflux carriers and therefore may affect their localization or activity. However, GFP-ARG1 signal is absent from plastids (Boonsirichai et al., 2003). Because *arg1* single mutants display only a partial gravitropic defect and still respond slowly to reorientation, they were used in an enhancer screen. This approach identified *MODIFIERS OF ARG1 1* and *2* (*MAR1* and *2*), which encode components of the Translocon at the Outer envelope membrane of Chloroplasts (TOC) complex (Stanga et al., 2009). Although *mar* single mutants grow normally, *arg1 mar* double mutants show no response to gravity (Stanga et al., 2009).

TOC complexes transport nuclear-encoded proteins into plastids. These complexes consist of a pore (Toc75/MAR1), a Toc159 family receptor (Toc159, Toc132/MAR2, Toc120, or Toc90), and a Toc34 family receptor (Toc33 or Toc34/PPI3). The Toc159 family members contain an N-terminal cytoplasmic acidic domain, a GTP-binding domain, and a C-terminal membrane domain. Most of the variation between these family members occurs within the acidic domain, which has been implicated in substrate selectivity (Inoue et al., 2010). The Toc34/PPI3 members contain only the GTP-binding and membrane domains. Toc132, Toc120, Toc34, and Toc75 are thought to assemble into complexes that tend to import plastid-associated proteins not directly involved in photosynthesis, whereas the import of photosynthesis-related proteins into chloroplasts seems to be mediated mainly by complexes that include Toc159, Toc33, and Toc75 (Ivanova et al., 2004; Kubis et al., 2004).

Because ARG1 and the TOC complex localize to different parts of the cell and have different functions, it is not obvious why the corresponding mutations show a strong genetic interaction within the gravitropism signaling pathway. Several hypotheses have been proposed to explain this result (Stanga et al., 2009). In the direct interaction hypothesis, Toc132 directly functions as an amyloplast-associated ligand that interacts with an endoplasmic reticulum (ER)- or plasma membrane-associated receptor upon sedimentation onto these structures (**Figure 1A**). The genetic interaction between *TOC75* and *ARG1* is consistent with this hypothesis if Toc75 is required for the proper plastid targeting of Toc132. In the targeted interaction hypothesis, the TOC complex mediates the plastid membrane localization of a molecule that interacts as a ligand with a receptor on the ER or plasma membrane (**Figure 1B**). In the indirect interaction model, the TOC complex facilitates the plastid import of a molecule that does not physically interact with a receptor but is needed for a gravitropic response in an *arg1* background (**Figure 1C**). The work described here is aimed at testing the direct interaction hypothesis as a first step toward clarifying the role of plastids in the signal transduction cascade between amyloplast sedimentation and auxin redistribution.

#### **MATERIALS AND METHODS**

#### **PLANT MATERIALS AND GROWTH CONDITIONS**

*toc120-3* (SALK\_017374) and *ppi3-1* were provided by Paul Jarvis (Constan et al., 2004; Kubis et al., 2004), and *toc120-3* was backcrossed to WS wild type before use. *mar2-1* was previously described as a mutation in the *TOC132* gene (AT2G16640) (Stanga et al., 2009). To comply with the Arabidopsis nomenclature, we have renamed this allele *toc132-4<sup>mar2-1</sup>* in this manuscript. The *arg1-2* and *arg1-2 toc132-4<sup>mar2-1</sup>* mutants have also been described previously (Sedbrook et al., 1999; Stanga et al., 2009).

The seeds were sterilized by washing with 95% ethanol. They were plated on half-strength buffered Linsmaier and Skoog medium containing macro- and micro-nutrients, vitamins, and 1.5% sucrose (Caisson Laboratories, North Logan, UT) supplemented with 1.5% agar type E (Sigma-Aldrich, St. Louis, MO) unless otherwise indicated. The seedlings were grown in a Conviron (Asheville, NC) TC16 growth chamber set at 22°C with a 16 h light/8 h dark cycle. The light intensity was 50–70 µmol m<sup>−2</sup> s<sup>−1</sup> and was provided by cool white fluorescent bulbs (Grainger, Lake Forest, IL).

#### **TRANSGENIC CONSTRUCTS**

The bases encoding the Toc132 GTP-binding and membrane domains (bases 1365–3618 from the start codon) were amplified from Col-0 DNA with the addition of a start codon. This region was cloned in between the AttL1 and AttL2 sites in the Gateway entry vector pENTR/D-TOPO (Life Technologies, Carlsbad, CA). An LR reaction was then performed to transfer this region into the binary vector pMDC32, which placed Toc132GM under the control of the CaMV 35S promoter and the NOS terminator (Curtis and Grossniklaus, 2003; Xu and Li, 2008).

This construct was sequenced and introduced into *arg1-2 toc132-4<sup>mar2-1</sup>* plants using the Agrobacterium-mediated floral dip method (Clough and Bent, 1998). T1 transformants were selected and self-pollinated (Harrison et al., 2006), and T3 seeds likely to carry two copies of the transgene as determined by antibiotic resistance were used.

**FIGURE 1 | Possible models explaining the genetic interaction between** *ARG1* **and** *TOC132***. (A)** In the direct interaction model, Toc132 acts as a ligand that interacts with a receptor (green oval) on the ER or plasma membrane. The localization or the activity of the receptor is mediated by ARG1. **(B)** In the targeted interaction model, Toc132 facilitates the plastid localization of a molecule acting as a ligand (pink shape) that interacts with a receptor (green oval) on the ER or plasma membrane upon amyloplast sedimentation. The localization or activity of the receptor is mediated by ARG1. **(C)** In the indirect interaction model, Toc132 facilitates the plastid localization of a molecule that does not act as a ligand but is still required for a gravitropic response in an *arg1* background. In all three panels, the Toc33/34 receptor was omitted from the drawing for the sake of clarity and simplification of the model.

#### **ROOT REORIENTATION KINETICS**

The seeds were embedded within the medium described above supplemented with 0.7% agar. After at least 2 days of stratification, the seedlings were grown vertically as described above for 8 days. The plates were then turned horizontally, and photographs were taken at select time points. The root tip angle of each seedling at each time point was measured using Adobe Photoshop.
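The angles in this study were measured by hand in Adobe Photoshop. Purely as an illustration of the same quantity, the sketch below computes a tip angle from two points marked along the root tip and summarizes replicate seedlings as a mean with its standard error. The function names, the pixel-coordinate inputs, and the sign convention (0° horizontal, −90° pointing straight down) are assumptions made for this example and are not taken from the article.

```python
import math
from statistics import mean, stdev


def root_tip_angle(base, tip):
    """Return the tip angle in degrees for a segment running from base to tip.

    Points are (x, y) pixel coordinates with y increasing downward, as in
    most image editors. Assumed convention: 0 is horizontal and -90 is
    straight down.
    """
    dx = tip[0] - base[0]
    dy = tip[1] - base[1]
    # Flip dy so that "down" in the image gives a negative angle.
    return math.degrees(math.atan2(-dy, dx))


def mean_and_standard_error(angles):
    """Mean and standard error of the mean for one genotype at one time point."""
    return mean(angles), stdev(angles) / math.sqrt(len(angles))
```

With this convention, a root that has fully reoriented after the plate is turned would approach −90°, and the standard error corresponds to the error bars reported for groups of seedlings.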

#### **PROTEIN EXTRACTION AND MASS SPECTROMETRY**

WS and *toc132-4<sup>mar2-1</sup>* seedlings were grown vertically for 2 weeks on medium supplemented with 0.6% agarose with one genotype grown on medium containing natural abundance ammonium nitrate and potassium nitrate (Sigma-Aldrich) and the other grown on 15N-enriched ammonium nitrate and potassium nitrate (Cambridge Isotope Laboratories, Tewksbury, MA) as previously described (Kline et al., 2010; Minkoff et al., 2012). The roots from approximately 300 seedlings per sample were dissected, combined, frozen in liquid nitrogen, and ground using a Mixer Mill 200 (Retsch, Haan, Germany). This process was repeated at non-overlapping times for a total of three trials. For each trial, each genotype was grown on both nitrogen sources for a total of six samples, each containing WS wild-type and *toc132-4<sup>mar2-1</sup>* tissue. Therefore, for each trial, one sample contained WS seedlings grown on natural abundance nitrogen media and *toc132-4<sup>mar2-1</sup>* seedlings grown on 15N-enriched media, while the second sample contained *toc132-4<sup>mar2-1</sup>* seedlings grown on natural abundance nitrogen media and WS seedlings grown on 15N-enriched media. For each sample and in each trial, the proteins were extracted, trypsin digested, subjected to a solid-phase extraction, and analyzed on an LTQ Orbitrap XL mass spectrometer (Thermo Fisher, Waltham, MA) as previously described (Minkoff et al., 2012). Mascot software v2.2.2 (Perkins et al., 1999) was used to compare the mass spectra to sequences present in The Arabidopsis Information Resource protein database (Lamesch et al., 2012). The settings for Mascot searches included permitting up to two missed cleavages. Deamidation of asparagine and glutamine residues and oxidation of methionine residues were set as variable modifications. Cysteines were searched in carbamidomethylated form as a fixed modification.
The precursor mass tolerance was set to 20 ppm (allowing for the selection of precursors from the monoisotopic or first or second 13C isotopes), and fragment tolerance was set to 0.5 Da. The Mascot output was filtered to a 1.0% false discovery rate using a concatenated decoy database strategy with an in-house written script. Census software (Park et al., 2008) and additional in-house scripts (http://www.biotech.wisc.edu/sussmanlab/research/supporting\_Minkoff\_2013) were used to make quantitative ratio comparisons between the 14N- and 15N-labeled signals, and the ratios were normalized to 1 based on the median of each trial. 14N/15N (light/heavy) ratios were determined for each peptide and were then averaged for all the peptides in each protein. The 14N/15N ratios of the samples in which WS wild type was labeled with 14N were compared to the 14N/15N ratios of the samples in which WS wild type was labeled with 15N using a Student's *t*-test and a significance level of *p* < 0.1. For the proteins in which one of these ratios was greater than 1.15 and the other was less than 0.85, the ratio between the genotypes was calculated by taking the inverse of the 14N/15N ratio for the samples in which *toc132-4<sup>mar2-1</sup>* was labeled with 15N and averaging the six *toc132-4<sup>mar2-1</sup>*/WS values. The data are available in Data Sheet 1.
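The ratio arithmetic described above can be illustrated with a short sketch. This is not the Census workflow or the authors' in-house scripts. The function names and data layout are hypothetical, the order of normalization relative to peptide averaging is an assumption, and the *t*-test step is omitted. The sketch shows only median normalization, the reciprocal-label consistency check with the 1.15 and 0.85 cutoffs, and the inversion that puts all six values on a mutant/WS scale.

```python
from statistics import mean, median


def normalize_to_median(ratios):
    """Scale the 14N/15N ratios of one trial so that their median equals 1.

    `ratios` maps a protein identifier to its light/heavy ratio (hypothetical layout).
    """
    m = median(ratios.values())
    return {protein: r / m for protein, r in ratios.items()}


def protein_ratio(peptide_ratios):
    """Average the peptide-level 14N/15N ratios into one protein-level ratio."""
    return mean(peptide_ratios)


def mutant_over_wild_type(ws_light, ws_heavy, high=1.15, low=0.85):
    """Combine reciprocal labelings into a single mutant/WS ratio.

    ws_light: 14N/15N ratios from samples with WS labeled with 14N (WS/mutant).
    ws_heavy: 14N/15N ratios from samples with WS labeled with 15N (mutant/WS).
    Returns None unless one mean ratio is above `high` and the other below `low`.
    """
    a, b = mean(ws_light), mean(ws_heavy)
    reciprocal = (a > high and b < low) or (a < low and b > high)
    if not reciprocal:
        return None
    # Invert the WS/mutant values so that every value is expressed as mutant/WS.
    return mean([1.0 / r for r in ws_light] + list(ws_heavy))
```

For example, a protein with WS/mutant ratios of 0.5, 0.4 and 0.5 in one labeling and mutant/WS ratios of 2.0, 2.5 and 2.0 in the reciprocal labeling would pass the check and be reported as more abundant in the mutant.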

#### **RNA ISOLATION AND qRT-PCR**

Seedlings were grown as described for the mass spectrometry experiment using medium supplemented with natural abundance ammonium nitrate and potassium nitrate. The roots were dissected with a scalpel, and RNA was prepared using the QIAGEN RNeasy Plant Mini Kit (QIAGEN, Hilden, Germany). The RNA samples were treated with RQ1 DNase (Promega, Madison, WI) according to the instructions. This procedure was conducted at three non-overlapping times for a total of three biological replicates. cDNA synthesis and quantitative real-time PCR (qRT-PCR) were conducted simultaneously using a qScript™ One-Step qRT-PCR Kit (Quanta Biosciences, Gaithersburg, MD) as recommended. Four technical replicates per sample were included. The samples were run on a Roche Light Cycler 480 and analyzed using LinRegPCR (Ramakers et al., 2003). Reference genes with similar expression levels to the candidate genes were chosen (Czechowski et al., 2005); At1g58050 was used for AT4G23690 and GDH3, At4g27960 was used for MAJOR LATEX PROTEIN LIKE 1, and At1g13320 was used for all other genes.
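As a hedged illustration of how expression values of this kind are typically derived, the sketch below follows the general LinRegPCR idea of estimating a starting concentration from the fluorescence threshold, the amplification efficiency and the quantification cycle, and then expressing a candidate gene relative to its reference gene. The function names are hypothetical, and averaging technical replicates with an arithmetic mean is an assumption; the article does not describe these details.

```python
from statistics import mean


def starting_concentration(threshold, efficiency, cq):
    """Estimate N0 = Nq / E**Cq, in arbitrary fluorescence units.

    `efficiency` is the fold increase per cycle (2.0 means perfect doubling).
    """
    return threshold / efficiency ** cq


def relative_expression(target_n0, reference_n0):
    """Ratio of mean target N0 to mean reference N0 across technical replicates."""
    return mean(target_n0) / mean(reference_n0)
```

Comparing this ratio between genotypes for each biological replicate gives the direction of change that can be set against the proteomic result.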

# **RESULTS**

#### **MUTATIONS IN** *TOC120* **AND** *TOC34/PPI3* **ENHANCE THE** *arg1* **PHENOTYPE**

If Toc132 specifically acts as a ligand as proposed in the direct interaction model, mutations in other TOC complex receptors are unlikely to enhance the *arg1* gravitropic defect. However, if Toc132 mediates the gravitropic response through its role in plastid protein import, then mutations in other TOC complex receptor genes such as *TOC120* and *TOC34/PPI3* may also enhance the *arg1* mutant phenotype. Therefore, we created *arg1-2 toc120-3* and *arg1-2 ppi3-1* plants and analyzed their gravitropic responses. Although the roots of these double mutants did not grow in completely random directions as did *arg1-2 toc132-4<sup>mar2-1</sup>* and *arg1-2 mar1-1* roots (Stanga et al., 2009), they showed significantly more variable root growth on hard agar surfaces relative to the single mutants (**Figure 2**). This result suggests that mutations in multiple TOC complex receptors can enhance the *arg1* gravitropic defect, and therefore, protein import via the TOC complex is important for gravitropism. These data indicate that the direct interaction model is unlikely.

#### **THE ACIDIC DOMAIN OF Toc132 IS NOT REQUIRED FOR ITS ROLE IN GRAVITROPISM**

If the direct interaction model is not supported and Toc132 does not act directly as a ligand, we expect a Toc132 construct that lacks the acidic region but still retains the GTP-binding and membrane domains (Toc132GM) to restore the gravitropic response in an *arg1 toc132* mutant background to that of *arg1* single mutants. Indeed, such a truncated protein is likely to retain its function in protein import into plastids (Inoue et al., 2010) even though it lacks the acidic domain. This domain is the region most likely to act as a ligand because it protrudes into the cytosol and is the most divergent among the TOC complex receptors. We expressed *TOC132GM* in *arg1-2 toc132-4<sup>mar2-1</sup>* mutant plants under the constitutive CaMV 35S promoter and found that the transformed seedlings showed similar root reorientation kinetics to *arg1-2* single mutants (**Figure 3**). This result demonstrates that the acidic domain of Toc132 is not required for a gravitropic response, and it is consistent with both the targeted and the indirect interaction models.

#### **MUTATIONS IN** *TOC132* **ENHANCE THE** *pgm1* **PHENOTYPE**

We next sought to distinguish between the targeted and indirect interaction models. In the targeted interaction model, the signal transducer imported by the TOC complex triggers signal transduction upon amyloplast sedimentation, whereas the indirect interaction model postulates a role for this plastid-localized transducer that may not rely on amyloplast sedimentation. Therefore, to determine whether the TOC complex functions in conjunction with amyloplast sedimentation, we generated *toc132-4<sup>mar2-1</sup> pgm1-1* double mutants. *pgm1-1* single mutants lack starch, and their amyloplasts do not sediment, although they still display some response to gravity (Caspar and Pickard, 1989; Kiss et al., 1989). We found that *toc132-4<sup>mar2-1</sup> pgm1-1* double mutants displayed stronger gravitropic defects than *pgm1-1* single mutants (**Figure 4**). This result is not directly compatible with the targeted interaction model, in which we would expect no enhancement of the gravitropic defect, although it does not rule it out either as the double mutant still retains some gravitropic response.

#### **THE PROTEOME AND TRANSCRIPTOME ARE ALTERED IN** *toc132* **MUTANTS**

To identify candidate proteins that might not be properly imported into plastids in *toc132* mutants, we analyzed the wild-type and *toc132-4<sup>mar2-1</sup>* proteomes of whole root tissue. A similar approach was recently used to investigate the levels of plastid-associated proteins using *toc159* whole leaf tissue (Bischof et al., 2011). We expected proteins that were not properly imported into plastids in *toc132* mutants to be degraded or to accumulate and therefore to show differences in expression between the two genotypes. We identified only one protein present at different levels between wild type and *toc132-4<sup>mar2-1</sup>* mutants that was highly likely to localize to plastids (Baginsky and Gruissem, 2009), NUCLEOSIDE DIPHOSPHATE KINASE 3 (NDPK3) (**Table 1**). However, we found 25 nucleus-encoded proteins present at different levels between wild type and *toc132-4<sup>mar2-1</sup>* that localize to other regions of the cell or whose localizations are unknown. Sixteen of these proteins were more abundant in *toc132-4<sup>mar2-1</sup>*, and nine were less abundant (**Table 1**). Nine of the proteins that were more abundant in *toc132-4<sup>mar2-1</sup>* and seven of those that were less abundant are annotated as functioning in stress responses (Provart and Zhu, 2003).

−120 ± 29, −120 ± 28, −129 ± 29, −120 ± 29, −120 ± 28, −129 ± 28, −129 ± 28, −121 ± 28, −129 ± 29, −121 ± 29, and −120 ± 28. *arg1-2* and *arg1-2 toc132-4<sup>mar2-1</sup>* [35S::Toc132GM] root tip angles were not significantly different from each other at any time point (*p* > 0.05, Student's *t*-test). The error bars represent the standard error, and *n* = 10. Similar results were obtained in two additional independent experiments.

**reorientation kinetics than** *toc132-4<sup>mar2-1</sup>* **and** *pgm1-1* **single mutants.** The error bars represent the standard error, and *n* = 10. The asterisks represent significant differences between *pgm1-1 toc132-4<sup>mar2-1</sup>* and the corresponding single mutants (*p* < 0.05, Student's *t*-test). Similar results were obtained in two additional independent experiments.

#### **Table 1 | Proteins present at different levels in wild-type and** *toc132-4<sup>mar2-1</sup>* **roots.**


*The annotations were obtained from The Arabidopsis Information Resource. The ratios between the genotypes and the associated standard errors were determined as described in the Materials and Methods section, and a Student's T-test was used to generate the p-values.*

We then used qRT-PCR to determine if some of the genes encoding these proteins are differentially expressed in *toc132-4<sup>mar2-1</sup>* compared to wild type. The primers are shown in **Table 2**. Of the eight genes selected for analysis, four showed differences at the transcript level in the same directions predicted by the proteomic analysis (**Figure 5**).

#### **Table 2 | Primers used to quantify gene expression.**

*The sequences are shown 5′ to 3′.*

#### **DISCUSSION**

We demonstrated that mutations in multiple TOC complex components caused enhanced gravitropic defects in an *arg1* background (**Figure 2**). This result supports a model in which the TOC complex imports a molecule into plastids that is necessary for gravitropism (**Figures 1B,C**). However, we did not see completely random root growth when we mutated *TOC120* or *TOC34* in an *arg1* background. This result may have occurred due to the abilities of other receptors such as Toc132 and Toc33 to partially compensate for the loss of these proteins (Kubis et al., 2003, 2004; Ivanova et al., 2004). In any case, these results indicate that Toc132 is unlikely to function as a ligand in gravity signal transduction. This result is in contrast to our previously published result that mutations in *TOC120* do not enhance the *arg1-2* phenotype (Stanga et al., 2009). A closer examination using more seedlings revealed this subtle but consistent phenotype.

Toc159 family members differ most significantly in the lengths and sequences of their N-terminal acidic domains. Although the acidic domain is thought to help regulate the specificity of imported proteins, it has little effect on overall import capacity (Inoue et al., 2010). We showed that a truncated version of Toc132 that lacks the cytoplasmic acidic domain is capable of restoring the gravitropism response of *arg1-2 toc132-4<sup>mar2-1</sup>* seedlings back to that of *arg1-2*. Therefore, we conclude that the acidic domain is not required for Toc132's function in the gravitropic response (**Figure 3**). Because this construct still contained the GTP-binding and membrane domains, it likely retained most or all of the protein-import capability associated with full-length Toc132 (Inoue et al., 2010). We conclude that this construct likely rescued the *arg1-2 toc132-4<sup>mar2-1</sup>* phenotype because it increased the overall protein import efficiency of the TOC complex. Therefore, Toc132 likely does not directly act as a ligand in gravity signal transduction because the large cytoplasmic acidic domain, which is the most likely region of the protein to interact with a receptor, is not necessary for its gravitropic function. This result reinforces the hypothesis that the TOC complex mediates gravitropism by modulating the targeting to plastids of another important molecule that contributes to gravity signal transduction (**Figures 1B,C**).

To further test the targeting and indirect models (**Figures 1B,C**), we examined the gravitropic response of *toc132-4<sup>mar2-1</sup> pgm1-1* plants and found that they displayed slightly enhanced gravitropic defects compared to the *pgm1-1* single mutant (**Figure 4**). This result suggests that *TOC132* and *PGM1* function in different genetic pathways and that the TOC complex contributes to gravitropism in a manner at least partially independent of amyloplast sedimentation. Therefore, the indirect interaction model is plausible (**Figure 1C**); however, we cannot rule out the targeted interaction model (**Figure 1B**).

In our analysis of the *toc132-4<sup>mar2-1</sup>* root proteome, we expected many plastid-localized proteins to be differentially expressed between *toc132-4<sup>mar2-1</sup>* and wild type, as previously shown for *toc159* leaf proteins (Bischof et al., 2011). However, we identified only one protein likely to localize to plastids, NDPK3, in a group of 26 differentially expressed proteins between *toc132-4<sup>mar2-1</sup>* and wild type. This protein has also been shown to localize to mitochondria (Sweetlove et al., 2001). NDPKs have been implicated in stress and light signaling, and the related protein NDPK2 has been shown to be involved in auxin-related processes at least partly by affecting auxin transport (Choi et al., 2005). Furthermore, *NDPK3* was previously found to be redox regulated, and redox signals have been implicated in plastid-to-nucleus retrograde signaling (Fey et al., 2005). Therefore, addressing a possible role for NDPK3 in gravitropism is an interesting area of future research.

The low number of plastid-localized differentially expressed proteins may be due to compensation by Toc120 or other receptors. However, we did identify many nucleus-encoded proteins present at different levels in *toc132-4<sup>mar2-1</sup>* relative to wild type, and several of the corresponding genes also showed differences at the transcript level (**Figure 5**). Plastids can regulate nuclear gene expression through retrograde signaling, especially when they are stressed (Nott et al., 2006). Indeed, many of the proteins present at different levels in *toc132-4<sup>mar2-1</sup>* are involved in stress responses. Alternatively, it is also possible that some of the expression differences we observed between mutant and wild-type roots are indirect consequences of altered cell metabolism in the mutant. Such effects could occur at the transcriptional, posttranscriptional, translational, and posttranslational levels, potentially explaining why several of the differentially expressed proteins are encoded by genes whose transcript levels are similar between mutant and wild-type roots (**Table 1** and **Figure 5**). In any case, our results suggest the possibility that altered plastid protein import results in the altered abundance of (a) nuclear-encoded protein(s) that is (are) required for a partial gravitropic response in an *arg1* background. Interestingly, the genes encoding two of the proteins in **Table 1** (MD-2-related lipid recognition domain-containing protein and MAJOR LATEX PROTEIN LIKE 6) increase in expression in root tips upon gravistimulation (Kimbrough et al., 2004).

Considered together, these experiments suggest that the direct interaction model is highly unlikely to explain the role of the TOC complex in gravitropism. Future work will determine which proteins must be imported into plastids for a normal gravitropic response in an *arg1* background and which, if any, non-plastid-associated proteins whose abundance is consequently altered are involved in this process.

#### **AUTHOR CONTRIBUTIONS**

Allison K. Strohm designed and performed the experiments and wrote the paper. Greg A. Barrett-Wilt performed the experiments and revised the paper. Patrick H. Masson designed the experiments and wrote the paper.

#### **ACKNOWLEDGMENTS**

This research was supported by grants from the National Science Foundation (IOS-0821884 and IOS-1121694) to Patrick H. Masson and a National Science Foundation Graduate Research Fellowship to Allison K. Strohm. We thank Paul Jarvis for providing us with *ppi3-1* and *toc120-3* and Katherine Baldwin for help with figures. We also thank Ben Minkoff for help with mass spectrometry and data analysis.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00148/abstract

#### **REFERENCES**


Xu, R., and Li, Q. Q. (2008). Protocol: Streamline cloning of genes into binary vectors in Agrobacterium via the Gateway® TOPO vector system. *Plant Methods* 4, 4. doi: 10.1186/1746-4811-4-4

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 January 2014; accepted: 27 March 2014; published online: 22 April 2014. Citation: Strohm AK, Barrett-Wilt GA and Masson PH (2014) A functional TOC complex contributes to gravity signal transduction in Arabidopsis. Front. Plant Sci. 5:148. doi: 10.3389/fpls.2014.00148*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Strohm, Barrett-Wilt and Masson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Evolution and targeting of Omp85 homologs in the chloroplast outer envelope membrane

# *Philip M. Day, Daniel Potter and Kentaro Inoue\**

*Department of Plant Sciences, University of California at Davis, Davis, CA, USA*

#### *Edited by:*

*Simon Gilroy, University of Wisconsin - Madison, USA*

#### *Reviewed by:*

*Inhwan Hwang, Pohang University of Science and Technology, South Korea*

*Donna Fernandez, University of Wisconsin-Madison, USA*

#### *\*Correspondence:*

*Kentaro Inoue, Department of Plant Sciences, University of California at Davis, One Shields Avenue, Davis, CA 95616, USA e-mail: kinoue@ucdavis.edu*

Translocon at the outer-envelope-membrane of chloroplasts 75 (Toc75) is the core component of the chloroplast protein import machinery. It belongs to the Omp85 family whose members exist in various Gram-negative bacteria, mitochondria, and chloroplasts of eukaryotes. Chloroplasts of Viridiplantae contain another Omp85 homolog called outer envelope protein 80 (OEP80), whose exact function is unknown. In addition, the *Arabidopsis thaliana* genome encodes truncated forms of Toc75 and OEP80. Multiple studies have shown a common origin of the Omp85 homologs of cyanobacteria and chloroplasts, but their results about evolutionary relationships among cyanobacterial Omp85 (cyanoOmp85), Toc75, and OEP80 are inconsistent. The bipartite targeting sequence-dependent sorting of Toc75 has been demonstrated, but the targeting mechanisms of other chloroplast Omp85 homologs remain largely unexplored. This study aimed to address these unresolved issues in order to further our understanding of chloroplast evolution. Sequence alignments and recently determined structures of bacterial Omp85 homologs were used to predict structures of chloroplast Omp85 homologs. The results enabled us to identify amino acid residues that may indicate functional divergence of Toc75 from cyanoOmp85 and OEP80. Phylogenetic analyses using Omp85 homologs from various cyanobacteria and chloroplasts provided strong support for the grouping of Toc75 and OEP80 sister to cyanoOmp85. However, this support was diminished when the analysis included Omp85 homologs from other bacteria and mitochondria. Finally, results of import assays using isolated chloroplasts support outer membrane localization of OEP80tr and indicate that OEP80 may carry a cleavable targeting sequence.

#### **Keywords: β-barrel, chloroplast evolution, OEP80, Omp85, outer membrane, protein import, POTRA, Toc75**

# **INTRODUCTION**

Chloroplasts are derived from an endosymbiotic relationship between an ancestral cyanobacterium and a mitochondriate eukaryote which occurred around 1 billion years ago (Shih and Matzke, 2013). One piece of evidence to support a common ancestry of Gram-negative bacteria, mitochondria and chloroplasts is the presence of β-barrel proteins in their outer membranes (Inoue, 2007). Among β-barrel membrane proteins are the homologs of outer membrane protein 85 (Omp85) which appear to be present in all Gram-negative bacteria, mitochondria and chloroplasts (Voulhoux et al., 2003; Gentle et al., 2004, 2005; Voulhoux and Tommassen, 2004). A canonical member of the Omp85 family is comprised of a soluble N terminus which contains a variable number of polypeptide translocation associated (POTRA) domains, each of which usually consists of 70–90 residues, followed by a C-terminal transmembrane β-barrel (Sanchez-Pulido et al., 2003) (**Figure 1**). Results of various studies indicate that the POTRA domains are involved in association with other proteins while the β-barrel acts as an integral membrane anchor and may provide a hydrophilic pore to accommodate substrate proteins (Knowles et al., 2009; Kim et al., 2012; Simmerman et al., 2014).

Among bacterial Omp85 homologs are β-barrel assembly machinery A (BamA), two-partner secretion B (TpsB), and translocation and assembly module A (TamA). BamA is the central and essential component of the BAM complex, which catalyzes assembly of β-barrel proteins in the outer membranes (Knowles et al., 2009). BamA carries five POTRAs and has been shown to be indispensable for cell viability in *Neisseria meningitidis* and *Escherichia coli* (Voulhoux et al., 2003; Doerrler and Raetz, 2005) (**Figure 1**). TpsB and TamA are non-essential proteins found in a limited number of bacteria where they catalyze secretion of a subset of β-helical adhesion proteins known as TpsA and autotransporters, respectively (Jacob-Dubuisson et al., 2013). Canonical TpsB orthologs contain two POTRAs (Willems et al., 1994), while *E. coli* TamA contains three POTRAs (Selkrig et al., 2012) (**Figure 1**). Structural studies have revealed that each POTRA in various Omp85 homologs contains a conserved βααββ fold that forms a three-stranded β-sheet with two α-helices next to each other at one side of the sheet (Clantin et al., 2007; Kim et al., 2007; Gatzeva-Topalova et al., 2008, 2010; Knowles et al., 2008; Zhang et al., 2011). The C-terminal transmembrane β-barrels of FhaC, TamA, and BamA have been shown to consist of 16 antiparallel β-strands connected by eight extracellular loops (eLs) and seven short turns (Ts). A large extracellular loop between transmembrane β-strands (TMβ) 11 and 12 called eL6 could insert into the β-barrel pore (Clantin et al., 2007; Gruss et al., 2013; Noinaj et al., 2013; Ni et al., 2014). In FhaC, eL6 was shown to extend through the barrel to the periplasmic side of the membrane (Clantin et al., 2007), while eL6 in BamA and TamA remain in the pore via electrostatic interactions with the inner barrel wall (Gruss et al., 2013; Noinaj et al., 2013; Ni et al., 2014). Furthermore, eLs in BamA were shown to form a dome to prevent free entry or leak of molecules into or from the barrel interior (Noinaj et al., 2013; Ni et al., 2014). Structural studies have also revealed a loose association and possible opening between TMβ1 and 16 in BamA and TamA (Gruss et al., 2013; Noinaj et al., 2013) and gating of the β-barrel by the most C-terminal POTRA (POTRA5, P5) in BamA (Noinaj et al., 2013). An Omp85 homolog from a cyanobacterium *Synechocystis* sp. PCC 6803 contains three POTRAs and was shown to be essential for cell viability (Bolter et al., 1998; Reumann et al., 1999) (**Figure 1**). Crystal structural analyses confirmed the conserved folding of the POTRA domains in the Omp85 homologs from two cyanobacterial species, *Nostoc* sp. PCC7120 (Koenig et al., 2010) and *Thermosynechococcus elongatus* (Arnold et al., 2010). It was noted that most cyanobacterial Omp85 (cyanoOmp85) homologs contain a Pro-rich region preceding P1 (Arnold et al., 2010), similar to the case in TpsB (Jacob-Dubuisson et al., 2013) (**Figure 1**), although the relevance of this feature is unknown. Finally, no study has tested whether cyanoOmp85 is functionally homologous to BamA, TamA, or TpsB.

**FIGURE 1 | Molecular architecture of the Omp85 family proteins.** Omp85 homologs are comprised of an N-terminal soluble portion containing a variable number of POTRA domains (orange) followed by a C-terminal transmembrane β-barrel (green). The POTRA domains 1, 2, and 3 of the chloroplast and cyanobacteria homologs are indicated as P1, P2, and P3, respectively. Unlike the others, OEP80tr and Toc75-IV do not contain any full POTRA domains although they both contain sequences that align well with the last β-strand of P3 in their relatives. Within the β-barrel that is made of 16 transmembrane β-strands, strands 11 and 12 are separated by the 6th extracellular loop (blue) which contains a sequence highly conserved between all Omp85 homologs. The homologs from bacteria contain a signal peptide (turquoise) that is required for protein export from the cytoplasm to the periplasm. Toc75 contains a unique bipartite targeting signal (gray) at its N terminus and an apparent unique insertion at the beginning of its second POTRA domain. Cyanobacterial Omp85 homologs and TpsB contain a Pro-rich region (pink) N-terminal to the first POTRA domain. In addition, TpsB contains an α-helix (brown) N-terminal to the Pro-rich region. The domain lengths are approximately to scale.

In mitochondria, only one type of Omp85 homolog has been identified. It contains one POTRA and is called sorting and assembly machinery 50 (Sam50) or topogenesis of mitochondrial outer membrane β-barrel proteins 55 (Tob55). Similar to BamA, Sam50/Tob55 is essential for outer membrane biogenesis and cell viability (Kozjak et al., 2003; Paschen et al., 2003; Gentle et al., 2004) (**Figure 1**). In contrast to the case in mitochondria, multiple Omp85 homologs are found in chloroplasts. Among them are translocon at the outer-envelope-membrane of chloroplasts 75 (Toc75) and outer envelope protein 80 (OEP80) (Hsu and Inoue, 2009). Both proteins contain three POTRAs (**Figure 1**) and are essential for viability from the embryonic stage of the model plant *Arabidopsis thaliana* (Baldwin et al., 2005; Hust and Gutensohn, 2006; Patel et al., 2008). Toc75 acts as the core component of the protein import channel (Perry and Keegstra, 1994; Schnell et al., 1994; Hinnah et al., 1997). OEP80 was given its name because its *A. thaliana* ortholog (At5g19620) was originally predicted to be a 732-amino-acid protein of 80 kD (Inoue and Potter, 2004). OEP80 is also known as Toc75-V as it is encoded in the *A. thaliana* chromosome V (Eckart et al., 2002). The exact function of OEP80 is unknown (Inoue, 2011). Interestingly, an *A. thaliana* mutant that contained a T-DNA insertion between the codons for the first and second Met of the *OEP80* gene was viable (Patel et al., 2008). A recent study using an antibody against a recombinant 80-kD protein encoded by the OEP80 cDNA and a genetic complementation assay with the *oep80-*null mutant showed that OEP80 is actually a protein of *ca.* 70 kD in *A. thaliana* chloroplasts (Hsu et al., 2012). As discussed below, both Toc75 and OEP80 appear to have evolved from ancestral cyanoOmp85, which was most likely not involved in protein import. 
Establishment of a protein import system must have been essential for the transition of the endosymbiont to the organelle. This is because it allowed replacement of original gene copies on the prokaryotic chromosome by duplicated copies in the eukaryotic host nucleus. Thus, the gene duplication that gave rise to Toc75 and OEP80 must have been an important event for chloroplast evolution. However, the molecular basis of the neofunctionalization of Toc75 is largely unexplored.

In addition to Toc75 and OEP80, the *A. thaliana* genome encodes several proteins that appear to be truncated Omp85 homologs (**Figure 1**). Among them, Toc75-IV (At4g09080) is encoded on chromosome IV (Jackson-Constan and Keegstra, 2001). This protein lacks the POTRA domains of Toc75 and its gene knockout mutant was viable but showed abnormal etioplast structure in seedlings and delayed de-etiolation (Baldwin et al., 2005). The *A. thaliana* genome also encodes two truncated forms of OEP80 that lack POTRA, called Ath-P1 (At3g44160; named OEP80tr in this study) and Ath-P2 (At3g48620), respectively (Moslavac et al., 2005; Topel et al., 2012). Outer membrane localization of Toc75-IV was demonstrated (Baldwin et al., 2005) while that of OEP80tr homologs remains unknown.

Previous phylogenetic analyses have established that Toc75, OEP80, and cyanoOmp85 share a common ancestor (Gentle et al., 2004; Inoue and Potter, 2004; Moslavac et al., 2005). Interestingly, however, conclusions about the evolutionary relationship between the three groups were inconsistent among studies. Some of the early studies reported a closer relationship of OEP80 with cyanoOmp85 than with Toc75 (Eckart et al., 2002; Gentle et al., 2004; Baldwin et al., 2005), while others left the relationships within the three groups unresolved (Inoue and Potter, 2004; Moslavac et al., 2005). A more recent work showed a well-supported grouping of Toc75 and OEP80 within the cyanoOmp85 clade (Topel et al., 2012). Although these studies have used different sets of sequences, the exact reason for the inconsistent conclusions remains unknown.

All known proteins in the outer membranes of chloroplasts and mitochondria are encoded in the nucleus (Hofmann and Theg, 2005; Lee et al., 2014). Among them, Toc75 is unique in that it carries a cleavable N-terminal targeting sequence known as tp75 (Tranel et al., 1995) (**Figure 1**). tp75 is comprised of two parts. The first part acts as a stroma-targeting sequence and is cleaved by a stromal processing peptidase (Tranel and Keegstra, 1996; Inoue et al., 2001). The second part of tp75 is needed for envelope targeting (Tranel and Keegstra, 1996) and is removed by plastidic type I signal peptidase 1 (Plsp1) (Inoue et al., 2005; Shipman-Roston et al., 2010; Midorikawa et al., 2014). Within the second part is a polyglycine stretch, which was shown to be necessary for preventing Toc75 from entering the stroma (Inoue and Keegstra, 2003; Baldwin and Inoue, 2006). Toc75-IV does not seem to carry a cleavable targeting sequence (Baldwin et al., 2005). It remains unknown if and how OEP80 and its truncated forms are processed and targeted to the chloroplast outer membrane.

In this study, we attempted to address unanswered questions about chloroplast Omp85 homologs; namely the molecular bases for their diversification, their phylogenetic relationships with cyanoOmp85, and their chloroplast targeting. Based on the obtained results, we have attempted to generate novel hypotheses regarding molecular mechanisms of membrane protein evolution.

#### **MATERIALS AND METHODS**

#### **SEQUENCE COLLECTION, STRUCTURAL PREDICTION, AND PHYLOGENETIC ANALYSIS**

Identification numbers and sources of all the sequences used for the analyses are listed in **Table S1**. CyanoOmp85 homologs were identified by BLASTP searches against the GenBank database (www.ncbi.nlm.nih.gov/genbank) (Wheeler et al., 2003; Benson et al., 2013) using the amino acid sequence of the Omp85 homolog from *Synechocystis* sp. PCC 6803 (slr1227) (Bolter et al., 1998; Reumann et al., 1999) as a query. Among homologs identified, we selected 26 sequences from 20 species that represent diverse clades within cyanobacteria according to previous works (Tomitani et al., 2006; Criscuolo and Gribaldo, 2011). All the sequences showed E values not larger than 5E-151. The chloroplast Omp85 homologs except those from the liverwort (*Marchantia polymorpha*) and loblolly pine (*Pinus taeda*) were identified in the GenBank and Phytozome (www.phytozome.net) (Goodstein et al., 2012) databases by either TBLASTN [for pea (*Pisum sativum*), white spruce (*Picea glauca*), Douglas fir (*Pseudotsuga menziesii*), and *Nitella mirabilis*] or BLASTP (for all others) searches using amino acid sequences of *A. thaliana* Toc75 (At3g46740), OEP80 (At5g19620), and OEP80tr (Ath-P1 = At3g44160) as queries. Among the identified sequences, the pea OEP80 sequence was obtained by assembly and manual adjustments of three transcriptome shotgun assembly (TSA) sequences (Franssen et al., 2011) and the white spruce Toc75 sequence was obtained by assembly of four expressed-sequence-tag and cDNA clones (Ralph et al., 2008; Rigault et al., 2011) using the Cap3 program (http://doua.prabi.fr/software/cap3) (Huang and Madan, 1999). cDNAs encoding the liverwort chloroplast Omp85 homologs were identified by TBLASTN searches against the genome and transcriptome databases generated by the *M. polymorpha* genome-sequencing project at the Joint Genome Institute (http://www.jgi.doe.gov/) using the amino acid sequences of OEP80 and Toc75 from *A. thaliana* and *P. patens* as queries (Drs. R. Nishihama and T. Kohchi, Kyoto University). The loblolly pine Omp85 sequences were identified by TBLASTN searches using the Norway spruce genome assembly and gene expression data resource (http://congenie.org/) (Nystedt et al., 2013). For further analyses, we selected 78 sequences from 28 archaeplastida species, which showed E values of less than or equal to 1E-14 (for the sequences from green lineages) or 0.055 (for red algal sequences). We also collected sequences of the following six well-studied Omp85 homologs from the GenBank database: BamA orthologs from *N. meningitidis* and *E. coli*, *E. coli* TamA, *B. pertussis* FhaC, yeast Sam50, and *N. crassa* Tob55.
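The lineage-specific E-value cutoffs described above can be illustrated with a short sketch. It assumes hypothetical BLAST hits held as simple records with a lineage label; the record layout, the names `Hit` and `passes_cutoff`, and the example hits are invented for illustration and do not reproduce the pipeline used in this study.

```python
from dataclasses import dataclass

# Cutoffs stated in the text: E <= 1E-14 for green-lineage sequences,
# E <= 0.055 for red algal sequences.
CUTOFFS = {"green": 1e-14, "red": 0.055}


@dataclass
class Hit:
    name: str      # hypothetical identifier
    lineage: str   # "green" or "red"
    evalue: float  # E value reported by the search


def passes_cutoff(hit: Hit) -> bool:
    """Keep a hit only if its E value is at or below its lineage cutoff."""
    return hit.evalue <= CUTOFFS[hit.lineage]


# Invented example hits, chosen to sit clearly on either side of the cutoffs.
hits = [
    Hit("green_hit_strong", "green", 3e-40),
    Hit("green_hit_weak", "green", 2e-5),
    Hit("red_hit_ok", "red", 0.01),
    Hit("red_hit_weak", "red", 0.2),
]

kept = [h.name for h in hits if passes_cutoff(h)]
```

A per-lineage lookup keeps the more permissive red algal threshold from being applied to green-lineage hits.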

Structural predictions of *A. thaliana* Toc75 and OEP80 were done using Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2) (Kelley and Sternberg, 2009). The two-dimensional diagrams were made using a program available at the website (http://blog.pansapiens.com/2008/06/26/software-review-producing-two-dimensional-diagrams-of-membrane-proteins/) and manual editing.

For phylogenetic analysis, the sequences were aligned using the MAFFT program (http://mafft.cbrc.jp/alignment/software/) (Katoh and Standley, 2013). The full alignment can be found in Supplementary Material (**Figure S1**). For analyses shown in **Figures 3A,C**, poorly aligned sections that may be non-homologous or saturated with mutations were removed using the Gblocks server with setting options for a less stringent selection that allows smaller final blocks, gap positions within the final blocks, and less strict flanking positions (http://molevol.cmima.csic.es/castresana/Gblocks\_server.html) (Castresana, 2000). Depending on the analysis, various sections of the alignment were subject to Bayesian inference in the program MRBAYES (http://mrbayes.sourceforge.net/) (Huelsenbeck and Ronquist, 2001; Ronquist et al., 2012) using the mixed amino acid model option as specified by the command prset aamodelpr=mixed. For each analysis, a 50% majority-rule consensus tree was generated for the 3002 trees produced by two runs of two million generations each, sampled every 1000 generations with a 25% burn-in. The percentage of trees in which a clade appeared was interpreted as the posterior probability (*PP*) of that clade.
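The tree count and the posterior probability calculation described above can be checked with a small sketch. It rests on two assumptions that are not stated in the text: that each run also stores the starting tree at generation 0, and that the burn-in count is rounded down. Under that reading the arithmetic gives 3002 retained trees. The function names and the toy trees are hypothetical and serve only as illustration.

```python
def retained_trees(generations, sample_every, burnin_fraction, runs):
    """Count post-burn-in trees pooled across independent runs."""
    # Assumption: the starting tree (generation 0) is sampled too,
    # so each run stores generations // sample_every + 1 trees.
    samples_per_run = generations // sample_every + 1
    # Assumption: the number of burn-in trees is rounded down.
    discarded = int(samples_per_run * burnin_fraction)
    return runs * (samples_per_run - discarded)


def clade_support(trees, clade):
    """Fraction of sampled trees that contain the given clade."""
    clade = frozenset(clade)
    return sum(clade in tree for tree in trees) / len(trees)


# Two runs of two million generations, sampled every 1000, 25% burn-in.
total = retained_trees(2_000_000, 1000, 0.25, runs=2)

# Toy sample: each tree is represented only by the set of clades it contains.
toy_trees = [
    {frozenset({"Toc75", "OEP80"})},
    {frozenset({"Toc75", "OEP80"})},
    {frozenset({"OEP80", "cyanoOmp85"})},
    {frozenset({"Toc75", "OEP80"})},
]
support = clade_support(toy_trees, {"Toc75", "OEP80"})
in_consensus = support > 0.5  # 50% majority-rule criterion
```

Here each run contributes 2001 samples, 500 are discarded as burn-in, and 1501 are kept, so two runs give 3002 trees. In the toy sample the Toc75 plus OEP80 clade appears in three of four trees and would therefore be included in a 50% majority-rule consensus.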

# **cDNA CLONES**

Plasmids encoding Toc75-IV and OEP80-<sup>1</sup>−<sup>52</sup> from *A. thaliana* were in the pGEMTEasy vector (Promega, Madison, WI) (Baldwin et al., 2005; Patel et al., 2008), and the one for OEP80tr was in pUNI51 from Arabidopsis Biological Resource Center (Columbus, OH), respectively. The plasmid encoding *A. thaliana* Toc75 was generated by amplifying the coding sequences by PCR using the previously-reported plasmid (Inoue and Keegstra, 2003) as a template and transferred to an SP6 transcription vector compatible with the Gateway® cloning system (Life Technologies, Carlsbad, CA) (Joshua Endow and Kentaro Inoue, unpublished). The plasmid encoding Tic22 (pET21dpTic22) (Kouranov et al., 1999) was a kind gift of Dr. D. Schnell (University of Massachusetts, Amherst, MA).

# *IN VITRO* **CHLOROPLAST IMPORT ASSAY**

Protein import assays were done using intact chloroplasts isolated from 10- to 13-day-old pea seedlings and [35S]-labeled proteins as previously described (Inoue and Potter, 2004; Inoue et al., 2006). The radiolabeled proteins were synthesized using the TNT® coupled reticulocyte lysate system (Promega) with T7 (for Toc75-IV and Tic22), T3 (for OEP80tr), or SP6 (for Toc75 and OEP80) RNA polymerase and [35S]Met (Perkin Elmer, Waltham, MA). Post-import fractionation of chloroplasts was done as described (Inoue et al., 2006). For post-import protease treatment, the chloroplasts containing the imported proteins were treated with thermolysin or trypsin (both from Sigma-Aldrich Corp, St. Louis, MO) at a 1:1 mass ratio with the amount of chlorophylls incubated in the import reaction, in import buffer with (for thermolysin) or without (for trypsin) 1 mM CaCl2, for 30 min in the dark on ice (for thermolysin) or at room temperature (for trypsin). The protease reactions were quenched by adding EDTA to a final concentration of 5 mM (for thermolysin) or trypsin inhibitor at a 10:1 mass ratio of inhibitor to protease (for trypsin) in import buffer. For the energy-dependency assay, the reaction was done using translation products pre-treated with 50 U/mL apyrase (Sigma-Aldrich) at room temperature for 15 min.

# **RESULTS**

# **STRUCTURAL PREDICTION OF THE CHLOROPLAST Omp85 HOMOLOGS**

Recent structural studies of bacterial Omp85 homologs have revealed a conserved folding pattern of POTRA domains (Simmerman et al., 2014) and the unique organization of the transmembrane β-barrels (Clantin et al., 2007; Gruss et al., 2013; Noinaj et al., 2013; Ni et al., 2014). As a first step to gain molecular insights into the evolution of chloroplast Omp85 homologs, we made use of available data as well as alignment and prediction programs to identify residues that are conserved among them or unique to each of the subgroups.

The N-terminal soluble portion of cyanoOmp85, Toc75 and OEP80 is comprised of three POTRAs called P1, P2, and P3 (**Figure 1**). Each POTRA shows the canonical secondary structure pattern known as β1-α1-α2-β2-β3, and P3 is the most conserved POTRA domain among the three (Arnold et al., 2010). The crystal structures and sequence alignments have also revealed features of POTRAs uniquely conserved between homologs from cyanobacteria and chloroplasts (Arnold et al., 2010; Koenig et al., 2010; Simmerman et al., 2014). Our analysis identified two features that could separate Toc75 from OEP80 and cyanoOmp85. The first is found at the N terminus of P1: Val21-Leu22 in *Nostoc* Omp85 is conserved as Val62-Leu63 in OEP80 but is diverged to Tyr70-Lys71 in Toc75 [**Figures 2A,B**, indicated by (i)]. These residues contribute to the β-cap in cyanoOmp85 P1 (Arnold et al., 2010; Koenig et al., 2010). Thus, a β-cap may also be present in P1 of OEP80 but not in that of Toc75. The second feature unique to Toc75 is an extraordinarily long P2, which was noted previously (Simmerman et al., 2014) [**Figure 2A**, indicated by (ii)]. Our analysis revealed that this is due to the insertion of 43 to 44 residues flanked by two Cys residues (Cys256 and Cys300 in *A. thaliana* Toc75; **Figure 2A**, indicated with vertical arrows). An alignment with known structures suggests that this insertion may disrupt the canonical β1 in P2 (**Figure 2A**). Generation or reduction of a disulfide bridge between the flanking Cys residues might affect stability of the three-stranded β-sheet in P2. This may lead to redox regulation of chloroplast protein import as predicted previously (Balsera et al., 2010). Interestingly, this insertion was not found in Toc75 from green and red algae (**Figure 2A**, csToc75; **Figure S1**). Another notable feature that could distinguish streptophyte Toc75 from the other Omp85 homologs can be found at β3 of P3 [**Figures 2A,B**, indicated by (iii); **Figure S1**]. Within this region is Lys386 in *A. thaliana* OEP80, which is highly conserved in OEP80 (22 out of 27 sequences examined in this study), cyanoOmp85 (15 out of 26 sequences) and the chlorophyte green algal Toc75 (5 out of 5) as well as in NmOmp85 and EcTamA. However, this Lys residue is not found in any of the Toc75 sequences from land plants examined in this study or the streptophyte green alga (*Nitella mirabilis*), where it is replaced by Gly (for 26 sequences) or Ser (for two sequences). The presence of features unique to Toc75 shows that Toc75 has continued to change throughout the evolution of plants. The overall identities of the sequences containing all the three POTRAs between *A. thaliana* OEP80 and *Nostoc/Thermosynechococcus* Omp85s are 34.3/33.3%, which are higher than that between *A. thaliana* Toc75 and cyanoOmp85 (28.5/27.6%) or *A. thaliana* Toc75 and *A. thaliana* OEP80 (22.4%). The POTRA domains have been implicated in binding with oligomeric complex partners in Gram-negative bacteria (Knowles et al., 2009). Thus, the high similarity between OEP80 and cyanoOmp85 in these domains suggests that these two proteins may have similar binding partners. The presence of conserved residues and the high overall sequence identity suggest that OEP80 may have retained the function of the protein in the ancestral cyanobacterial endosymbiont while Toc75 has taken on a new function in protein import.
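To make the identity and conservation figures above concrete, the following sketch shows one common way such values can be computed from aligned sequences. The text does not state which convention was used, so the choice to ignore columns containing a gap is an assumption, and the function names and toy sequences are hypothetical rather than taken from the study's alignment.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gap-containing columns are excluded from the denominator;
    other conventions (e.g. dividing by alignment length) give other values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def residue_conservation(column, residue):
    """Fraction of sequences carrying a given residue at one alignment column."""
    return column.count(residue) / len(column)


# Toy aligned pair (not real Omp85 sequences): 6 comparable columns, 5 matches.
identity = percent_identity("MKV-LLGA", "MRVTLL-A")

# Toy alignment column: Lys (K) in 4 of 5 sequences.
lys_fraction = residue_conservation("KKKGK", "K")
```

Applied to a single alignment column, `residue_conservation` yields the kind of tally reported above for the conserved Lys, for example 22 out of 27 OEP80 sequences.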

For analysis of the C-terminal portion, in addition to the alignment of the primary sequences, we used the Phyre2 web server (Kelley and Sternberg, 2009) to thread sequences from *A. thaliana* Toc75 and OEP80 with that of *N. gonorrhoeae* BamA (NgBamA) whose structure has been determined by X-ray crystallography (Noinaj et al., 2013). The result allowed annotation of 16 transmembrane β-strands (TMβ1-16) connected by eight loops (eL1-8) and seven turns (T1-7) (**Figure 2C**). Among residues conserved in all three proteins are those in the motif Ser-Val-Arg-Gly-Tyr (SVRGY) in eL6, acidic residues in TMβ12 (Glu) and TMβ13 (Glu or Asp), and Tyr-Ala (YA) in TMβ15 [**Figure 2C**, shown in bold; the two motifs (SVRGY and YA) are also indicated by (iv,v), respectively]. The SVRGY motif and acidic residues in TMβ12 and 13 are also conserved in cyanoOmp85 but the YA motif is not (**Figure 2D**). Of the 26 cyanoOmp85 sequences examined in this study, only three have this motif, and the residues found in the others are not as conserved as in the case of the chloroplast homologs (**Figure 2D**, panel TMβ15). eL6 has been shown to interact with P5 and have dynamic positioning within the barrel of *E. coli* BamA (Rigel et al., 2013). It was also shown in *E. coli* BamA and TamA that, when in the barrel, eL6 was in close proximity to TMβ15 and Arg in the SVRGY motif could form a salt bridge with acidic residues in TMβ12 and 13 (Gruss et al., 2013; Noinaj et al., 2013; Ni et al., 2014). Thus, the interaction of eL6 with the β-barrel interior may be conserved in cyanoOmp85, OEP80, and Toc75. Our analysis also revealed features specific to Toc75 that may reflect its unique function. In particular, T7 between TMβ14 and 15 appears to be four residues shorter in Toc75 than that in OEP80 and cyanoOmp85 (**Figures 2C,D**). The region apparently missing in Toc75 is flanked by highly conserved sequences on both sides (**Figure 2D**, panel T7). The N-terminal flanking region is highly conserved between cyanoOmp85, Toc75 and OEP80, while the C-terminal flanking region contains an LG motif, which is conserved not only in the three groups, but also in BamA (**Figure S1**).

**FIGURE 2 | Predicted structures of chloroplast Omp85 homologs. (A)** An alignment of the POTRA domains of two cyanoOmp85 orthologs with known structure and four chloroplast Omp85 homologs from the flowering plant *A. thaliana* (AtOEP80 and atToc75) and the green alga *Coccomyxa subellipsoidea* (CsOEP80 and csToc75). Shown above the alignment is the resolved secondary structure of Omp85 from *Nostoc* sp. PCC7120 (Koenig et al., 2010). POTRA domains 1–3 are indicated as P1-3, respectively. The horizontal arrows with β indicate β-strands. The coils show α-helices (indicated with α) or a 3<sub>10</sub> helix (indicated with η) found in the C terminus of the 1st helix of P2. The residues located at turns are indicated with T. Sequences conserved in OEP80 and cyanoOmp85 but diverged in Toc75 (i,iii) as well as an apparent insert at the N terminus of P2 of land plant Toc75 (ii) are indicated. The two vertical arrows below the alignment indicate the positions of Cys residues flanking the insert in Toc75. **(B)** Weblogos of regions indicated as (i,iii) in panel **(A)**. **(C)** Predicted transmembrane β-strands of *A. thaliana* Toc75 and OEP80 generated using the Phyre2 server (Kelley and Sternberg, 2009). The program threaded both proteins onto the known structure of *Neisseria gonorrhoeae* BamA. The N and C termini of the proteins are indicated at left and right, respectively. Residues in the transmembrane β-strands are indicated in squares and those in turns and loops are in circles. The highly conserved SVRGY motif in eL6 and Y/FA motif in TMβ15 are shown in bold and indicated as (iv,v), respectively. Glu on TMβ12 and Asp on TMβ13 are also shown in bold. The orange background shows the predicted transmembrane region. **(D)** Weblogos of the conserved sequences shown as (iv,v) in panel **(C)** and the area surrounding turn 7 (T7) of cyanoOmp85, OEP80, and Toc75 examined in this study.

#### **PHYLOGENETIC ANALYSES OF CHLOROPLAST Omp85 HOMOLOGS**

Because extant cyanobacteria form a monophyletic lineage, we expected that Toc75 and OEP80 evolved after the duplication of ancestral cyanoOmp85. Grouping of OEP80 and Toc75 in a clade nested within cyanoOmp85 sequences was previously reported but it had low support (66% neighbor-joining bootstrapping) (Gentle et al., 2004). However, this idea was not necessarily supported by other studies. For example, a closer relationship of OEP80 to cyanoOmp85 than to Toc75 was supported by the degree of sequence identity (Eckart et al., 2002) and a phylogenetic analysis (Baldwin et al., 2005). Inoue and Potter used amino acid sequences for a portion of the β-barrel domains, corresponding to residues 711–818 of *A. thaliana* Toc75, as well as encoding nucleotide sequences for phylogenetic analyses (Inoue and Potter, 2004). They included Omp85 homologs from chloroplasts, mitochondria, cyanobacteria, and other Gram-negative bacteria (**Table S1**). Using maximum parsimony and Bayesian inference, they found that OEP80, Toc75, and cyanoOmp85 sequences grouped together although the interior relationships of the three to one another were not resolved. Moslavac et al. also analyzed Omp85 sequences from a variety of taxa although the number of sequences was small (Moslavac et al., 2005) (**Table S1**). They used maximum likelihood to infer a phylogeny and found that Toc75, OEP80 and cyanoOmp85 form a clade but with weak support. Once again, the relationship between the three groups of proteins within this clade was unresolved. More recently, an analysis using sequences from wider taxa within cyanobacteria and archaeplastida provided strong support for the grouping of Toc75 and OEP80 nested within the cyanobacterial Omp85 clade (Topel et al., 2012). The authors performed a Bayesian analysis using an alignment of the portion corresponding to residues 651–818 of *A. thaliana* Toc75. Their ability to find support for an alternative topology was limited because they rooted the tree to the sequence from the cyanobacterium *Gloeobacter violaceus*, and no outgroup from non-photosynthetic organisms was included. 
Nonetheless, the obtained results were consistent with the hypothesis that Toc75 and OEP80 emerged as the result of a duplication of the cyanoOmp85 homolog that was present in the cyanobacterial endosymbiont.

In order to address the inconsistency among available results, we re-examined the phylogenetic relationships of Toc75, OEP80, and cyanoOmp85 using wider taxon selection than previous studies. We initially included well-studied Omp85 homologs from other taxa as outgroups, namely BamA, TamA, and TpsB from Gram-negative bacteria and Sam50/Tob55 from mitochondria.

We first used the alignment of sequences corresponding to residues 438–818 of *A. thaliana* Toc75. This region includes P3-β3 and the entire transmembrane β-barrel, which was shown to be well conserved among various homologs (Reumann et al., 1999). To increase the reliability of the analysis, we removed ambiguously aligned regions using the Gblocks server (Castresana, 2000). Bayesian inference using this alignment provided strong support for a monophyletic relationship of the chloroplast Omp85 homologs and cyanoOmp85 (*PP* = 0.908). The analysis grouped OEP80 with cyanoOmp85 although the support for this grouping was low (*PP* = 0.561) (**Figure 3A**). The Toc75 orthologs from the green lineages, but not those from red algae, were placed sister to this clade although support for this relationship was also low (*PP* = 0.676). Similar to the previous report, we were able to identify only one chloroplast Omp85 homolog, which belongs to the Toc75 group, from each of the two Rhodophyta species (Topel et al., 2012) (**Figure 3A**). The tree topology was not entirely consistent with the established organismal classification. Within the Toc75 group, five sequences from the chlorophyte lineage were separated into three distinct groups [**Figure 3A**, indicated by (i)], and the sequence from the basal angiosperm *Amborella trichopoda* (Amborella Genome, 2013) was grouped with those from eudicots rather than being sister to the rest of the angiosperms [**Figure 3A**, indicated by (ii)]. In addition, sequences from the moss (*Physcomitrella patens*) and liverwort (*Marchantia polymorpha*) were grouped together as a clade sister to all vascular plant sequences [**Figure 3A**, indicated by (iii)]. In the case of the OEP80 clade, the location of the *A. trichopoda* sequence was consistent with the organismal classification. However, those of the algal sequences were not; instead they formed two groups. 
Interestingly, sequences from the moss, liverwort, a lycophyte (*Selaginella moellendorffii*), a streptophyte green alga (*Nitella mirabilis*) and three gymnosperms [white spruce (*Picea glauca*), loblolly pine (*Pinus taeda*), and Douglas fir (*Pseudotsuga menziesii*)] formed a clade sister to angiosperm non-truncated OEP80 sequences [**Figure 3A**, indicated by (iii)]. This result does not appear to follow the generally accepted relationships where (a) gymnosperms are grouped with angiosperms forming the seed plants, (b) lycophytes are grouped within vascular plants, (c) mosses are more closely related to vascular plants than to liverworts, and (d) land plants form a monophyletic group (Bowman, 2013). This may indicate large changes within the angiosperms and the absence of the species representing the intermediate state.
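
Restricting an alignment to the columns that correspond to a residue range of a reference sequence, as done here for residues 438–818 of *A. thaliana* Toc75, can be sketched as follows. This is a minimal Python illustration, not the procedure actually used in this study; the function names, the assumption that the aligned reference begins at residue 1, and the toy alignment are hypothetical.

```python
def columns_for_residue_range(aligned_ref: str, start: int, end: int, gap: str = "-"):
    """Map 1-based, inclusive residue numbers of the reference to alignment columns.

    Returns (first_column, stop_column) suitable for Python slicing.
    Assumption: the aligned reference sequence starts at residue 1.
    """
    if start < 1 or end < start:
        raise ValueError("invalid residue range")
    residue = 0
    first = None
    last = None
    for col, ch in enumerate(aligned_ref):
        if ch == gap:
            continue  # gaps do not advance the residue counter
        residue += 1
        if residue == start:
            first = col
        if residue == end:
            last = col
            break
    if first is None or last is None:
        raise ValueError("residue range exceeds the reference length")
    return first, last + 1


def slice_alignment(alignment: dict, ref_name: str, start: int, end: int) -> dict:
    """Cut every aligned sequence to the columns spanning the reference range."""
    first, stop = columns_for_residue_range(alignment[ref_name], start, end)
    return {name: seq[first:stop] for name, seq in alignment.items()}


# Hypothetical toy alignment; "ref" stands in for the reference sequence.
toy = {"ref": "MK-VLA-G", "other": "MRQV-ASG"}
print(slice_alignment(toy, "ref", 2, 4))
```

Note that columns in which the reference has a gap are retained when they fall inside the selected range, so the sliced alignment can be longer than the residue range itself.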

We wondered if the low support for the monophyletic relationship of OEP80 and Toc75 and inconsistency of the relationships between chloroplast Omp85 homologs with organismal classifications may be due to the choice of region for the sequence alignment. Thus, we used the portion of the sequence alignment corresponding to residues 648–818 of *A. thaliana* Toc75, which includes a C-terminal β-barrel covering TMβ9 and the C terminus and is almost identical to the region used in the previous study (651–818) (Topel et al., 2012). In order to be consistent with the previous study, we used the whole alignment without removing ambiguously-aligned regions. Similar to the first tree, the obtained tree supports a clear monophyletic grouping of Toc75, OEP80 and cyanoOmp85 (**Figure 3B**). In this analysis, Toc75 and OEP80 were placed in a clade sister to cyanobacterial homologs although the support was very low (*PP* = 0.522). Once again, cyanobacteria and the green lineage formed a clade excluding red algae, this time with higher support (0.818). The erroneous topology for OEP80 from green algae and non-flowering plants as well as Toc75 from green algae, *A. trichopoda,* moss and liverwort was consistent with the previous tree in **Figure 3A** [**Figure 3B**, indicated by (i–iii)].

Despite being based on an almost identical region as the previous study (Topel et al., 2012), our tree did not provide strong support for the grouping of Toc75 and OEP80 (**Figure 3B**). We wondered if this was due to the use of a different set of sequences: the previous work used sequences only from chloroplasts and cyanobacteria (Topel et al., 2012), while we included Omp85 homologs from mitochondria and bacteria other than cyanobacteria. In order to test this idea, we used sequences only from chloroplasts and cyanobacteria at the same region used in **Figure 3A**, removed ambiguously aligned segments, and generated a new tree (**Figure 3C**). The result showed strong support for a clade containing Toc75 and OEP80 [*PP* = 1.000, indicated with an asterisk (∗)] nested within the cyanobacterial groups, similar to the previous result (Topel et al., 2012). The result suggests that the Omp85 homologs from mitochondria and bacteria other than cyanobacteria interfere with the internal relationships of homologs from cyanobacteria and chloroplasts. Interestingly, the placement of the chloroplast group in the current result was inconsistent with that in the previous work (Topel et al., 2012). The previous tree placed chloroplast homologs sister to those from *Oscillatoria sp.* PCC 6506 and *Microcoleus vaginatus*, while our analysis placed the chloroplast group sister to proteins from all cyanobacteria except for those from *Gloeobacter violaceus* and two *Synechococcus* species [**Figure 3C**, indicated with a number sign (#)]. Our result is consistent with the predicted origin of chloroplasts deep within the cyanobacteria phylogeny (Criscuolo and Gribaldo, 2011). The difference between the previous and current results may be due to the number of cyanobacterial species used for the analyses—the previous study used sequences from five species, while our study used sequences from 26 species (**Table S1**). 
Topologies within OEP80 and Toc75 except for those of chlorophyta OEP80 orthologs in **Figure 3C** were largely identical to those in **Figure 3A**, indicating that the presence of the Omp85 homologs from various bacteria and mitochondria does not affect the relationships within chloroplast orthologs. Unlike in the case of **Figure 3A**, the chlorophyte OEP80 orthologs are grouped together in **Figure 3C** with moderate support (*PP* = 0.763), which is consistent with organismal relationships. The red algal sequences are now grouped with Toc75 (**Figure 3C**).

In addition to OEP80 itself, the genomes of *A. thaliana* and several other angiosperms include its truncated forms such as Ath-P1 (named OEP80tr) and Ath-P2. Because we could find OEP80tr orthologs only in angiosperms (**Figure 3** and **Table S1**), we hypothesized that the duplication leading to OEP80tr occurred after the divergence of angiosperms from extant gymnosperms. However, this hypothesis is not supported by any of the trees reported in this or the previous work (Topel et al., 2012). All of the available trees placed the OEP80tr clade sister to all the "full-length" OEP80 orthologs from streptophytes (**Figure 3**). The support for this topology was modest if not very low (*PP* between 0.74 and 0.78) (**Figure 3**). This result may suggest several independent losses of OEP80tr during evolution of streptophytes. Alternatively, this may be due to relaxed selection or extensive subfunctionalization of OEP80tr in angiosperms which caused a change in sequence large enough to confound our analysis.

**FIGURE 3 | Phylogenetic trees of Omp85 homologs.** Sequences of chloroplast Omp85 homologs (Toc75, OEP80, and OEP80tr), cyanoOmp85, and outgroups from other bacteria and mitochondria were aligned with MAFFT. Phylogenetic inference was done using Bayesian inference. Each tree shown is the 50% majority-rule consensus tree generated for the 3002 trees produced by two runs of two million generations each, sampled every 1000 generations with a 25% burn in. **(A)** Consensus tree using an alignment of amino acid sequences corresponding to residues 438 to 818 of *A. thaliana* Toc75 with poorly aligned regions removed by Gblocks. Numbers at nodes represent the proportion of trees in which a clade appeared, interpreted as the posterior probability (*PP*) of that clade. Discrepancies between our trees and the generally accepted plant species relationships are indicated as lower-case roman numerals (i–iii). **(B)** Consensus tree using an alignment of amino acid sequences corresponding to residues 648–818 of *A. thaliana* Toc75 with no interior section removed. **(C)** Consensus tree using an alignment of amino acid sequences excluding outgroup sequences corresponding to residues 438 to 818 of *A. thaliana* Toc75. The tree showed the chloroplast Omp85 homologs (∗) nesting within cyanobacteria (#).
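
The number of trees summarized in each consensus tree of **Figure 3** (3002) and the interpretation of clade frequencies as posterior probabilities can be illustrated with a short Python sketch. It assumes that the starting tree of each run is sampled and that the burn-in count is rounded down, which is one accounting that reproduces 3002; the function names and the toy clades are hypothetical and do not reproduce the software used for the analyses.

```python
from collections import Counter


def retained_samples(generations: int, sample_every: int, burnin_fraction: float) -> int:
    """Trees kept from one run after discarding the burn-in.

    Assumptions: the tree at generation 0 is sampled, and the number of
    discarded trees is the burn-in fraction of all samples, rounded down.
    """
    total = generations // sample_every + 1
    discarded = int(total * burnin_fraction)
    return total - discarded


def clade_support(trees):
    """Fraction of sampled trees containing each clade.

    Each tree is represented as a set of clades; each clade is a frozenset
    of taxon names. The fraction is read as the posterior probability (PP).
    """
    counts = Counter()
    n_trees = 0
    for clades in trees:
        n_trees += 1
        counts.update(set(clades))
    return {clade: count / n_trees for clade, count in counts.items()}


def majority_rule(support: dict, threshold: float = 0.5) -> dict:
    """Keep only clades present in more than `threshold` of the trees."""
    return {clade: pp for clade, pp in support.items() if pp > threshold}


per_run = retained_samples(2_000_000, 1000, 0.25)
print(per_run, 2 * per_run)

# Hypothetical toy sample of four trees with two competing groupings.
ab = frozenset({"Toc75", "OEP80"})
bc = frozenset({"OEP80", "cyanoOmp85"})
toy_trees = [{ab}, {ab}, {ab}, {bc}]
print(majority_rule(clade_support(toy_trees)))
```

In this toy sample the Toc75-OEP80 grouping appears in three of four trees (PP = 0.75) and is retained in the 50% majority-rule consensus, whereas the competing grouping (PP = 0.25) is collapsed.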

#### **CHLOROPLAST TARGETING OF Omp85 HOMOLOGS**

All known chloroplast Omp85 homologs are encoded in the nuclear genome. Toc75 is produced as a larger precursor with a cleavable bipartite targeting sequence in its N terminus. The first part is required for ATP-dependent import and removed by a stromal processing peptidase (Tranel and Keegstra, 1996). The second part contains a polyglycine stretch necessary for envelope sorting (Inoue and Keegstra, 2003) and is removed by envelope-located Plsp1 (Inoue et al., 2005; Shipman and Inoue, 2009; Midorikawa et al., 2014). Toc75-IV was shown to be targeted to chloroplasts *in vitro* without any change in mobility on SDS-PAGE and its membrane integration was independent of ATP (Baldwin et al., 2005). Similarly, an 80-kD protein consisting of residues 1–732 of OEP80 was targeted to the chloroplast membrane *in vitro* without changing the mobility on SDS-PAGE, and its targeting did not require ATP (Inoue and Potter, 2004). However, in the case of OEP80, a recent study demonstrated that the N-terminal portion corresponding to residues 1–52 does not exist in endogenous OEP80, which actually migrates around 70 kD on SDS-PAGE (Hsu et al., 2012). Currently the N-terminal sequence of OEP80 is unknown. Removal of the first 52 residues would yield a protein of 74 kD, leaving a possibility open that OEP80 contains a cleavable targeting sequence. In this scenario, the mechanism of envelope targeting of OEP80 should be distinct from that of Toc75 because OEP80 does not contain a polyglycine stretch. Finally, there has been no report of chloroplast targeting of OEP80tr orthologs.
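
The estimate that removal of the first 52 residues from the 80-kD protein yields a protein of about 74 kD can be checked with simple arithmetic. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in the text; the actual mass depends on amino acid composition, and apparent sizes on SDS-PAGE can deviate from calculated masses.

```python
# Assumption: ~110 Da per amino acid residue (rule-of-thumb average).
AVERAGE_RESIDUE_MASS_KD = 0.110


def mass_after_truncation(full_mass_kd: float, residues_removed: int,
                          per_residue_kd: float = AVERAGE_RESIDUE_MASS_KD) -> float:
    """Approximate mass (kD) left after removing residues from a protein."""
    return full_mass_kd - residues_removed * per_residue_kd


# 80-kD protein (residues 1-732 of OEP80) minus the first 52 residues.
estimate = mass_after_truncation(80.0, 52)
print(f"{estimate:.1f} kD")
```

Under this assumption the 52 residues account for roughly 5.7 kD, which is consistent with the 74-kD value given above and leaves a further difference of about 4 kD to the approximately 70-kD endogenous protein.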

We hypothesized that comparing targeting mechanisms of chloroplast Omp85 homologs may provide us with hints about their evolution. The immediate questions we had were as follows: Is OEP80tr targeted to the chloroplast outer membrane? Does the targeting of OEP80 and OEP80tr include processing and require ATP as in the case of Toc75? To test these, we conducted import assays using radiolabeled proteins synthesized by *in vitro* transcription and translation and chloroplasts isolated from pea seedlings. For OEP80, we used the DNA construct lacking the sequence encoding the first 52 residues for *in vitro* transcription because this portion is dispensable for proper expression and functioning of endogenous *OEP80* (Patel et al., 2008; Hsu et al., 2012). After the import reaction, intact chloroplasts were re-isolated and analyzed either directly (C) or after separation by lysis and centrifugation into soluble (S1), peripheral-membrane (S2), and integral-membrane (P) fractions. The fractionation was monitored by distribution of the abundant endogenous proteins located in the stroma [the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (LSU)] and thylakoids [light harvesting chlorophyll a/b binding protein (LHCP)] by Coomassie Brilliant Blue staining (**Figure 4A**, panel CBB), as well as the distribution of the newly-imported peripheral membrane protein Tic22 (Kouranov et al., 1998) (**Figure 4A**, panel Tic22). In order to test the location of the imported proteins, re-isolated chloroplasts were also treated with thermolysin, which has access only to the surface of the outer membrane (Cline et al., 1981), or trypsin, which can penetrate the outer but not the inner membrane of the chloroplast envelope (Jackson et al., 1998).

As shown in **Figure 4A**, all the Omp85 homologs except OEP80 were recovered mainly in the integral membrane fraction (lane 5). Imported Toc75 was degraded partially by thermolysin (**Figure 4B**, panel Toc75, compare lanes 2 and 3) and almost completely by trypsin (**Figure 4B**, panel Toc75, compare lanes 6 and 7). Under these conditions, both OEP80tr and Toc75-IV were largely degraded by both proteases (**Figure 4B**, panels OEP80tr and Toc75-IV, compare lanes 2 and 3, 6 and 7, respectively). The near-complete degradation of Toc75-IV by trypsin in our results appears to contradict the previous report showing its partial resistance to trypsin (Baldwin et al., 2005). This apparent inconsistency may be due to the low translation and import efficiency of Toc75-IV in our study (**Figure 4B**, panel Toc75-IV). Nonetheless, the obtained results support outer membrane localization of OEP80tr.

We also tested the ATP requirement for the import of chloroplast Omp85 homologs. Excluding ATP from the reaction disrupted import of Toc75 as evidenced by the increased level of the precursor (pr) and the decreased level of the intermediate (i) and mature form (m) (**Figure 5**, panel Toc75, compare lanes 2 and 6), but did not affect membrane integration of OEP80tr and Toc75-IV (**Figure 5**, panels OEP80tr and Toc75-IV, compare lanes 5 and 9). Interestingly, apyrase treatment disrupted import of Toc75 and Toc75-IV but did not affect that of OEP80tr (**Figure 5**, compare lanes 6 and 10; **Figure S2**). Again, our result with Toc75-IV appeared to contradict the conclusion of the previous study that chloroplast-association of Toc75-IV did not require ATP (Baldwin et al., 2005). This discrepancy may be due to the use of different methods to deplete nucleoside triphosphates (NTPs) in the reaction. The previous work used gel filtration to remove NTPs from the translation products (Baldwin et al., 2005). By contrast, our assay used apyrase which should remove NTPs not only from the translation products but also from the chloroplasts used for the assay. Thus, our data suggest that a slight amount of ATP is required for Toc75-IV targeting while it is not the case for the targeting of OEP80tr.

Quite interestingly, multiple bands were generated after import of OEP80 (**Figure 4A**, panel OEP80, lane 2). The band that migrated at the slowest rate corresponded to the 74-kD translation product (indicated as x) and was recovered in all three fractions (S1, S2, and P), while bands in the area corresponding to 66–71 kD (indicated as y) were found predominantly in the soluble fraction (S1) but were also detected in the integral membrane fraction (P) (**Figure 4A**, panel OEP80, lanes 3–5). The 74-kD protein was partially digested by thermolysin and trypsin (the band x in **Figure 4B**, panel OEP80, compare lanes 2 and 3, 6 and 7, respectively). By contrast, the proteins around 66–71 kD were resistant to both proteases (the area y in **Figure 4B**, panel OEP80, compare lanes 2 and 3, 6 and 7, respectively). Appearance of all the bands was enhanced by the presence of ATP: in particular, ATP appeared to enhance processing of the 74-kD protein to the smaller proteins around 66–71 kD (**Figure 5**, panel OEP80, compare lanes 2 and 6).

**FIGURE 4 | Import of chloroplast Omp85 homologs** *in vitro***. (A)** Chloroplasts isolated from pea seedlings were incubated with radiolabeled proteins indicated at left in the import condition for 10 min in the light. After the reaction, chloroplasts were re-isolated and divided into two samples. The first sample was loaded directly on SDS-PAGE (C). The second sample was lysed hypotonically and separated into supernatant (S1) and pellet fractions by centrifugation. The pellet fraction was further resuspended into 0.1 M Na2CO3 and separated into the supernatant (S2) and pellet (P) fractions by another centrifugation. The obtained S1, S2, and P fractions contained soluble, peripheral membrane, and integral membrane proteins, respectively. Radiolabeled proteins recovered in each fraction were separated by SDS-PAGE and visualized by phosphorimager analysis. Each lane was loaded with the sample equal to chloroplasts containing 3 μg chlorophylls used for the import assay, and tl was loaded with 10% of the translation products equivalent to those used for the import assay containing 3 μg chlorophyll-equivalent chloroplasts. For OEP80, the 74-kD OEP80 precursor and the 66–71 kD import products are indicated with x and y, respectively. For others, pr, i, and m indicate precursor, import intermediate, and mature forms, respectively. The Coomassie Brilliant Blue (CBB)-stained gel for OEP80tr is shown at the bottom. The major soluble protein of 50 kD (LSU = ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit) and the major membrane protein of 25 kD (LHCP = light-harvesting chlorophyll a/b binding protein) are indicated with arrowheads. **(B)** Chloroplasts isolated from pea seedlings were incubated with radiolabeled proteins indicated at left in the import condition for 30 min in the light. After the reaction, chloroplasts were re-isolated and treated for 30 min without (–) or with (+) thermolysin on ice or trypsin at room temperature. Digestion control was done by including detergent (Triton X-100 = TX100) in the reaction. After the reactions were quenched with EDTA (for thermolysin) or trypsin inhibitor (for trypsin), chloroplasts were re-isolated by a 40% Percoll cushion and the radiolabeled proteins in the resultant samples were examined as described in the legend to panel **(A)**. CBB-stained gels for OEP80tr are shown. See legend to panel **(A)** for descriptions of the labels.

We wondered about the relevance of these findings. How are the bands of 66–71 kD produced from the 74-kD "precursor"? Do any of these smaller bands correspond to endogenous OEP80? As a first step to answer these questions, we conducted an *in vitro* import chase assay. After 10 min of the import reaction in the dark, intact chloroplasts that mainly contained the radiolabeled precursor were re-isolated, resuspended in import buffer containing ATP, and analyzed immediately or after further incubation for 30 min. The result showed a correlation between the decreased intensity of the 74-kD band mainly in the soluble and peripheral membrane fractions (S1 and S2) and the increased intensity of the 66–71-kD band mainly in the soluble fraction (S1) with a small amount in the integral membrane fraction (P) after the chase (**Figure 6A**, panel OEP80, compare lanes 2–5 and 6–9). Under the conditions used, a similar pattern was observed for Toc75: after the chase, the precursor (pr) decreased while the intermediate (i) remained roughly constant and the mature form (m) increased (**Figure 6A**, panel Toc75, compare lanes 2–5 and 6–9). Together, these results provide further support for the 66–71-kD bands being derived from the 74-kD protein of the OEP80 translation product.

To test if any of the 66–71 kD-proteins correspond to endogenous OEP80, the mobility of radiolabeled proteins produced by the import assay was compared with that of immunoreactive endogenous OEP80 in isolated chloroplasts on SDS-PAGE (**Figure 6B**). We used a partially purified antibody against recombinant OEP80 generated in the previous study (Hsu et al., 2012). As shown in **Figure 6B**, panel αOEP80, this antibody preparation did not recognize the 74-kD OEP80 translation product (lane 1) but cross-reacted with a single protein around 66 kD in pea chloroplasts (lane 2) and two proteins around 63 and 70 kD in *A. thaliana* chloroplasts [lane 5, indicated as an asterisk (∗) and a number sign (#), respectively]. Lack of an immunoreactive band in the translation product (**Figure 6B**, lane 1) was attributed to the low amount of the protein in the sample which was nonetheless sufficient for detection by autoradiography (panel [35S]). The 66-kD immunoreactive protein in pea chloroplasts was considered to be the OEP80 ortholog because it was found only in the chloroplast membrane fraction (compare lanes 3 and 4 in **Figure 6B**, panel αOEP80) and its mobility appeared to correspond to what was reported in another study (Eckart et al., 2002). Among the two immunoreactive *A. thaliana* proteins, the 63-kD protein is a soluble protein of unknown identity (indicated with an asterisk ∗ in **Figure 6B**, lane 6) and the 70-kD membrane protein is endogenous OEP80 (indicated with a number sign # in **Figure 6B**, lane 7) as reported in the previous study (Hsu et al., 2012). Apparently, OEP80 orthologs from *A. thaliana* (70 kD) and pea (66 kD) migrated at different rates on SDS-PAGE when they were loaded separately, as clearly seen when the bands in neighboring lanes (pea OEP80 in lane 4 and *A. thaliana* OEP80 in lane 5) are compared in **Figure 6B**, panel αOEP80. Interestingly, however, the two orthologs co-migrated as a single band when the chloroplasts from *A. thaliana* and pea were loaded together (**Figure 6B**, panel αOEP80, lane 10). That is, the 66-kD pea OEP80 ortholog migrated around 70 kD together with *A. thaliana* OEP80. This may be due to the presence of endogenous proteins in *A. thaliana* chloroplasts other than OEP80 which migrated around 66 kD and affected the mobility of pea OEP80 and possibly that of *A. thaliana* OEP80. The smallest radiolabeled band from the import assay migrated around 66 kD when loaded with pea chloroplasts (included in the area y in lanes 2–4 in **Figure 6B**, panel [35S]). Notably, this protein appeared to migrate at the same rate as endogenous OEP80 in both pea and *A. thaliana* chloroplasts (**Figure 6B**, panel Overlay, white bands due to the overlap of the immunoreactive green bands and the radioactive magenta bands in lanes 2, 4, 8, and 10). Interestingly, this radiolabeled protein was found in both the soluble and membrane fractions (**Figure 6B**, panel [35S], lanes 3 and 4).

dark. Chloroplasts were re-isolated, fractionated and examined as described in **Figure 4A**.

A previous study demonstrated that a C-terminal tag remains intact and does not disrupt the functionality of OEP80 *in vivo* (Hsu et al., 2012). Together, our results suggest that OEP80 may carry an N-terminal extension that is cleaved during import with the presence of ATP, and the processed form may be first released into the aqueous phase before being integrated into the membrane.

**FIGURE 6 | Processing of OEP80 during chloroplast import** *in vitro***. (A)** After 10-min incubation with radiolabeled precursors in the import condition in the dark, pea chloroplasts were re-isolated through a Percoll cushion and divided into two samples. The first sample was immediately lysed and separated into soluble (S1), peripheral membrane (S2), and integral membrane (P) fractions as described in the legend to **Figure 4A**. The second sample was resuspended in import buffer containing 3 mM MgATP for 30 min at room temperature in the light then separated into the three fractions. The OEP80 precursor of 74 kD and imported products ranging from 66–71 kD are labeled as x and y; precursor, intermediate and mature forms of Toc75 are indicated with pr, i, and m, respectively. **(B)** Intact chloroplasts isolated from pea seedlings and incubated with radiolabeled OEP80 in the import condition for 30 min in light (i) or chloroplasts isolated from *A. thaliana* seedlings (ii) were analyzed directly (C) or lysed hypotonically and separated into soluble (S) and pellet (P) fractions by centrifugation. Proteins in each fraction were separated by SDS-PAGE and transferred to a PVDF membrane. The membrane was incubated with the partially-purified anti-OEP80 antibody and the immunoreactions were visualized by colorimetric assay using alkaline phosphatase (αOEP80), and the radiolabeled proteins in the membrane were visualized by phosphorimager analysis ([35S]). The two images were overlaid using ink spots and radiolabeled marking spots (Overlay). In Overlay, radioactive signals are indicated with magenta and the immunoreactive signals with green. The lane tl was loaded with translation products and pre-stained (blue) molecular weight markers. Endogenous OEP80 of 70 kD and the soluble 63-kD protein of unknown identity in the *A. thaliana* chloroplasts recognized by the anti-OEP80 antibody are indicated with a number sign (#) and an asterisk (∗); the OEP80 precursor of 74 kD and imported products ranging from 66–71 kD are labeled as x and y, respectively.

# **DISCUSSION**

The chloroplast originated from a cyanobacterial endosymbiont. The initial phase of chloroplast evolution most likely involved duplication and transfer of genes from the endosymbiont chromosome to the host nuclear genome. Establishment of a protein import system at the outer and inner membranes of the endosymbiont must have been needed to allow replacement of gene copies in the endosymbiont with the duplicated copies in the host, leading to the conversion of the cyanobacterial endosymbiont to the organelle. Available data suggest that ancestral cyanoOmp85 evolved into two essential proteins in chloroplasts, i.e., the core component of the protein import channel, Toc75, and a protein of unknown function, OEP80. Furthermore, the *A. thaliana* genome encodes truncated forms for the chloroplast Omp85 homologs. Results of sequence alignments and structural predictions presented in this study suggest that OEP80 may have a function similar to that of cyanoOmp85. Comprehensive phylogenetic analyses provide support for the grouping of OEP80 and Toc75 as a clade sister to the one including most cyanoOmp85 homologs as long as the analysis used sequences only from cyanobacteria and chloroplasts and the trees were rooted within cyanobacteria. The obtained results also suggest extensive diversification of Omp85 homologs that may interfere with reconstructions of their phylogenetic relationships. Finally, results of our import assay suggest that both Toc75 and OEP80 are processed post-translationally and their import requires ATP, while the truncated forms are integrated into the chloroplast membrane without processing although the energy requirement of the integration appears to differ between Toc75-IV and OEP80tr. Together, the current data suggest that chloroplast Omp85 homologs have diversified in both function and targeting multiple times throughout the evolution of land plants.

Our results suggest that the unique occlusion of the β-barrel by eL6 shown in bacterial Omp85 structures is conserved in cyanoOmp85 and chloroplast Omp85 homologs. This appears to contradict the idea that Toc75 provides a pore for free transport of precursor proteins. It is possible that eL6 does not reach the pore, or that its insertion is dynamic and tightly regulated in Toc75. This idea may be tested by combining site-directed mutagenesis with a genetic complementation assay established previously (Shipman-Roston et al., 2010) and with crystal structural analysis, which has never been done with chloroplastic Omp85 homologs. Results of these experiments should give valuable clues as to how the Omp85 homolog present in the ancestor of chloroplasts was converted to a chloroplast protein import channel, and how the import channel works.

Strong support for an OEP80-Toc75 clade nested in cyanobacteria was obtained only when distantly related sequences were excluded. This may be explained by the extensive diversification of Omp85, because the outgroup sequences we used had very low sequence identity with the cyanobacteria and chloroplast sequences (between 14–27% in the region used for phylogenetic analysis). To some degree this justifies their exclusion. Also, the specification of *Gloeobacter* as the root of the cyanobacteria and chloroplast tree is reasonable so long as we assume that both Toc75 and OEP80 are of cyanobacterial origin. One argument against this approach, however, is that this specific assumption is one of the questions that were being tested. Even with the sequences from mitochondria and proteobacteria, our results showed that the chloroplast and cyanobacterial sequences form a clade. However, we cannot exclude the possibility that either Toc75 or OEP80 entered plant genomes through horizontal gene transfer from a group of bacteria that is closely related to but distinct from extant cyanobacteria.
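The pairwise sequence identity mentioned above can be illustrated with a minimal sketch. It assumes two already-aligned sequences of equal length and counts identity only over columns where neither sequence has a gap; other definitions of the denominator exist, and the source does not state which one was used. The function name and the toy sequences are hypothetical and are not taken from the alignment in Figure S1.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are rows of the same alignment (equal length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment rows (not real Omp85 sequences)
toy_outgroup = "MKT-LLAGVS"
toy_chloroplast = "MRTALW-GIS"
print(f"identity = {percent_identity(toy_outgroup, toy_chloroplast):.1f}%")
```

Applied to each outgroup and ingroup pair over the region used for tree building, a calculation of this kind would give the range of identities against which the exclusion of distant sequences can be judged.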

The number of chloroplast Omp85 homologs in the model plant *A. thaliana* has increased as a result of duplications of Toc75 and OEP80, which gave rise to the homologs that lack POTRA domains during species diversification. Generation of Toc75-IV occurred recently, most likely shortly before the divergence of the Arabidopsis genus. By contrast, the duplication of OEP80 leading to OEP80tr homologs occurred at some point before the divergence of angiosperms or possibly that of land plants. A previous study showed a possible involvement of Toc75-IV in non-photosynthetic plastid development (Baldwin et al., 2005), while the function of OEP80tr remains unknown. The presence of OEP80tr in angiosperms, but not in other land plants, suggests that its biological role may be related to the function or the development of plastid types unique to angiosperms. Extensive analysis of publicly available *OEP80tr* T-DNA insertion mutants should help address this question and ultimately establish the relevance of the duplication event for generating POTRA-less Omp85, which is unique to chloroplasts.

Results of the *in vitro* import assay suggest that OEP80tr was targeted to the outer membrane without a cleavable targeting sequence. Although the size and localization of the endogenous protein need to be tested using a specific antibody, the obtained result suggests that Toc75-IV and OEP80tr may use a similar mechanism for membrane targeting and integration, although their energy requirements appear to differ. In addition, our results suggest that OEP80 may carry a cleavable targeting sequence in its N terminus. If so, it would be the second organelle outer membrane protein known to require a cleavable targeting signal. Available data suggest that Toc75 is never dissociated from the membrane, while our result suggests that OEP80 import may involve soluble intermediates. This could be due to the presence of the polyglycine stretch that may act as a membrane anchor in the case of Toc75. Also, if OEP80 goes through a soluble intermediate that is protected from proteases, it must be approaching the outer membrane from the inside of the chloroplast outer membrane. If the orientation of a β-barrel membrane protein is determined by the direction from which it approaches the membrane, then the orientation of OEP80 should be the same as that of bacterial Omp85 homologs, i.e., with both N and C termini facing the space between the outer and inner membranes. This hypothesis is at odds with a previous study, which showed that fluorescence from split GFP was seen at the edge of chloroplasts when a recombinant protein comprised of OEP80 with the 11th β-strand of GFP at its N terminus was co-expressed with cytosolically targeted GFP β-strands 1–10 (Sommer et al., 2011). They used an OEP80 construct that contained the N-terminal 52 amino acids that actually do not exist in the mature protein (Hsu et al., 2012). This extra section as well as the added portion of GFP may have prevented proper import of OEP80, thus leaving the entire protein in the cytosol.
These ideas are still preliminary, and need to be tested by extensive *in vitro* and *in vivo* analyses.

# **AUTHOR CONTRIBUTIONS**

Philip M. Day and Kentaro Inoue designed the experiments. Philip M. Day conducted most of the experiments. Daniel Potter conducted phylogenetic analysis. Philip M. Day and Kentaro Inoue wrote the paper. All authors were involved in finalization of the paper.

### **ACKNOWLEDGMENTS**

We thank Drs. Ryuichi Nishihama and Takayuki Kohchi for identifying sequences of chloroplast Omp85 homologs from *M. polymorpha*, the U.S. Department of Energy Joint Genome Institute for providing the genome and transcriptome sequences of *M. polymorpha*, Dr. Danny Schnell for the plasmid encoding Tic22, and Joshua Endow for critical reading of the manuscript. Philip M. Day was a recipient of the UC Davis Plant Sciences Graduate Student Research Assistantship, which supported part of his stipend, tuition, and fees. All other analyses and import experiments were supported by the Division of Molecular and Cellular Biosciences at the US National Science Foundation (Grant no. 1050602) to Kentaro Inoue.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00535/abstract

#### **Table S1 | Sequences used for this study.**

#### **Figure S1 | An alignment of sequences used for structure prediction and phylogenetic analysis.**

**Figure S2 | Energy requirement for import of truncated chloroplast Omp85 homologs** *in vitro***.**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 June 2014; accepted: 19 September 2014; published online: 13 October 2014.*

*Citation: Day PM, Potter D and Inoue K (2014) Evolution and targeting of Omp85 homologs in the chloroplast outer envelope membrane. Front. Plant Sci. 5:535. doi: 10.3389/fpls.2014.00535*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Day, Potter and Inoue. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Monogalactosyldiacylglycerol synthesis in the outer envelope membrane of chloroplasts is required for enhanced growth under sucrose supplementation

## *Masato Murakawa<sup>1†</sup>, Mie Shimojima<sup>2†</sup>, Yuichi Shimomura<sup>1</sup>, Koichi Kobayashi<sup>3</sup>, Koichiro Awai<sup>4,5</sup> and Hiroyuki Ohta<sup>2,6,7</sup>\**

*<sup>1</sup> Graduate School of Biological Sciences, Tokyo Institute of Technology, Yokohama, Japan*

*<sup>2</sup> Center for Biological Resources and Informatics, Tokyo Institute of Technology, Yokohama, Japan*

*<sup>3</sup> Graduate School of Arts and Sciences, Tokyo University, Tokyo, Japan*

*<sup>4</sup> Graduate School of Science, Shizuoka University, Shizuoka, Japan*

*<sup>5</sup> JST PREST, Tokyo, Japan*

*<sup>6</sup> Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, Japan*

*<sup>7</sup> JST CREST, Tokyo, Japan*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Peter Doermann, Universitaet Bonn, Germany*

*Rebecca Roston, Michigan State University, USA*

#### *\*Correspondence:*

*Hiroyuki Ohta, Center for Biological Resources and Informatics, Tokyo Institute of Technology, 4259-B65 Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan*

*e-mail: ohta.h.ab@m.titech.ac.jp*

*†These authors have contributed equally to this work.*

Plant galactolipid synthesis on the outer envelope membranes of chloroplasts is an important biosynthetic pathway for sustained growth under conditions of phosphate (Pi) depletion. During Pi starvation, the amount of digalactosyldiacylglycerol (DGDG) is increased to substitute for the phospholipids that are degraded for supplying Pi. An increase in DGDG concentration depends on an adequate supply of monogalactosyldiacylglycerol (MGDG), which is a substrate for DGDG synthesis and is synthesized by a type-B MGDG synthase, MGD3. Recently, sucrose was suggested to be a global regulator of plant responses to Pi starvation. Thus, we analyzed expression levels of several genes involved in lipid remodeling during Pi starvation in *Arabidopsis thaliana* and found that the abundance of *MGD3* mRNA increased when sucrose was exogenously supplied to the growth medium. Sucrose supplementation retarded the growth of the *Arabidopsis* MGD3 knockout mutant *mgd3* but enhanced the growth of transgenic *Arabidopsis* plants overexpressing MGD3 compared with wild type, indicating the involvement of MGD3 in plant growth under sucrose-replete conditions. Although most features such as chlorophyll content, photosynthetic activity, and Pi content were comparable between wild-type and the transgenic plants overexpressing MGD3, sucrose content in shoot tissues decreased and incorporation of exogenously supplied carbon to DGDG was enhanced in the MGD3-overexpressing plants compared with wild type. Our results suggest that MGD3 plays an important role in supplying DGDG as a component of extraplastidial membranes to support enhanced plant growth under conditions of carbon excess.

**Keywords: galactolipid, monogalactosyldiacylglycerol, MGDG, phosphate deficiency, sucrose**

#### **INTRODUCTION**

Phosphate (Pi) depletion is a serious problem for plant growth worldwide (Lynch, 2011; Kochian, 2012). Among several defense mechanisms that plants use to survive under conditions of Pi depletion, membrane lipid remodeling is common to various plants (Andersson et al., 2003, 2005; Gaude et al., 2004; Jouhet et al., 2004; Benning and Ohta, 2005; Russo et al., 2007; Tjellström et al., 2008; Lambers et al., 2012; Shimojima et al., 2013). In plant cells, galactolipids account for ∼50% of membrane lipids, a composition distinct from that of biological membranes in mammalian cells (Block et al., 1983; Joyard et al., 1998). Galactolipids are synthesized in chloroplasts and are predominantly located in the thylakoid membranes of chloroplasts under normal growth conditions (Block et al., 1983; Joyard et al., 1998). However, under Pi depletion, phospholipids are degraded to supply Pi for other essential biological processes, whereas galactolipids substitute for phospholipids in extraplastidial membranes (Essigmann et al., 1998; Härtel and Benning, 2000; Andersson et al., 2003, 2005; Jouhet et al., 2004; Nakamura, 2013).

Plants have two major types of galactolipids, namely monogalactosyldiacylglycerol (MGDG) and digalactosyldiacylglycerol (DGDG). MGDG and DGDG constitute ∼50 and ∼30% of chloroplast membrane lipids, respectively (Block et al., 1983; Joyard et al., 1998). These galactolipids are synthesized in the chloroplast envelope membrane (Douce, 1974). The type-A MGDG synthase, MGD1, in *Arabidopsis thaliana* (*At*MGD1) localizes on the inner envelope membrane and catalyzes the bulk of MGDG synthesis in chloroplasts (Awai et al., 2001; Xu et al., 2005). MGD1 is essential for MGDG synthesis and subsequent DGDG synthesis under normal growth conditions. Indeed, a knockdown mutant of *AtMGD1*, *mgd1-1*, exhibits lower chlorophyll content and photosynthetic activity compared with wild type (WT) and has a defect in thylakoid membrane development (Jarvis et al., 2000; Aronsson et al., 2008). Moreover, a knockout mutant of *AtMGD1*, *mgd1-2*, in which MGDG is decreased by 98% compared with WT, shows a severe defect in both embryogenesis and thylakoid membrane development (Kobayashi et al., 2007, 2012).

The other isoforms of MGDG synthase, the type-B MGDG synthases of *Arabidopsis*, namely *At*MGD2 and *At*MGD3, localize on the outer envelope membrane (Awai et al., 2001). Genes for type-B MGDG synthases in *Arabidopsis* were first discovered as the paralogs of *AtMGD1* (Awai et al., 2001). Under nutrient-replete growth conditions, expression of genes encoding type-B MGDG synthase is very low in vegetative tissues but is markedly upregulated during Pi starvation (Awai et al., 2001; Kobayashi et al., 2004). Unlike the case of the *AtMGD1* knockout mutant, the single-knockout mutant of *MGD2* or *MGD3* and the double-knockout mutant do not show a decrease in MGDG or DGDG production or any other particular phenotype different from WT plants grown under normal growth conditions (Kobayashi et al., 2009a). However, the *Arabidopsis mgd3* and *mgd2mgd3* mutants display severe growth retardation and a decrease in DGDG content under Pi depletion, clearly indicating that MGD3-mediated MGDG synthesis has an essential role in survival when Pi is scarce (Kobayashi et al., 2009a). Indeed, recent comprehensive phylogenetic analyses of genes that encode type-B MGDG synthases indicated that the family members are widely distributed in seed plants, suggesting that these genes might have been essential for plants to adapt to Pi deficiency (Kobayashi et al., 2009b; Ohta et al., 2012; Yuzawa et al., 2012). MGDG produced by type-B MGDG synthases is sequentially supplied as a substrate to the DGDG synthases, DGD1 and DGD2 (Härtel et al., 2000; Kelly and Dörmann, 2002; Kelly et al., 2003). The bulk of DGDG produced by galactolipid synthesis via type-B MGDG synthases is transferred to and accumulates in extraplastidial membranes, such as vacuoles, mitochondria, and the plasma membrane, and the presence of DGDG can substitute for phosphatidylcholines (PCs) (Essigmann et al., 1998; Härtel and Benning, 2000; Andersson et al., 2003, 2005; Jouhet et al., 2004; Nakamura, 2013).
The sequential events of galactolipid synthesis mainly occur on the outer envelope membranes of chloroplasts where the three enzymes MGD2, MGD3, and DGD2 are localized. Galactolipid synthesis in the outer envelope membrane may be advantageous for transferring DGDG to extraplastidial membranes at the site of contact (Jouhet et al., 2004).

The plant response to exogenously supplied sucrose is similar to its response to Pi starvation. For example, several genes encoding enzymes in carbohydrate metabolism are transcriptionally regulated by Pi content in plants (Nielsen et al., 1998; Ciereszko et al., 2001a,b), and the expression level of a Pi transporter is increased upon supplementation with sucrose (Lejay et al., 2003). Transcript profiling and expression analyses have revealed that the expression of many genes involved in sucrose metabolism is affected by Pi starvation, suggesting that there are interactions between Pi- and sugar-dependent gene regulation (Hammond et al., 2003; Vance et al., 2003; Wu et al., 2003; Misson et al., 2005; Müller et al., 2005, 2007; Hammond and White, 2008). The *Arabidopsis hypersensitive to phosphate starvation1* (*hsp1*) mutant, which overexpresses the sucrose transporter gene, *SUC2*, shows a Pi-starvation-like phenotype even under Pi sufficiency (Lei et al., 2011). Lei et al. (2011) showed that *MGD3* was one of the genes whose expression levels were increased in the *hsp1* mutant. Although the balance between carbon and Pi content in plant tissues is likely to be important for plant growth, the details remain unknown. Indeed, it has not been tested whether sucrose supplementation of the growth medium affects Pi starvation-induced MGDG synthesis. MGDG synthesis on the outer envelope membrane in chloroplasts is known to be important for supplying DGDG, which can substitute for PC in the extraplastidial membranes. Meanwhile, plant growth is known to be enhanced when sucrose is supplied to the growth medium, but the mechanism has not been fully unraveled. The aim of this study was to examine whether an increased supply of DGDG, resulting from upregulation of type-B MGDG synthesis by exogenously supplied sucrose and export of DGDG to the extraplastidial membranes, could be partially involved in the enhanced growth observed when sucrose is supplied to plants.
Here we produced *Arabidopsis* transgenic plants overexpressing *AtMGD3* and analyzed the effect of MGDG overproduction on the function of the chloroplast outer envelope membrane under normal growth conditions with or without sucrose supplementation.

# **RESULTS**

#### **EXPRESSION OF** *MGD2* **AND** *MGD3* **INCREASES UPON SUCROSE SUPPLEMENTATION**

Expression of *MGD1*, *MGD2*, *MGD3*, *DGD1*, *DGD2*, *NPC5*, *SUC2*, *IPS1*, and *At4* was assessed using *Arabidopsis* plants grown on Murashige and Skoog (MS) agar supplemented with sucrose or the same molar concentration of mannitol as the osmotic control (**Figure 1**). *SUC2* expression in shoots was higher when sucrose was supplied (**Figure 1A**). Sucrose supplementation also increased the expression of type-B MGDG synthase genes, *MGD2* and *MGD3*, by ∼2-fold compared with expression in the absence of sucrose. *NPC5* encodes non-specific phospholipase C5, which hydrolyzes PC (Gaude et al., 2008), and *DGD1* and *DGD2* encode proteins that synthesize DGDG (Härtel et al., 2000; Kelly and Dörmann, 2002; Kelly et al., 2003). Although *NPC5*, *DGD1*, and *DGD2* are responsive to Pi starvation and are involved in membrane lipid remodeling under Pi depletion (Härtel et al., 2000; Kelly and Dörmann, 2002; Kelly et al., 2003; Gaude et al., 2008), their expression in shoots was not significantly affected by sucrose supplementation (**Figure 1A**). *IPS1* and *At4* are nonprotein-coding genes that are strongly and specifically induced by Pi starvation but are not related to lipid synthesis (Martín et al., 2000; Rubio et al., 2001; Bari et al., 2006; Narise et al., 2010). Expression of both *IPS1* and *At4* in shoots was increased by sucrose supplementation.

In roots, expression profiles differed slightly from those in shoots (**Figure 1B**). Expression of *MGD2* and *MGD3* increased markedly in roots when sucrose was supplied. Moreover, expression of *DGD2* and *NPC5* increased when sucrose was supplied, whereas sucrose supplementation did not alter the expression of *SUC2*, *IPS1*, or *At4*. These results suggested that sucrose and Pi may regulate transcription in a mutually exclusive manner and that this regulation may be organ specific.
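The relative expression values in Figure 1 are normalized to the −suc condition for each gene. A minimal sketch of one common way to obtain such fold changes from quantitative RT-PCR data is given here, assuming the 2^−ΔΔCt method with a single reference gene. The source does not state which quantification method or reference gene was used, so the method, the function name, and all Ct values are hypothetical illustrations.

```python
from statistics import mean, stdev


def fold_changes(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Per-replicate fold change relative to the control condition (2^-ddCt).

    Assumptions: one reference gene, equal amplification efficiency for both
    genes, and replicate lists of equal length within each condition.
    """
    # Mean dCt (target minus reference) of the control (-suc) condition
    ctrl_dct = mean(t - r for t, r in zip(ct_target_ctrl, ct_ref_ctrl))
    # ddCt for each treated replicate, converted to a fold change
    return [2.0 ** -((t - r) - ctrl_dct) for t, r in zip(ct_target, ct_ref)]


# Hypothetical Ct values from three independent measurements
plus_suc_target = [24.1, 23.9, 24.0]   # target gene, +suc
plus_suc_ref = [18.0, 18.1, 17.9]      # reference gene, +suc
minus_suc_target = [25.0, 25.1, 24.9]  # target gene, -suc
minus_suc_ref = [18.0, 18.0, 18.0]     # reference gene, -suc

folds = fold_changes(plus_suc_target, plus_suc_ref, minus_suc_target, minus_suc_ref)
print(f"relative expression = {mean(folds):.2f} ± {stdev(folds):.2f} (mean ± SD)")
```

With this normalization the −suc condition has a relative expression of 1 by construction, so a value near 2 corresponds to the ∼2-fold increase described for *MGD2* and *MGD3* in shoots.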

#### **SUCROSE SUPPLEMENTATION RETARDS GROWTH OF A** *MGD3* **KNOCKOUT MUTANT COMPARED WITH WT**

Supplementation of growth medium with sucrose promotes plant growth (Karthikeyan et al., 2007). To clarify whether type-B MGDG synthases are involved in the growth enhancement observed under sucrose supplementation, fresh weight of shoots and roots of a *MGD3* knockout mutant (*mgd3*) (Kobayashi et al., 2009a) was compared with that of WT (**Figure 2**). Without sucrose supplementation, the fresh weight of roots and shoots of *mgd3* was similar to that of WT (**Figures 2A,B**). When sucrose was supplied, however, fresh weight of shoot and root of *mgd3* was ∼10 and ∼14% lower compared with WT, respectively (**Figures 2A,B**). These data indicated that MGD3 plays a role in seedling growth under sucrose supplementation.
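The percentage differences in fresh weight quoted above are simple ratios of group means. The following sketch shows the calculation, assuming the difference is expressed relative to the WT mean; the function name and the fresh-weight values are hypothetical and are not data from Figure 2.

```python
from statistics import mean


def percent_lower_than_wt(mutant_values, wt_values):
    """Percentage by which the mutant mean is lower than the WT mean."""
    wt_mean = mean(wt_values)
    if wt_mean == 0:
        raise ValueError("WT mean must be non-zero")
    return 100.0 * (wt_mean - mean(mutant_values)) / wt_mean


# Hypothetical shoot fresh weights (mg per plant) under +suc
wt_shoot = [10.2, 9.8, 10.0]
mgd3_shoot = [9.1, 8.9, 9.0]
print(f"mgd3 shoots are {percent_lower_than_wt(mgd3_shoot, wt_shoot):.1f}% lighter than WT")
```

A negative return value would indicate that the mutant mean is higher than the WT mean.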

#### **GENERATION OF TRANSGENIC** *Arabidopsis* **PLANTS OVEREXPRESSING** *At***MGD3**

To determine the role of MGD3 in plant growth under sucrose supplementation, we produced transgenic *Arabidopsis* plants that overexpressed *At*MGD3 fused to green fluorescent protein (GFP) and selected two overexpression (OE) lines (OE3 and OE7) for further analyses. In shoots, *MGD3* mRNA levels in OE3 and OE7 were ∼130- and ∼480-fold higher, respectively, than in WT (**Figure 3A**). In roots, *MGD3* mRNA levels in OE3 and OE7 were ∼100- and ∼450-fold higher, respectively, than in WT (**Figure 3A**).

Using antibodies against GFP, we also analyzed protein levels and subcellular localization in these transgenic plants (**Figures 3B–D**). In both OE3 and OE7, *At*MGD3-GFP was expressed in shoots and roots (**Figure 3B**). Moreover, the levels were higher in OE7 than in OE3. Thus, we used OE7 for further analyses. The subcellular localization of *At*MGD3-GFP in OE7 plants was analyzed after fractionation of the crude extract of shoots. In **Figure 3C**, crude extract was first centrifuged at low speed to obtain a thylakoid membrane-enriched fraction (**Figure 3C**, lane 1), and then the supernatant (**Figure 3C**, lane 2) was centrifuged at high speed to obtain two distinct fractions: the soluble fraction, which is enriched with soluble proteins (**Figure 3C**, lane 4), and the membrane fraction, which is enriched with microsomal membranes (**Figure 3C**, lane 3). The results showed that *At*MGD3-GFP mainly localized in the microsomal membrane fraction, which is enriched with chloroplast envelope membranes and extraplastidial membranes (**Figure 3C**, lane 3). In **Figure 3D**, crude extract of OE7 (**Figure 3D**, lane 1) was centrifuged at low speed to obtain a chloroplast-enriched fraction (**Figure 3D**, lane 3) and the supernatant, in which soluble proteins, chloroplast envelope membranes, and extraplastidial membranes are major components (**Figure 3D**, lane 2). In **Figure 3D**, one of the light-harvesting chlorophyll a/b-binding (LHCB) proteins, LHCB6, was used as a marker protein of thylakoid membranes. Most of the *At*MGD3-GFP protein localized in the envelope membrane-enriched fraction (**Figure 3D**, lane 2), although a small amount of *At*MGD3-GFP was still observed in the chloroplast-enriched fraction (**Figure 3D**, lane 3).

sucrose (+suc) or without sucrose [0.53% (w/v) mannitol as the osmotic control; −suc] for 7 d were then transferred to ½MS agar with or without sucrose, respectively, for 7 d. Relative mRNA abundance of genes upregulated during Pi starvation (*MGD2*, *MGD3*, *DGD1*, *DGD2*, *NPC5*, *IPS1*, and *At4*), a sucrose transporter gene (*SUC2*), and a Pi-non-responsive galactolipid synthase gene (*MGD1*) in shoots **(A)** and roots **(B)** was analyzed by quantitative RT-PCR. Relative expression was normalized to the mRNA abundance of each respective gene under conditions without sucrose (−suc). Values represent the mean ± SD from three independent measurements. <sup>∗</sup>*P* < 0.05.

We also analyzed the galactolipid synthetic activity using the microsomal fractions obtained from WT and OE7 plants (**Figure 3E**). 14C-labeled UDP-galactose was used as a substrate for galactolipid synthesis. The microsomal fraction of OE7 plants showed higher levels of 14C-labeled DGDG than that of WT, whereas levels of 14C-labeled MGDG were not significantly different between WT and OE7 (**Figure 3E**). Given that MGDG produced by MGD3 on the outer envelope membrane of chloroplasts is sequentially supplied as a substrate to DGDG synthases, the result suggested that *At*MGD3-GFP localizes to the outer envelope membrane, as previously reported (Shimojima et al., 1997; Awai et al., 2001).

#### **SHOOT FRESH WEIGHT OF OVEREXPRESSING** *At***MGD3 TRANSGENIC PLANTS IS HIGHER THAN THAT OF WT WHEN SUCROSE IS SUPPLIED**

When grown on medium supplemented with sucrose, shoot fresh weight of OE3 and OE7 was ∼13 and ∼14% higher than that of WT, respectively (**Figures 4A,B**). When grown on medium without sucrose, however, shoot fresh weight was comparable between WT and OE7 (**Figures 4A,B**), whereas OE3 showed ∼26% higher shoot fresh weight compared with WT. Root fresh weight was similar between WT and OE7 regardless of sucrose availability (**Figure 4C**), whereas OE3 showed higher root fresh weight compared with WT when the growth medium was not supplemented with sucrose (**Figure 4D**). These results suggest that OE of MGD3 enhances shoot growth when sucrose is supplied to the growth medium.

#### **RELATIVE AMOUNT OF DGDG IN MEMBRANE LIPIDS IS HIGHER IN OE7 THAN IN WT IN BOTH SHOOTS AND ROOTS**

Although the effect of Pi starvation on membrane lipid composition has been well studied, the composition under sucrose supplementation has not been assessed. Thus, we assessed membrane lipid composition in shoots and roots of WT and OE7 plants grown on medium containing sucrose (**Figure 5A**). In shoots of WT plants, the molar ratio of DGDG in the total membrane lipids was increased when sucrose was supplemented, whereas that of MGDG was decreased (**Figure 5A**). Although the level of *MGD3* mRNA in OE7 under sucrose supplementation was markedly higher than that observed in WT plants grown under Pi depletion (**Figure 3A**; Narise et al., 2010), only a small increase in DGDG mol% was observed in OE7 shoots compared with WT shoots (**Figure 5A**). Thus, we also analyzed lipid composition of the microsomal fraction extracted from shoots of WT and OE7 supplemented with sucrose (**Figure 5B**). The microsomal fraction was obtained using the same method as described in **Figure 3C**. In WT plants grown under normal growth conditions, MGDG mainly localizes in the thylakoid membrane and the molar ratio of MGDG to DGDG is 2:1, as shown in **Figure 5A**. In the microsomal fraction, the proportion of MGDG in the total lipids (∼16 mol%) was comparable between WT and OE7 (**Figure 5B**), indicating that the amounts of thylakoid membrane included in the fractions were similar between WT and OE7. However, the microsomal fraction of OE7 contained a higher proportion of DGDG (∼13 mol%) than that of WT (∼8 mol%, **Figure 5B**). These results suggest that the additional DGDG in OE7 plants was translocated to extraplastidial membranes. The increase in DGDG mol% in OE7 relative to WT was more pronounced in roots than in shoots (**Figure 5C**). When sucrose was supplied to WT plants, transcript levels of *NPC5* and *DGD2* were significantly increased in roots but not in shoots (**Figures 1A,B**).
These results together suggested that the marked increase in DGDG mol% in plants requires a simultaneous increase in transcript levels of not only *MGD2*/*MGD3* but also *NPC5* and *DGD2*. Regarding membrane lipid composition under Pi depletion, it is known that DGDG mol% increases whereas PC and phosphatidylethanolamine (PE) molar ratios decrease (Kobayashi et al., 2009a). Under sucrose supplementation, DGDG mol% increased, but a decrease in molar ratios of PC and PE was not observed in either shoots or roots (**Figures 5A,C**). We also analyzed the amount of total fatty acids, and confirmed that there was no difference in the amount of membrane lipids between WT and OE7 grown with or without sucrose (**Figures 5D,E**).

supplemented with +suc or −suc for 7 d were then transferred to ½MS agar with or without sucrose, respectively, and grown for […] plants for OE7 roots, and *n* = 20 groups of 3–5 plants for OE3 shoots and roots). <sup>∗</sup>*P* < 0.05.
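The lipid compositions in this section are expressed as mol%, i.e., the molar amount of each lipid class as a percentage of the total of all classes measured. A minimal sketch of that conversion follows, assuming per-class molar amounts are already available. The function name, the set of lipid classes, and the amounts are hypothetical and are not the data of Figure 5.

```python
def mol_percent(lipid_nmol: dict[str, float]) -> dict[str, float]:
    """Convert molar amounts per lipid class into mol% of the total."""
    total = sum(lipid_nmol.values())
    if total <= 0:
        raise ValueError("total lipid amount must be positive")
    return {name: 100.0 * amount / total for name, amount in lipid_nmol.items()}


# Hypothetical molar amounts (nmol) in one microsomal fraction
sample = {"MGDG": 16.0, "DGDG": 13.0, "PC": 40.0, "PE": 20.0, "other": 11.0}
for lipid, value in mol_percent(sample).items():
    print(f"{lipid}: {value:.1f} mol%")
```

Because mol% values sum to 100, an increase in one class necessarily lowers the share of the others even when their absolute amounts are unchanged, which is why the total fatty acid measurement is a useful complement.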

#### **FREE INORGANIC PI CONTENT IS COMPARABLE BETWEEN WT AND OE7 PLANTS**

Under Pi depletion, plant cells utilize phosphorus by degrading phospholipids in biological membranes, and DGDG compensates for loss of phospholipids in the membranes; thus, increased amounts of available Pi in cells are utilized for other essential biological processes. To test if the increase in DGDG mol% might affect the levels of available Pi in cells without changing phospholipid levels, we measured free inorganic Pi content in WT and OE7 plants (**Figure 6A**). In both shoots and roots, WT and OE7 contained the same amount of Pi, clearly showing that the increase in DGDG mol% in the membrane does not affect the concentration of free Pi in the cells.

#### **CHLOROPHYLL CONTENT AND PHOTOSYNTHETIC ACTIVITY ARE SIMILAR BETWEEN WT AND OE7 PLANTS**

Regardless of sucrose supplementation, chlorophyll content per seedling did not differ significantly between WT and OE7 (**Figure 6B**). We also measured the photosynthetic activity (relative to chlorophyll content) in WT and OE7 plants grown on sucrose-supplemented medium (**Table 1**). No significant difference was observed between WT and OE7, suggesting that the enhanced growth of OE7 under sucrose supplementation was not due to higher photosynthetic activity compared with WT.

#### **SUCROSE CONTENT IN SHOOTS IS LOWER IN OE7 THAN IN WT UNDER SUCROSE SUPPLEMENTATION**

We observed enhanced growth of OE7 plants only when sucrose was supplied in the growth medium (**Figure 4A**). Sucrose is a major mobile form of photoassimilates, but there were no significant differences in photosynthetic activity between WT and OE7 (**Table 1**). Thus, we also analyzed expression levels of genes involved in the cell cycle and trehalose-6-phosphate metabolism (**Figure 7**). Trehalose-6-phosphate metabolism and its content in plants are known to be related to growth enhancement under sucrose-supplemented conditions (Zhang et al., 2009; Debast et al., 2011; Delatte et al., 2011; Martínez-Barajas et al., 2011). However, expression levels of these genes were comparable between WT and OE7 (**Figures 7A,B**). Thus, to clarify whether the growth difference was due to the uptake efficiency of the exogenously supplied sucrose, we first measured the sucrose concentration in shoots and roots of WT and OE7 in the absence or presence of sucrose (**Figure 8**). In both shoots and roots, the sucrose concentration in plants grown without exogenous sucrose was comparable between WT and OE7 (**Figure 8**). When sucrose was supplied, its concentration in WT and OE7, especially in shoots, was higher than that in plants grown without sucrose (**Figure 8A**). In shoots of OE7 and WT grown with sucrose, sucrose concentration was 1.7- and 2.2-fold higher, respectively, compared with OE7 and WT grown without sucrose (**Figure 8A**). As a result, the sucrose concentration in shoots of OE7 was ∼26% lower than that of WT only under sucrose supplementation. Sucrose supplementation did not significantly affect the sucrose content in roots (**Figure 8B**).

**Table 1 | Chlorophyll fluorescence parameters for WT and OE7.**

*Plants were grown on ½MS agar supplemented with 1% (w/v) sucrose for 2 weeks. Chlorophyll fluorescence parameters represent the mean* ± *SD of three replicates. qP, photochemical quenching; NPQ, non-photochemical quenching.*
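The relationship between the fold increases in shoot sucrose concentration (1.7-fold for OE7, 2.2-fold for WT) and the difference between the two genotypes under sucrose supplementation can be checked with a short calculation. The sketch assumes that each +suc concentration equals the −suc baseline multiplied by the fold increase; `baseline_ratio` (OE7 baseline divided by WT baseline) is a hypothetical parameter, since the text states only that the baselines were comparable.

```python
def percent_lower_with_sucrose(fold_oe: float, fold_wt: float,
                               baseline_ratio: float = 1.0) -> float:
    """Percentage by which the OE +suc concentration is below the WT +suc value.

    baseline_ratio is the assumed ratio of -suc concentrations (OE / WT).
    """
    return 100.0 * (1.0 - baseline_ratio * fold_oe / fold_wt)


# Identical baselines assumed (baseline_ratio = 1.0)
equal_baselines = percent_lower_with_sucrose(1.7, 2.2)
# Hypothetical case in which the OE7 baseline is 4% below the WT baseline
lower_oe_baseline = percent_lower_with_sucrose(1.7, 2.2, baseline_ratio=0.96)
print(f"{equal_baselines:.1f}% lower with identical baselines")
print(f"{lower_oe_baseline:.1f}% lower with a 4% lower OE7 baseline")
```

Under the identical-baseline assumption the difference comes out at roughly 23%, so the reported ∼26% is consistent with baselines that are comparable but not identical, together with rounding of the fold values.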

#### **GALACTOLIPID SYNTHESIS USING EXOGENOUSLY SUPPLIED SUCROSE IS ENHANCED IN OE7 PLANTS**

To clarify the reason for the observed decrease in sucrose concentration in shoots of OE7 plants under sucrose supplementation, we measured the uptake of [14C]sucrose. Unexpectedly, the level of label in seedlings did not differ significantly between WT and OE7, suggesting that the efficiency of sucrose uptake did not differ between WT and OE7 (**Figure 9A**). However, the relative amount of 14C incorporated into each membrane lipid clearly indicated that sucrose absorbed from roots was immediately utilized as a carbon source for membrane lipid synthesis, and the ratio of 14C in DGDG relative to all labeled lipids was slightly higher in OE7 than in WT (**Figure 9B**). These results suggested that sucrose uptake activity of OE7 is similar to that of WT, but that galactolipid biosynthesis using exogenously supplied sucrose as a carbon source was enhanced in OE7 compared with WT.

# **DISCUSSION**

When sucrose was exogenously supplied to the growth medium, the expression of *MGD2* and *MGD3* as well as *SUC2*, *IPS1*, and *At4* was upregulated in shoots (**Figure 1A**). However, expression levels of three genes responsive to Pi deficiency and also involved in lipid synthesis (*DGD1*, *DGD2*, and *NPC5*) were not significantly changed in shoots (**Figure 1A**). Our result agrees with previous studies reporting that the change in gene expression induced by sucrose supplementation is similar to that induced by Pi deficiency (Hammond et al., 2003; Vance et al., 2003; Wu et al., 2003; Misson et al., 2005; Müller et al., 2005, 2007; Hammond and White, 2008). However, this similarity is not a common feature of all membrane-remodeling genes responsive to Pi deficiency. The *Arabidopsis hsp1* mutant, in which sucrose content is higher than in WT in both shoots and roots, shows a hypersensitive phenotype in response to Pi starvation, suggesting that sucrose is a global regulator of plant responses to Pi starvation (Lei et al., 2011). Indeed, microarray analysis of the *hsp1* mutant has revealed the induction of ∼70% of the Pi starvation-responsive genes in WT (Lei et al., 2011). *MGD3* is one of the top 20 genes that are synergistically induced by Pi starvation in the *hsp1* mutant, although expression of *DGD1* and *DGD2* is comparable between WT and *hsp1* and *NPC5* is 5-fold lower in *hsp1* than in WT (Lei et al., 2011). Under Pi depletion, a simultaneous increase in expression levels of these genes occurs in shoots, which is essential for membrane lipid remodeling (Benning and Ohta, 2005; Nakamura, 2013). Thus, although the time course of gene expression should be further analyzed, the responses to exogenously supplied sucrose and Pi starvation do not appear to be fully correlated.

Growth of transgenic plants that overexpress *MGD3* was enhanced compared with WT, whereas growth of the *MGD3* knockout mutant *mgd3* was slower than WT under sucrose supplementation (**Figures 2**, **4A,B**). Unexpectedly, although *MGD3* expression was significantly higher in OE7 than in WT (**Figure 3A**), the DGDG molar ratio in membrane lipids was only slightly different between OE7 and WT (**Figure 5**). Indeed, the level of *MGD3* mRNA is markedly higher in OE7 than in WT plants grown under Pi depletion (Narise et al., 2010). In shoots of WT grown under Pi depletion, the molar ratio of DGDG in the total membrane lipids is ∼15% higher than in plants grown under Pi-sufficient conditions (Kobayashi et al., 2009a). The effect of *MGD3* OE on membrane lipid composition may have been smaller than we expected because of the lack of coactivation of genes/enzymes involved in DGDG synthesis. When DGDG synthesis is not activated, MGDG may accumulate in the membrane. However, MGDG is not a bilayer-forming lipid in the membrane, and thus MGDG hyperaccumulation might be cytotoxic (Murphy, 1986). Indeed, rough cell membrane surfaces and defects in cell division are observed in *Escherichia coli* that accumulate MGDG (Gad et al., 2001). Our results suggest that a feedback mechanism or other unknown mechanisms might exist to prevent hyperaccumulation of MGDG in membranes.

Inorganic Pi content in plant cells (**Figure 6A**), photosynthesis (**Table 1**), and cell division–related gene expression (**Figure 7A**) were not significantly different between WT and OE7 plants. We also measured the sucrose content in shoots and found that OE7 contained less sucrose than WT (**Figure 8**). Given that sucrose uptake by OE7 and WT plants was similar (**Figure 9**), upregulation of galactolipid synthesis mediated by type-B MGDG synthases in the outer envelope membrane of chloroplasts may correlate with accelerated sugar utilization as a carbon source only when the carbon source is supplied exogenously.

When carbon availability is elevated by sucrose supplementation, trehalose 6-phosphate (T6P) content increases (Schluepmann et al., 2004; Lunn et al., 2006). T6P inhibits the catalytic activity of the SNF1-related protein kinase SnRK1 in growing tissues of plants (Zhang et al., 2009; Debast et al., 2011; Delatte et al., 2011; Martínez-Barajas et al., 2011). Inhibition of SnRK1 blocks expression of more than 1000 genes involved in biosynthesis, growth, and stress responses (Baena-González et al., 2007; Nunes et al., 2013). *MGD3* is not included among the genes regulated by SnRK1, suggesting that its transcriptional regulation by exogenously supplied sucrose is distinct from the SnRK1-related regulation mechanism. Moreover, expression levels of several genes involved in T6P metabolism and the SnRK1-mediated signaling pathway were comparable between WT and OE7 (**Figure 7B**), showing that there was no correlation between upregulation of MGDG synthesis and SnRK1-mediated stress responses. Thus, upregulation of MGD3 appears to be involved in sucrose metabolism and growth enhancement under sucrose supplementation in a manner different from that mediated by T6P and SnRK1.

Recently, it was suggested that balance between available Pi and carbon content might be important for the response to Pi starvation (Lei and Liu, 2011). Given that *MGD3* expression levels were increased by Pi starvation and sucrose supplementation (**Figure 1**) and that sucrose content was lower in shoot tissues of OE7 than of WT when sucrose was supplied (**Figure 8A**), galactolipid synthesis on the outer envelope membrane of chloroplasts might play the following two roles: (1) maintenance of the ratio of available Pi and carbon in plant tissues by reducing the cellular sucrose content via galactolipid synthesis in Pi-depleted growth medium, (2) supply of DGDG as a component of the plasma membrane to support enhanced growth under sucrose supplementation. Regardless of which scenario is correct, future work will be needed to confirm a new role for MGD3 other than for galactolipid supply during lipid remodeling under Pi depletion.

# **MATERIALS AND METHODS**

#### **PLANT MATERIAL AND GROWTH CONDITIONS**

Seedlings of WT *A. thaliana* (Columbia-0), the *mgd3* mutant, and transformants overexpressing *At*MGD3-GFP protein were grown on Murashige and Skoog (MS) medium (Murashige and Skoog, 1962) solidified with 0.8% (w/v) agar containing 1% (w/v) sucrose or 0.53% (w/v) mannitol, for the osmotic control, at 23 °C under continuous white light.

MGD3 OE transformants were produced by a modified version of the vacuum-infiltration method (Bechtold and Pelletier, 1998) using pBI121 in which the *AtMGD3* cDNA sequence and *GFP* tag were inserted under control of the 35S-CaMV promoter, and transformants were then selected on MS agar containing 50 μg·mL<sup>−1</sup> kanamycin.

#### **QUANTITATIVE RT-PCR**

Total RNA was extracted from plant shoots and roots using the SV Total RNA Isolation System (Promega). Reverse transcription (RT) was performed using the PrimeScript RT reagent kit (Takara). PCR was conducted using SYBR Premix Ex Taq II (Takara), and signals were detected and quantified using the Thermal Cycler Dice Real Time System (Takara). Quantitative RT-PCR was carried out as follows. Each PCR reaction mixture (25 μL) contained cDNA (RT product from 6 ng of RNA), 10 μL SYBR Premix Ex Taq II, and 0.4 μM of each primer. Samples were run under the following thermal cycling protocol, after which the dissociation curve was analyzed: a preheating step at 95 °C for 30 s; 40 amplification cycles of 95 °C for 5 s and 60 °C for 30 s; and 1 cycle of 95 °C for 15 s, 60 °C for 30 s, and 95 °C for 15 s. *AtUBQ10* (At4g05320) was used as an internal standard. The following primers were used:

| Primer | Sequence (5′ to 3′) |
| --- | --- |
| AtMGD3\_Fw | TCGTGGCGGATTGGTTTAG |
| AtMGD3\_Rv | CGTTGTTGTTGTTGGGATAGATG |
| AtMGD2\_Fw | GATTCGATCACTTCCTATCATCCTC |
| AtMGD2\_Rv | TGTGCTAAACCATTCCCCAAC |
| AtDGD1\_Fw | CTGAAGAGAGATCCCGTGGTG |
| AtDGD1\_Rv | TCCCAAGTTCGCTTTTGTGTT |
| AtDGD2\_Fw | TGCAGAACCTATGACGATGGA |
| AtDGD2\_Rv | GCTCTGTAAGTTGCGATGGTTG |
| AtNPC5\_Fw | TTCTTCATCTCCCCTTGGATTG |
| AtNPC5\_Rv | GTGACATTAGGTACGGCCCATT |
| AtSUC2\_Fw | TCCCTTTCCTTCTCTTCGACAC |
| AtSUC2\_Rv | CATAAGCCCCAAAGCACCA |
| AtIPS1\_Fw | AGACTGCAGAAGGCTGATTCAGA |
| AtIPS1\_Rv | TTGCCCAATTTCTAGAGGGAGA |
| At4\_Fw | CTGAAGCTCAAGAACCCTCTGAA |
| At4\_Rv | CCTCTCAAAACCCTTTATTGGTGA |
| AtMGD1\_Fw | AGGTTTCACTGCGATAAAGTGGTT |
| AtMGD1\_Rv | AACGGCAATCCCTCCTCAC |
| AtCYCD2;1\_Fw | GCTGCTGCAGTGTCTGTTTC |
| AtCYCD2;1\_Rv | ACAGCTCTTACCGCAACTCG |
| AtCYCD3\_Fw | CAACTACCAGTGGACCGCATC |
| AtCYCD3\_Rv | AATCACGCAGCTTGGACTGTT |
| AtCDKB2\_Fw | CCAATGAAGAAGTATACCCATGAGA |
| AtCDKB2\_Rv | AATGGGTGGCACCAAGAAG |
| AtCDKA1\_Fw | CCGAGCACCAGAGATACTCC |
| AtCDKA1\_Rv | GTTACCCCACGCCATGTATC |
| AtTPS5\_Fw | TCTCGGTTTGGGTGCAGAGCA |
| AtTPS5\_Rv | ACCAAACTCGACGTTTCCCAGTCT |
| AtTPS1\_Fw | ACCATAGTTGTTCTGAGCGGAAGCA |
| AtTPS1\_Rv | TCATCCACTCTCCATTCGTAAGCCT |
| AtTPPB\_Fw | GGGACAAGGGCCAGGCACTC |
| AtTPPB\_Rv | ACACCGGCACAACATCATCCGA |
| AtAKIN11\_Fw | CACCATTCCTGAGATCCGTCA |
| AtAKIN11\_Rv | GAGACAGCAAGATAACGAGGGAG |
| AtUBQ10\_Fw | CCCTAACGGGAAAGACGATTAC |
| AtUBQ10\_Rv | AAGAGTTCTGCCATCCTCCAAC |

Sequences of AtTPS5\_Fw and Rv, AtTPS1\_Fw and Rv, AtTPPB\_Fw and Rv, and AtAKIN11\_Fw and Rv were described by Nunes et al. (2013). Sequences of AtMGD2\_Fw and Rv, AtIPS1\_Fw and Rv, and At4\_Fw and Rv were described by Narise et al. (2010). Sequences of AtCYCD2;1\_Fw and Rv and AtCDKA1\_Fw and Rv were described by Sanz et al. (2011).
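The text states that *AtUBQ10* served as the internal standard but does not name the quantification model. The following is a minimal sketch assuming the comparative Ct (2^−ΔΔCt) method with roughly 100% primer efficiency. The function name and all Ct values are hypothetical and serve only to illustrate normalization against an internal standard.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change of a target gene by the comparative Ct method.

    ct_target / ct_ref: Ct of the target gene and of the internal
    standard (here assumed to be AtUBQ10) in the sample of interest.
    ct_target_cal / ct_ref_cal: the same two values in the calibrator
    sample (for example WT grown without sucrose).
    Assumes close to 100% amplification efficiency for both primer pairs.
    """
    delta_ct_sample = ct_target - ct_ref
    delta_ct_calibrator = ct_target_cal - ct_ref_cal
    return 2.0 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical Ct values, for illustration only (not measured data).
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_cal=26.0, ct_ref_cal=18.0)
print(f"fold change relative to calibrator: {fold:.1f}")
```

If primer efficiencies differ noticeably from 100%, an efficiency-corrected model would be a better fit than this simple form.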

#### **FRESH WEIGHT MEASUREMENT**

Shoots or roots from 3 to 5 plants were weighed together, and the average fresh weight of an individual plant was calculated. The mean ± SE was then calculated.
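The calculation described above can be sketched as follows. The pooled weights and plant counts are hypothetical, and the sketch assumes that each pooled measurement is treated as one replicate and that the SE is the sample standard deviation divided by the square root of the number of replicates.

```python
import math
import statistics


def per_plant_weight(pooled_mg, n_plants):
    """Average fresh weight of one plant from a pooled measurement."""
    return pooled_mg / n_plants


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of replicate values."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical pooled weights (mg) and plant counts, not data from the study.
pools = [(30.0, 3), (44.0, 4), (65.0, 5)]
per_plant = [per_plant_weight(weight, n) for weight, n in pools]
mean, se = mean_and_se(per_plant)
print(f"{mean:.2f} ± {se:.2f} mg per plant")
```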

#### **WESTERN BLOT ANALYSIS**

For western blotting, WT and OE plant samples were homogenized in 50 mM Tris-HCl, pH 7.5 and centrifuged at 3,000 × g to remove tissue debris, and each supernatant was used as crude extract. Crude protein (20 μg from shoots, 10 μg from roots) was subjected to SDS-PAGE (12.5% polyacrylamide), blotted onto a nitrocellulose membrane (Whatman), and incubated for 3 h at 23 °C with monoclonal anti-GFP (Clontech; diluted 1:5,000) and then with horseradish peroxidase–conjugated anti-mouse IgG secondary antibody (Thermo Scientific; diluted 1:100). Bands were detected using a chemiluminescent substrate (SuperSignal West Femto Chemiluminescent Substrate, Thermo Scientific) and film (Hyperfilm ECL, GE Healthcare).

To further assess subcellular localization, crude-extract proteins were centrifuged at 125,000 × g to yield soluble and microsomal membrane fractions. These fractions were then subjected to SDS-PAGE/western blotting as described above, and protein bands were detected using Image Quant LAS 500 (GE Healthcare).

For chloroplast purification, plants were homogenized with a blender in lysis buffer (50 mM HEPES-KOH, 330 mM sorbitol, 2.0 mM EDTA, 1.0 mM MgCl2, 1.0 mM MnCl2, pH 7.8) with Protease Inhibitor Cocktail (Roche; complete Mini), and centrifuged at 2,000 × g for 5 min to enrich chloroplasts as the pellet. Resuspended pellet and supernatant fractions were then subjected to SDS-PAGE/western blotting as described above. For LHCB6 detection, membranes were incubated for 3 h at 23 °C with monoclonal anti-LHCB6 (Agrisera; diluted 1:5,000) and then with peroxidase anti-rabbit IgG secondary antibody (Vector Laboratories; diluted 1:10,000). Protein bands were detected using Image Quant LAS 500 (GE Healthcare).

#### **MEASUREMENT OF GALACTOLIPID SYNTHASE ACTIVITY**

Plant microsomal fractions of WT and OE7 were obtained by centrifugation of homogenized plant shoots at 3,000 × g (supernatant) and 125,000 × g (pellet). Galactolipid synthetic activities were measured using 14C-labeled UDP-galactose as a substrate according to our previous reports with minor modifications (Yamaryo et al., 2003; Shimojima et al., 2013). Briefly, after pre-incubation of microsomal enzyme in 190 μL of assay mixture [6.4 mM dioleoylglycerol in 0.01% (w/v) Tween 20, 10 mM dithiothreitol, 10 mM sodium acetate, and 18 mM MOPS-KOH, pH 7.8] at 30 °C for 5 min, 10 μL of 14C-labeled UDP-galactose (8.08 mM, 91.6 Bq·nmol<sup>−1</sup>) was added to start the reaction. The reaction products were extracted in ethyl acetate, separated by thin-layer chromatography (solvent system, acetone:toluene:water = 136:45:13, v/v/v), and quantified using a fluoro-image analyzer (FLA-7000, Fujifilm).
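The volumes and specific radioactivity given above fix the final substrate concentration and the conversion from measured radioactivity to product amount. A minimal sketch of this arithmetic follows; the constants are taken from the text, whereas the function name and the example reading are hypothetical.

```python
# Values stated in the assay description.
ASSAY_MIX_UL = 190.0                    # pre-incubated assay mixture
UDP_GAL_UL = 10.0                       # labeled UDP-galactose added
UDP_GAL_STOCK_MM = 8.08                 # stock concentration
SPECIFIC_ACTIVITY_BQ_PER_NMOL = 91.6    # specific radioactivity

final_volume_ul = ASSAY_MIX_UL + UDP_GAL_UL
# Final UDP-galactose concentration after dilution into the reaction.
final_udp_gal_mm = UDP_GAL_STOCK_MM * UDP_GAL_UL / final_volume_ul
# mM x uL equals nmol, so this is the total substrate per reaction.
udp_gal_nmol = UDP_GAL_STOCK_MM * UDP_GAL_UL


def bq_to_nmol(bq):
    """Convert radioactivity in a lipid spot to nmol galactose incorporated."""
    return bq / SPECIFIC_ACTIVITY_BQ_PER_NMOL


print(f"final UDP-Gal: {final_udp_gal_mm:.3f} mM ({udp_gal_nmol:.1f} nmol)")
# Hypothetical reading of 916 Bq in one product spot.
print(f"916 Bq corresponds to {bq_to_nmol(916.0):.1f} nmol")
```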

#### **LIPID ANALYSIS**

Total lipid was extracted according to Bligh and Dyer (1959). The polar membrane lipids were separated by two-dimensional silica gel thin-layer chromatography (Kobayashi et al., 2007). Separated lipids were then subjected to hydrolysis and methylation, and fatty acid methyl esters were quantified by gas chromatography using pentadecanoic acid as an internal standard (Kobayashi et al., 2006). Microsomal membranes were obtained by two centrifugation steps (supernatant at 3,000 × g and pellet at 125,000 × g) after homogenization of plant shoots and were resuspended in buffer (50 mM HEPES-KOH, pH 7.8). Microsomal membrane lipids were extracted by mixing with a 10-fold volume of chloroform:methanol (2:1, v/v), washed twice with the same volume of 0.45% NaCl, and concentrated in chloroform:methanol (2:1, v/v).

#### **PI MEASUREMENT**

Inorganic Pi was extracted separately from shoots and roots, and Pi content was measured using a phosphomolybdate colorimetric assay as described by Chiou et al. (2006). Samples were homogenized with extraction buffer (10 mM Tris, 1.0 mM EDTA, 100 mM NaCl, and 1.0 mM β-mercaptoethanol, pH 8.0). After centrifugation (12,000 × g) for 10 min, 100 μL of supernatant was mixed with 900 μL of 1% glacial acetic acid and incubated at 42 °C for 30 min. After centrifugation (120,000 × g) for 5 min, 300 μL of supernatant was mixed with 700 μL of assay solution [0.35% (w/v) NH4MoO4, 0.43 M H2SO4, and 1.4% (w/v) ascorbic acid] and then incubated at 42 °C for 30 min. The Pi content was measured at A820.
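The two mixing steps above dilute the original extract before the absorbance reading, so a Pi concentration read from a standard curve has to be scaled back accordingly. The sketch below covers only this dilution arithmetic. The standard curve itself is not given in the text, and the function names and example concentration are hypothetical.

```python
def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul is mixed with diluent_ul."""
    return (sample_ul + diluent_ul) / sample_ul


# Step 1: 100 uL supernatant + 900 uL of 1% glacial acetic acid.
acid_dilution = dilution_factor(100.0, 900.0)
# Step 2: 300 uL supernatant + 700 uL assay solution.
assay_dilution = dilution_factor(300.0, 700.0)
overall_dilution = acid_dilution * assay_dilution


def pi_in_extract(pi_in_cuvette_um):
    """Scale a Pi concentration read in the cuvette back to the extract."""
    return pi_in_cuvette_um * overall_dilution


print(f"overall dilution: {overall_dilution:.1f}-fold")
# Hypothetical cuvette concentration of 3 uM Pi.
print(f"extract concentration: {pi_in_extract(3.0):.0f} uM")
```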

#### **MEASUREMENT OF PHOTOSYNTHETIC ACTIVITY**

Chlorophyll fluorescence parameters were measured using a Dual-PAM system (Walz). The minimum chlorophyll fluorescence at the open PSII center (Fo) was detected by measuring light (655 nm) at an intensity of 0.05–0.15 μmol m<sup>−2</sup>·s<sup>−1</sup>. A saturating pulse of white light (800 ms) was applied to determine the maximum chlorophyll fluorescence at closed PSII centers in the dark (Fm) and during actinic light illumination (Fm′). Steady-state chlorophyll fluorescence (Fs) was recorded during actinic light illumination (80 μmol photons m<sup>−2</sup>·s<sup>−1</sup>), and Fo′ was recorded as the minimum chlorophyll fluorescence when actinic light was turned off. *Fv/Fm* was calculated as (Fm − Fo)/Fm. qP was calculated as (Fm′ − Fs)/(Fm′ − Fo′). NPQ was calculated as (Fm − Fm′)/Fm′. ΦII was calculated as (Fm′ − Fs)/Fm′.
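For reference, the four parameters can be computed from the five fluorescence levels as in the sketch below. It uses the standard definitions, in which the light-adapted maximum and minimum levels (Fm′ and Fo′, written `fm_prime` and `fo_prime`) enter qP, NPQ, and ΦII. The function name and the input values are hypothetical.

```python
def fluorescence_parameters(fo, fm, fs, fm_prime, fo_prime):
    """Return Fv/Fm, qP, NPQ and PhiII from PAM fluorescence levels.

    fo, fm: minimum and maximum fluorescence in the dark-adapted state.
    fs: steady-state fluorescence under actinic light.
    fm_prime, fo_prime: maximum and minimum fluorescence in the
    light-adapted state.
    """
    return {
        "Fv/Fm": (fm - fo) / fm,
        "qP": (fm_prime - fs) / (fm_prime - fo_prime),
        "NPQ": (fm - fm_prime) / fm_prime,
        "PhiII": (fm_prime - fs) / fm_prime,
    }


# Hypothetical fluorescence levels in arbitrary units.
params = fluorescence_parameters(fo=0.2, fm=1.0, fs=0.4,
                                 fm_prime=0.8, fo_prime=0.18)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```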

#### **MEASUREMENT OF CHLOROPHYLL CONTENT**

Total plant chlorophyll was extracted from homogenized plants using 80% (v/v) acetone. Samples were centrifuged at 12,000 × g at 4◦C for 5 min, and then the supernatant was used to measure the absorbance with a spectrophotometer (UV-1600, Shimadzu). Total chlorophyll was calculated using the following formula (Porra et al., 1989):

Total chlorophyll (nmol/mL) = 19.54 × (A<sub>646.8</sub> − A<sub>720</sub>) + 8.29 × (A<sub>663.2</sub> − A<sub>720</sub>).
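As a worked illustration, the formula can be applied to a set of absorbance readings as follows. The coefficients and wavelengths are those given above, the absorbance at 720 nm is subtracted as a baseline, and the function name and readings are hypothetical.

```python
def total_chlorophyll_nmol_per_ml(a646_8, a663_2, a720):
    """Total chlorophyll (nmol/mL) in 80% acetone from three absorbances."""
    return 19.54 * (a646_8 - a720) + 8.29 * (a663_2 - a720)


# Hypothetical absorbance readings, not data from the study.
chl = total_chlorophyll_nmol_per_ml(a646_8=0.30, a663_2=0.60, a720=0.01)
print(f"total chlorophyll: {chl:.2f} nmol/mL")
```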

#### **MEASUREMENT OF SUCROSE CONTENT**

Plant seedlings were cut into shoot and root parts and frozen in liquid nitrogen. Soluble sugars were extracted twice in 80% (v/v) ethanol at 80 °C for 10 min. Samples were centrifuged at 2,500 × g for 10 min and then dried under N2 gas. Glucose content was estimated by using the Glucose Colorimetric/Fluorometric Assay kit (BioVision). To calculate sucrose content, samples were incubated with 50% (v/v) Invertase Solution from Yeast (Wako) at 25 °C for 1 h, and the resultant glucose content was estimated as described above.
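The text does not spell out how sucrose was derived from the two glucose readings. A plausible reading, sketched below, is that sucrose equals the glucose measured after invertase treatment minus the free glucose measured before it, since complete hydrolysis of one sucrose releases one glucose and one fructose. This subtraction, the function name, and the example values are assumptions.

```python
def sucrose_from_glucose(glucose_before_nmol, glucose_after_nmol):
    """Sucrose (nmol) inferred from glucose released by invertase.

    Assumes complete hydrolysis, so each sucrose yields one glucose,
    and assumes the glucose assay does not respond to fructose.
    """
    released = glucose_after_nmol - glucose_before_nmol
    if released < 0:
        raise ValueError("glucose after invertase is lower than before")
    return released


# Hypothetical readings for one extract (not data from the study).
sucrose_nmol = sucrose_from_glucose(glucose_before_nmol=12.0,
                                    glucose_after_nmol=47.0)
print(f"sucrose: {sucrose_nmol:.1f} nmol")
```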

#### **MEASUREMENT OF SUCROSE UPTAKE**

Two-week-old plant seedlings grown on MS medium without sucrose or mannitol were used to estimate sucrose uptake as described by Lei et al. (2011). Briefly, after incubation in MS medium (pH 5.7) for 30 min, the roots were incubated for 2 h in MS medium containing 0.1% sucrose and [14C]sucrose (0.5 mCi·mL<sup>−1</sup>). After two washes with MS medium containing 1% sucrose, the 14C in each seedling was measured in a scintillation counter (LS6500, Beckman) and expressed as cpm·mg<sup>−1</sup> fresh weight. Each lipid fraction was subjected to thin-layer chromatography (solvent system, acetone:toluene:water = 136:45:13), and a radioactive intensity ratio was measured for each fraction using a fluoro-image analyzer (FLA-7000, Fujifilm).
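The two readouts described above, uptake per unit fresh weight and the share of label in each lipid class, reduce to simple ratios. A minimal sketch follows; the function names, counts, weights, and band intensities are hypothetical.

```python
def uptake_cpm_per_mg(cpm, fresh_weight_mg):
    """Radiolabel taken up by a seedling, normalized to fresh weight."""
    return cpm / fresh_weight_mg


def label_fraction(intensities):
    """Share of total lipid radioactivity found in each lipid class."""
    total = sum(intensities.values())
    return {lipid: value / total for lipid, value in intensities.items()}


# Hypothetical scintillation count and seedling weight.
uptake = uptake_cpm_per_mg(cpm=12000.0, fresh_weight_mg=8.0)
# Hypothetical band intensities (arbitrary units) from the image analyzer.
fractions = label_fraction({"MGDG": 40.0, "DGDG": 20.0, "PC": 40.0})
print(f"uptake: {uptake:.0f} cpm/mg fresh weight")
print({lipid: round(share, 2) for lipid, share in fractions.items()})
```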

# **ACKNOWLEDGMENTS**

This work was supported by a Grant-in-Aid for Scientific Research on Innovative Areas (No. 23119506, 25119708), the Global Center of Excellence Program, from the Earth to "Earths," at the Tokyo Institute of Technology and The University of Tokyo, and JST CREST from the Ministry of Education, Sports, Science and Culture in Japan.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 March 2014; accepted: 28 May 2014; published online: 23 June 2014.*

*Citation: Murakawa M, Shimojima M, Shimomura Y, Kobayashi K, Awai K and Ohta H (2014) Monogalactosyldiacylglycerol synthesis in the outer envelope membrane of chloroplasts is required for enhanced growth under sucrose supplementation. Front. Plant Sci. 5:280. doi: 10.3389/fpls.2014.00280*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Murakawa, Shimojima, Shimomura, Kobayashi, Awai and Ohta. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The selective biotin tagging and thermolysin proteolysis of chloroplast outer envelope proteins reveals information on protein topology and association into complexes

#### *Hélène Hardré<sup>1</sup>, Lauriane Kuhn<sup>2</sup>, Catherine Albrieux<sup>1</sup>, Juliette Jouhet<sup>1</sup>, Morgane Michaud<sup>1</sup>, Daphné Seigneurin-Berny<sup>1</sup>, Denis Falconet<sup>1</sup>, Maryse A. Block<sup>1</sup> and Eric Maréchal<sup>1</sup>\**

*<sup>1</sup> Laboratoire de Physiologie Cellulaire et Végétale, UMR 5168 CNRS-CEA-INRA-Université Grenoble Alpes, iRTSV, CEA Grenoble, Grenoble, France <sup>2</sup> Laboratoire de Biologie à Grande Echelle, iRTSV, CEA Grenoble, Grenoble, France*

#### *Edited by:*

*Kentaro Inoue, University of California at Davis, USA*

#### *Reviewed by:*

*Franck Anicet Ditengou, University of Freiburg, Germany Bettina Bölter, Ludwig-Maximilians-Universität München, Germany Agostinho Gomes Rocha, University of California at Davis, USA*

#### *\*Correspondence:*

*Eric Maréchal, Laboratoire de Physiologie Cellulaire et Végétale, UMR 5168 CNRS-CEA-INRA-Université Grenoble Alpes, iRTSV, CEA Grenoble, 17 rue des Martyrs, F-38054, Grenoble 09, France e-mail: eric.marechal@cea.fr*

The understanding of chloroplast function requires the precise localization of proteins in each of its sub-compartments. High-sensitivity mass spectrometry has allowed the inventory of proteins in thylakoid, stroma, and envelope fractions. Concerning membrane association, proteins can be either integral or peripheral or even soluble proteins bound transiently to a membrane complex. We sought a method providing information at the surface of the outer envelope membrane (OEM), based on specific tagging with biotin or proteolysis using thermolysin, a non-membrane permeable protease. To evaluate this method, envelope, thylakoid, and stroma proteins were separated by two-dimensional electrophoresis and analyzed by immunostaining and mass spectrometry. A short selection of proteins associated with the chloroplast envelope fraction was checked after superficial treatments of intact chloroplasts. We showed that this method could allow the characterization of OEM embedded proteins facing the cytosol, as well as peripheral and soluble proteins associated *via* tight or loose interactions. Some stromal proteins were associated with biotinylated spots, and analyses are still needed to determine whether polypeptides were tagged prior to import or if they co-migrated with OEM proteins. This method also suggests that some proteins associated with the inner envelope membrane (IEM) might need the integrity of a trans-envelope (IEM–OEM) protein complex (e.g., division ring-forming components) or at least an intact OEM partner. Following this evaluation, proteomic analyses should be refined and the putative role of inter-membrane space components stabilizing trans-envelope complexes demonstrated. For future comprehensive studies, perspectives include the dynamic analyses of OEM proteins and IEM–OEM complexes in various physiological contexts and using virtually any other purified membrane organelle.

**Keywords: chloroplast envelope, outer envelope protein, membrane protein complexes, biotinylation, thermolysin**

#### **INTRODUCTION**

The chloroplast is a specific organelle inside plant and algal cells, involved in numerous biochemical functions, including photosynthesis. This organelle derives from an ancestral cyanobacterium following an endosymbiotic event. Its membrane compartmentation is undoubtedly one of the most complex found in eukaryotic cells. In higher plants, chloroplasts are delineated by an envelope made of two membranes, the outer and the inner envelope membrane, OEM and IEM, respectively. In the stroma, additional membrane sacs, the thylakoids, provide an extensive surface for light capture and photosynthetic energy conversion. Liquid phase compartments comprise the intermembrane space between the OEM and IEM, the stroma, and the lumen of thylakoids. Understanding the mechanisms that orchestrate chloroplast biogenesis, differentiation, and function requires a fine sub-organellar localization of the chloroplast components. The characterization of cytosolic components associated with the envelope surface is further needed to understand the functional integration of the organelle within the rest of the cell.

The vast majority of plastid proteins are encoded by nuclear genes. Protein precursors are synthesized within the cytosol with an *N*-terminal chloroplastic transit peptide (Ctp). For major chloroplast precursors (small subunit of the Rubisco, the light harvesting complex β, subunits of the oxygen evolving complex and the ATPase γ -subunit), the Ctp is phosphorylated in the cytosol, stimulating the association of a hetero-oligomeric complex, the guidance complex, with a 14-3-3 dimer, a cytosolic heat shock protein HSP70/Com70, and possibly several other components (Pontier et al., 2007; Li and Chiu, 2010; Lee et al., 2013). The Ctp subsequently binds to the general import machinery, the TOC/TIC translocon, which directs precursors across the envelope membranes (Li and Chiu, 2010). After import, the Ctp is cleaved by a stromal processing peptidase (for review, Li and Chiu, 2010). This process is sufficient to address the majority of stromal and IEM proteins. Nevertheless, some Ctp-less precursors were shown to be addressed to the IEM (Tranel and Keegstra, 1996; Miras et al., 2002, 2007). With the noticeable exception of Toc75 harboring a bipartite transit peptide (Tranel et al., 1995; Tranel and Keegstra, 1996), most OEM proteins have no cleavable addressing signal and follow independent import systems (Li and Chiu, 2010). Thus, although some general processes appear to target the bulk of proteins inside chloroplast sub-compartments, none is universal, and even though Ctp features are useful for computational predictive methods, no bioinformatic tool allows a prediction of a protein targeting to the OEM with a high confidence level (Schleiff et al., 2003). Accuracy of predictions being imperfect, assessment of sub-organellar proteomes had to be determined experimentally.

Proteomic analyses have clarified this point with an unprecedented level of precision. The strategy consisted of purifying intact chloroplasts, fractionating sub-organellar compartments (envelope, stroma, thylakoids), resolving protein sub-fractions by 1-D or 2-D PAGE, and identifying individual proteins by mass spectrometry. This was performed first using spinach leaves as a convenient starting material yielding high amounts of pure chloroplasts and, with improvements in analytical sensitivity, using *Arabidopsis* leaves, benefiting from all the genomic information available for this model. A difficulty was that 2-D PAGE failed to recover trans-membrane proteins, and the yield of less hydrophobic proteins decreased with the amount of protein loaded. The main constraint limiting the analysis of integral proteins is that hydrophobic polypeptides cannot be resolved by isoelectric focusing (IEF) and electrophoresis, even under stringent denaturing conditions. To circumvent this problem, the most hydrophobic envelope proteins were selectively extracted using organic solvents. The resulting extract could be resolved by 1-D SDS-PAGE, and proteins identified by mass spectrometry methods (Seigneurin-Berny et al., 1999; Ferro et al., 2000; Rolland et al., 2003). Owing to this selective solubility fractionation, integral envelope proteins were inventoried for the first time and mislocalizations corrected. After validation of the subcellular localization of envelope protein markers by immunostaining and chloroplast visualization of GFP-protein fusions, 2-D PAGE-based proteomic studies have served as references for comprehensive and highly sensitive proteome characterization of envelope sub-compartments in *Arabidopsis* (Rolland et al., 2006; Salvi et al., 2008; Joyard et al., 2010), the results of which have been made accessible via the AT\_CHLORO database (Ferro et al., 2010) (http://www.grenoble.prabi.fr/at\_chloro/).

A major challenge is to gain access to topological information (what lies on the inner and outer side of a given membrane) and to understand how protein complexes are assembled. In the present paper, we proceeded stepwise. For the convenience of pure organelle pre-treatments, we used spinach chloroplasts as a model. We used conditions allowing the resolution of envelope-associated proteins by 2-D PAGE, a procedure also efficient for the resolution of thylakoid and stroma proteins. Because no gold standard has been defined for 2-D PAGE analysis of chloroplast envelope proteins, the quality of the 2-D profiles we obtained was assessed by immunostaining with antibodies raised against envelope, thylakoid, and stroma protein markers. Further protein identification was based on mass spectrometry analyses. Proteins exposed at the surface of chloroplasts were sought by complementary methods, i.e., selective superficial biotin-tagging and thermolysin-proteolysis. In addition to known envelope membrane proteins, our analyses of intact or tagged/shaved envelope membranes showed that *soluble* proteins were also detected at the periphery of the envelope, sometimes associated with stable trans-envelope complexes. From this characterization, several processes were shown to be structurally associated with the envelope: a channeling of stromal protein maturation, a dynamic assembly and structural stability of some stromal and trans-envelope complexes, and several important steps of stromal RNA editing. Our work therefore introduces a "tag and shave" strategy as a possible approach to characterize peripheral membrane proteins of a membrane-bound organelle, bringing topological clues concerning the sub-organellar localization of proteins and their possible involvement in large functional complexes connecting sub-compartments. This technical development was performed on spinach leaves so as to provide abundant organellar material, and future directions using more accurate plant models are discussed.

### **MATERIALS AND METHODS**

#### **ISOLATION OF PURIFIED INTACT SPINACH CHLOROPLAST AND PREPARATION OF SPINACH CHLOROPLAST SUBFRACTIONS**

All operations were carried out at 0–5 °C. Spinach leaves were obtained fresh from the market and kept overnight in the dark at 4 °C so as to reduce the starch content (Joyard et al., 1982). Crude chloroplasts were isolated from 3 kg of spinach (*Spinacia oleracea* L.) leaves. Envelope, stroma, and thylakoid subfractions from the chloroplasts were purified as described previously (Joyard et al., 1982). All manipulations were performed at 4 °C. In brief, deveined spinach leaves were homogenized in 2 L of sucrose 0.33 M, Na-pyrophosphate 30 mM, bovine serum albumin 1 g·L<sup>−1</sup>, pH 7.8, for 2 s in a 4-L Waring Blendor, and a crude chloroplast pellet was obtained from the leaf homogenate. The pellet was washed in sucrose 0.33 M, MOPS 10 mM, pH 7.8. To avoid contamination by other membrane organelles and swollen thylakoid membranes, the chloroplast preparation was purified further by isopycnic centrifugation on a Percoll (Pharmacia) gradient (40% Percoll, 50 mL and 80% Percoll, 20 mL in washing buffer; 5000 *g*, 20 min). Intact chloroplasts were collected at the interface between the 40 and 80% Percoll cushions. At this stage, thermolysin treatment or biotin tagging was performed as described below. Envelope, thylakoids, and stroma were prepared from purified, intact chloroplasts after swelling in a hypotonic medium (MOPS 10 mM, MgCl2 4 mM, EDTA 5 mM, pH 7.8 in presence of protease inhibitors) followed by ultracentrifugation through a step sucrose gradient (sucrose 0.6 M, 10 mL and 0.93 M, 12 mL in MOPS 10 mM, MgCl2 4 mM, EDTA 5 mM, pH 7.8 in presence of protease inhibitors; 72,000 *g*, 1 h). The swelling medium as well as the different sucrose layers contained the following protease inhibitors: EDTA, 5 mM; phenylmethylsulfonyl fluoride, 1 mM; ε-aminocaproic acid, 5 mM; and benzamidine-HCl, 1 mM. The yield of envelope membranes was 2–3 mg of protein/kg of spinach leaves.

#### **PROTEASE TREATMENT OF ISOLATED INTACT SPINACH CHLOROPLAST**

To study polypeptides localized on the external face of the OEM, intact spinach chloroplasts were treated with thermolysin from *Bacillus thermoproteolyticus* (Boehringer Mannheim, Germany). Protease treatments were carried out on ice, under light conditions and using intact chloroplasts at 1 mg·mL<sup>−1</sup> of chlorophyll in buffer T containing 100 µM thermolysin, 0.33 M saccharose, 20 mM MOPS pH 7.8, 1 mM CaCl2. The reaction was terminated after 1 h with EGTA (10 mM). Treated chloroplasts were layered on a Percoll gradient as described above, in presence of EDTA (10 mM), and centrifuged at 5000 *g* for 20 min to obtain intact plastids. Intact treated chloroplasts were used to purify envelope, stroma, and thylakoid sub-fractions as described above (Joyard et al., 1982), in presence of EDTA (5 mM) and of a cocktail of protease inhibitors. Samples that were not treated with protease (mock) went through the same procedure except that buffer T contained no thermolysin.

#### **BIOTINYLATION OF ISOLATED INTACT SPINACH CHLOROPLAST**

To study polypeptides of the OEM, intact spinach chloroplasts were superficially labeled with the hydrophobic biotinylation reagent 6-((6-((biotinoyl)amino)hexanoyl)amino)hexanoic acid, succinimidyl ester (biotin-XX,SE; Molecular Probes). The biotinylation reaction was performed for 15 min at 4 °C. Chlorophyll content of intact chloroplasts was measured as described (Arnon, 1949). Chloroplast labeling reactions were carried out on ice using intact chloroplasts at 5 mg·mL<sup>−1</sup> of chlorophyll in buffer B containing 20 µM of biotin-XX,SE, 0.33 M saccharose, 50 mM sodium bicarbonate pH 8.3, 0.1% (v/v) dimethyl sulfoxide (DMSO). The biotinylation reaction was terminated after 15 min with hydroxylamine 5 mM. Biotinylated chloroplasts were layered on a Percoll gradient in 0.33 M saccharose, 50 mM MOPS pH 7.8, and centrifuged at 5000 *g* for 20 min to obtain intact plastids as described above. Intact treated chloroplasts were suspended in 0.33 M saccharose, 50 mM MOPS pH 7.8, in presence of EDTA (5 mM) and protease inhibitors, and envelope, stroma, and thylakoid sub-fractions were purified as described above. Samples that were not biotinylated (mock) went through the same procedure except that buffer B contained no biotin-XX,SE.

#### **ONE-DIMENSIONAL POLYACRYLAMIDE GEL ELECTROPHORESIS (1-D SDS-PAGE)**

Proteins prepared from intact or biotinylated spinach chloroplast sub-fractions (20µg proteins, determined using the BCA protein assay kit, BioVision, and bovine serum albumin as a standard) were separated by SDS-PAGE (11% polyacrylamide gel) according to standard procedures. Separated proteins were either electro-transferred to nitrocellulose membranes or stained with Coomassie brilliant blue G-250.

#### **TWO-DIMENSIONAL POLYACRYLAMIDE GEL ELECTROPHORESIS (2-D PAGE)**

All reagents and materials were obtained from Bio-Rad unless indicated. Polypeptides of the chloroplast sub-fractions (stroma, thylakoid, and envelope membranes) were analyzed by 2-D PAGE. Each sub-fraction (200µg proteins) was solubilized in 250µl of a rehydration buffer containing 8 M urea, 2 M thiourea, 4% (w/v) CHAPS, 100 mM dithiothreitol (DTT), and 0.2% (v/v) Bio-lytes (0.1% of pH 4–6 + 0.1% of pH 5–7 or 0.2% of pH 3–10). After a 30-min incubation at 20 °C, the solubilized proteins were used for passive hydration of linear immobilized pH gradient (IPG) strips (7 cm; pH 4–7 or pH 3–10). The 7 cm IPG strips were then subjected to the following IEF program using a Bio-Rad Protean IEF System: constant voltage at 50 V for 4 h; constant voltage at 250 V for 2 h; linear increase from 250 to 4000 V over 9 h; and constant voltage set at 4000 V for a total of 25 kV·h. The current was limited (50µA per strip), and the running temperature was set at 20 °C. The strips were stored at −20 °C until used for the second dimension. To solubilize proteins focused during the first-dimension run, IPG strips were equilibrated for 20 min in equilibration buffer 1 containing 6 M urea, 20% (w/v) glycerol, 2% (w/v) SDS, 375 mM Tris-HCl, pH 8.8, 130 mM DTT, and for 40 min in equilibration buffer 2 containing 6 M urea, 20% (w/v) glycerol, 2% (w/v) SDS, 375 mM Tris-HCl, pH 8.8, 135 mM iodoacetamide. After equilibration, IPG strips were loaded on top of a 13% acrylamide gel and fixed with molten agarose solution to ensure good contact between gel and strip. Twelve gels were cast under identical conditions within multi-casting chambers. A Bio-Rad Dodeca cell was used to ensure that gels were run under the same electrical conditions. Electrophoresis was performed at 20 °C in the following buffer: 25 mM Tris-HCl, 192 mM glycine and 2% (v/v) SDS for 1 h at 25 V followed by 2 h at 100 V.
Separated proteins were either electro-transferred to nitrocellulose membranes or stained with Coomassie brilliant blue G-250.
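The IEF program above can be checked with simple volt-hour bookkeeping. The sketch below is only illustrative: it assumes that the 25 kV·h target is the cumulative total over all steps (the text could also be read as applying to the final step alone), and the function names are hypothetical.

```python
# Volt-hour bookkeeping for a stepped IEF program (illustrative sketch).
# Assumption: 25 kV·h is the cumulative target over the whole program.

def hold(voltage_v, hours):
    """Volt-hours accumulated at constant voltage."""
    return voltage_v * hours

def linear_ramp(start_v, end_v, hours):
    """Volt-hours accumulated during a linear voltage ramp (trapezoid area)."""
    return 0.5 * (start_v + end_v) * hours

def final_hold_hours(target_vh, accumulated_vh, voltage_v):
    """Hours needed at the final voltage to reach the volt-hour target."""
    remaining = max(target_vh - accumulated_vh, 0.0)
    return remaining / voltage_v

steps_vh = [
    hold(50, 4),                # 50 V for 4 h
    hold(250, 2),               # 250 V for 2 h
    linear_ramp(250, 4000, 9),  # 250 -> 4000 V over 9 h
]
accumulated = sum(steps_vh)
extra_hours = final_hold_hours(25_000, accumulated, 4000)

print(f"accumulated before final hold: {accumulated:.0f} V·h")
print(f"final hold at 4000 V: {extra_hours:.2f} h")
```

Under this reading, most of the volt-hours are delivered during the ramp, and the final 4000 V step lasts a little over one hour.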

#### **IMMUNOBLOTTING STUDIES OF PROTEINS**

Electro-transfer of proteins separated by SDS-PAGE or 2-D PAGE on nitrocellulose membranes (Hybond-ECL, Amersham, Pharmacia Biotech) was carried out using standard procedures. Western blot analyses were performed using 300 mM NaCl, 10 mM Tris-HCl, pH 7.5, containing non-fat dried milk (50 g·L<sup>−1</sup>) as blocking buffer. All antibodies were from rabbit sera. Polyclonal antibodies raised against spinach OEP10 and OEP24 [anti-OEP10 and anti-OEP24, (Joyard et al., 1982)], which are specific markers of the OEM, spinach IEP37 [anti-IEP37, (Joyard et al., 1982)], a major IEM polypeptide, spinach MGDG synthase 1 [anti-MGD1, (Awai et al., 2001)], a minor IEM polypeptide, and polyclonal antibodies against the recombinant *Arabidopsis* protein ceQORH [anti-ceQORH, (Miras et al., 2002)], associated with the IEM, were used to analyze envelope fractions. Polyclonal antibodies raised against ketol-acid reductoisomerase [anti-KARI from spinach, (Pontier et al., 2007)], a major polypeptide from the stroma, and the α, β, and γ subunits of the ATP synthase coupling factor 1 [anti-CF1 from spinach, (Pontier et al., 2007)], a major complex from thylakoids, were used to analyze stroma and thylakoid fractions, respectively. Immune complexes were detected using horseradish-peroxidase-conjugated anti-rabbit IgGs and chemiluminescence visualization (ECL, Amersham Bioscience).

#### **DETECTION OF BIOTINYLATED PROTEINS**

After electro-transfer of proteins separated by SDS-PAGE or 2-D PAGE on nitrocellulose, membranes were incubated overnight in 300 mM NaCl, 30 g·L<sup>−1</sup> bovine serum albumin, 10 mM Tris-HCl, pH 7.5. Biotinylated proteins were detected on the blots after reaction with horseradish-peroxidase-conjugated streptavidin (Strep-HRP) and chemiluminescence visualization (ECL, Amersham Bioscience) according to the manufacturer's instructions.

#### **MASS SPECTROMETRY AND PROTEIN IDENTIFICATION**

After separation by 2-D PAGE, discrete spots were detected based on Coomassie blue staining and excised from the gel. Correspondence between Coomassie blue-stained and chemiluminescent spots was determined based on comparisons using the ImageJ software (NIH). Relative quantities of proteins were assessed based on the staining intensity. Since Coomassie staining is not linearly correlated with absolute quantities of proteins, analyses were based on relative intensities when comparing treated and untreated gels, allowing only the most striking differences to be measured. An in-gel digestion was carried out as described (Ferro et al., 2000). Gel pieces were extracted with 5% [v/v] formic acid solution and acetonitrile from a gel corresponding to an untreated or treated sample. For this evaluation study, only one gel per condition was analyzed by mass spectrometry. Extracted peptides were desalted using C18-Zip Tips (Millipore). Elution of peptides was performed with 5–10µl of a 50:50:0.1 (vol/vol) acetonitrile/H2O/formic acid solution. The tryptic peptide solution was introduced into a glass capillary (Protana, Odense, Denmark) for nanoelectrospray ionization. Tryptic peptide mass fingerprints were first assessed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS) analyses as described (Journet et al., 2000). Tandem mass spectrometry experiments were carried out on a Q-TOF hybrid mass spectrometer (Micromass). Interpretation of MS/MS spectra was achieved manually and with the help of the PEPSEQ program (MassLynx software, Micromass, Manchester, UK). MS/MS sequence information was used for database searching using the BLASTCOMP program (Ferro et al., 2002), which performs BLAST searches for each amino acid sequence and clusters amino acid sequences identified from common BLAST hits. BLASTP and TBLASTN were used to mine plant protein and genomic databases, respectively.
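To illustrate how a peptide mass fingerprint relates to a protein sequence, the sketch below performs a simplified in silico tryptic digestion. It is not the software used in this study: it assumes the common rule that trypsin cleaves after K or R unless the next residue is P, and the example sequence is made up for illustration.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return tryptic peptides, cleaving after K/R unless followed by P.

    Simplified rule for illustration only; peptides with up to
    `missed_cleavages` uncleaved sites are also returned.
    """
    seq = sequence.upper()
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # C-terminal peptide

    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(fragments):
                peptides.append("".join(fragments[i:i + extra + 1]))
    return peptides

# Hypothetical sequence, not taken from the study.
example = "AKRPGKLL"
print(tryptic_digest(example))
print(tryptic_digest(example, missed_cleavages=1))
```

The list of predicted peptides is what a fingerprint search compares, by mass, against the peaks observed by MALDI-TOF.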

# **RESULTS**

### **SELECTIVE SUPERFICIAL BIOTIN-TAGGING AND THERMOLYSIN-PROTEOLYSIS OF ISOLATED CHLOROPLASTS**

Superficial proteins from chloroplasts are potentially sensitive to non-permeable tagging or proteolysis. We sought therefore to "shave" or "tag" the surface of pure and intact chloroplasts isolated from spinach leaves.

"Shaving" was the easier approach, since treatment with the non-permeable protease thermolysin is a well-established method to digest proteins accessible at the outer surface of the OEM (Dorne et al., 1982; Joyard et al., 1983). Based on previous work, thermolysin therefore appears to be a protease of choice, which cannot access the IEM or stroma. After proteolytic digestion of surface proteins, intact chloroplasts were re-isolated on a Percoll cushion, and chloroplast sub-compartments (envelope, stroma, thylakoids) were fractionated.

Superficial protein tagging with a biotinyl group was described for *Escherichia coli* (Bradburne et al., 1993) and *Helicobacter pylori* (Sabarth et al., 2002). For our purpose, we used the biotinylation reagent 6-((6-((biotinoyl)amino)hexanoyl)amino)hexanoic acid, succinimidyl ester (biotin-XX,SE). To prevent passive diffusion of the reagent through the OEM porins, (i) biotin-XX,SE (Mr 568) was selected for its relative hydrophobicity, (ii) the biotinylation reaction was short (15 min), and (iii) the temperature was kept low (4 °C). By this means, no biotinylation of IEM markers was detected. The succinimidyl ester reacts covalently with primary amino groups, i.e., the accessible *N*-termini and the ε-amino groups of lysyl residues. After biotinylation, intact chloroplasts were re-isolated on a Percoll gradient, and chloroplast sub-compartments (envelope, stroma, thylakoids) were fractionated. To control the efficiency of the biotinylation, after electrophoresis and transfer to nitrocellulose membranes, tagged proteins were visualized after reaction with streptavidin conjugated to horseradish peroxidase and chemiluminescence detection. **Figure 1A** shows a 1-D SDS-PAGE analysis of 20µg proteins from envelope, thylakoid and stroma fractions obtained from chloroplasts treated in the absence (left) or presence (right) of biotin-XX,SE. **Figure 1B** shows the biotin detection in each fraction following streptavidin reaction. Antibodies raised against ketol-acid reductoisomerase (anti-KARI), a major polypeptide from spinach chloroplast stroma, and the α, β, and γ subunits of the ATP synthase coupling factor 1 (anti-CF1), a major protein from thylakoids, were used to analyze stroma and thylakoid fractions, respectively (**Figure 1C**). The α and β subunits were detected in both the envelope and stromal fractions, with a relative enrichment of the β subunit in the envelope and of the α subunit in the stroma (**Figure 1C**).
As expected, naturally biotinylated stromal proteins could be detected (**Figure 1B**, control). In **Figure 1B** a black arrow indicates the band migrating at the molecular weight of biotin carboxyl carrier protein (BCCP), a subunit of the acetyl-CoA carboxylase complex (Alban et al., 1995; Elborough et al., 1996). Additional naturally biotinylated plastid proteins (Elborough et al., 1996) were also visualized, including a major band migrating at a molecular weight of ∼50 kDa (**Figure 1B**, control) that might correspond to the unknown 50 kDa biotin-binding protein observed in rapeseed by Elborough et al. (1996). Only a few biotin-containing proteins have been characterized in plants, either involved in the catalysis of carboxylation reactions or containing a non-catalytic biotin (Nikolau et al., 2003), including a geranoyl-CoA carboxylase protein localized in the plastid in maize, whose gene sequence remains to be determined. We did not characterize the streptavidin-binding proteins we observed in the stroma of spinach chloroplasts and do not propose any tentative identification for the corresponding bands. These results confirm that in our conditions, after treatment of isolated chloroplasts with the biotinylating reagent, the envelope fraction was the predominant compartment to be differentially tagged (**Figure 1B**, +20µM biot).

Envelope proteins of untreated chloroplasts or after treatment with biotin-XX,SE ("tagged" chloroplasts) or with thermolysin ("shaved" chloroplasts) were subjected to 2-D PAGE (see below). Surface proteins might exhibit primary amino groups possibly tagged with biotin. They might also exhibit

**FIGURE 1 | Specific chloroplast envelope biotinylation.** Proteins prepared from intact (control) or biotinylated (20µM Biot) spinach chloroplast sub-fractions (20µg) were separated by 1D-PAGE. Polypeptides were either stained with Coomassie brilliant blue G-250 **(A)** or electro-transferred to nitrocellulose membranes **(B,C)**. **(B)** Biotin was detected after reaction with streptavidin coupled to horseradish peroxidase (HRP). Naturally biotinylated proteins from the stroma are indicated in control conditions (black and white arrows, see text). **(C)** Western blot detection of the CF1 ATPase α, β, and γ subunits from thylakoids (anti-CF1 1/10,000) and of the ketol-acid reductoisomerase from the stroma (anti-KARI 1/5000). E, envelope; T, thylakoids; S, stroma.

leucine or phenylalanine residues possibly cleaved by thermolysin. Alternatively, some surface proteins may not exhibit any such residues and may not be altered by "tag" or "shave" treatments. Thus, **Table 1** summarizes the simplest surface prediction of a protein



*(?) In the absence of biotinylation or sensitivity to thermolysin, although the polypeptide is probably not exposed at the surface of the plastid envelope, one cannot exclude a localization at the OEM.*

according to its sensitivity to the "tag and shave" treatments. A single biotinylation induces a molecular weight increase of ∼0.45 kDa and no charge modification. As a result, the 2-D PAGE pattern of biotinylated proteins should be nearly indistinguishable from that of the corresponding non-biotinylated proteins.
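The size of the shift expected per biotin-XX tag can be estimated from the reagent mass. The sketch below assumes that acylation of a primary amine releases N-hydroxysuccinimide (C4H5NO3, about 115.09 Da), so that the mass added to the protein is the reagent mass minus that leaving group; the reagent Mr of 568 is the value quoted in the text, and the 24 kDa example protein is only illustrative.

```python
# Estimated mass added to a protein per biotin-XX tag (illustrative sketch).
REAGENT_MR = 568.0          # biotin-XX,SE, value quoted in the text
NHS_LEAVING_GROUP = 115.09  # N-hydroxysuccinimide, assumed leaving group

def biotin_xx_mass_shift(n_labels=1):
    """Mass (Da) added to a protein carrying n_labels biotin-XX groups."""
    return n_labels * (REAGENT_MR - NHS_LEAVING_GROUP)

def relative_shift(protein_kda, n_labels=1):
    """Fractional mass increase for a protein of the given size."""
    return biotin_xx_mass_shift(n_labels) / (protein_kda * 1000.0)

shift = biotin_xx_mass_shift()
print(f"shift per tag: {shift:.1f} Da")
print(f"relative shift for a 24 kDa protein: {relative_shift(24):.1%}")
```

A shift of a few hundred daltons on a protein of tens of kilodaltons is below what a 13% SDS gel would be expected to resolve, which is why tagged and untagged spots can be overlaid.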

#### **TWO-DIMENSIONAL ELECTROPHORESIS OF CHLOROPLAST SUBFRACTIONS**

We performed 2-D PAGE analyses of chloroplast sub-fractions based on methods previously developed to analyze thylakoid peripheral and lumenal proteins (Peltier et al., 2000; Schubert et al., 2002). We attempted to adapt the 2-D PAGE procedure to envelope samples in order to limit, as much as possible, the poor recovery of integral proteins. Protein samples were loaded into the first-dimension gel by rehydration of dried IEF strips (passive hydration of linear IPG strips). Rehydration of the IEF gel is usually carried out under an upper layer of mineral oil that prevents water evaporation. With this technique, we noticed that after gel loading, Coomassie staining of the rehydrated IEF strip detected little protein, whereas substantial amounts of proteins were found in the mineral oil, with a 1-D SDS-PAGE pattern close to that of the chloroplast envelope (not shown). In addition to precipitation during IEF, membrane-associated proteins were therefore also lost by partition into the mineral oil before IEF. The 2-D PAGE resolutions reported here were therefore achieved after a 3-h passive hydration of linear IPG strips, without a mineral oil overlay, prior to IEF and second-dimension SDS-PAGE. **Figure 2** shows the envelope protein 2-D PAGE resolution after IEF on a 4–7 pH gradient. About 300 spots were visible after Coomassie staining, out of which 85 were circled for further gel comparisons. Some spots, such as 22, 23, and 24, trivially correspond to the Rubisco LSU. In the results shown below, numbered spots in the vicinity of an immunostained, tagged, or shaved spot are indicated.

We then analyzed protein samples obtained after superficial tagging with biotin. **Figure 3** shows the comparative resolution of proteins from envelope, stroma and thylakoids by 2-D PAGE after IEF on a 4–7 pH gradient. Protein patterns were accurately focused for further mass spectrometry analyses. Biotinylation of envelope proteins could be globally visualized after 2-D PAGE (**Figure 3B**, left). Antibodies raised against the stromal ketol-acid reductoisomerase (anti-KARI), and the thylakoid ATP synthase coupling factor 1 (anti-CF1, α, β, and γ-subunits),

excised, digested, and analyzed by MALDI-ToF to yield a peptide mass fingerprint for database searching. Numbers indicate spot numbers. Apparent molecular weights (Mr) in kDa are indicated on the left.

consistently reacted with polypeptides from the stroma and thylakoid fractions, respectively (**Figure 3C**). Whereas immuno-cross-reactions were detected after 1-D PAGE of envelope proteins (see **Figure 1C**), immunostaining with anti-KARI or anti-CF1 was not detected after 2-D PAGE of envelope proteins (not shown). Similarly, the naturally biotinylated stromal proteins detected after 1-D PAGE (**Figure 1B**) were not detected after 2-D PAGE (**Figure 3B**, center). The chemiluminescence detection sensitivity for immunolabeled or tagged proteins is therefore lower after 2-D PAGE, probably reflecting the lower yield of 2-D PAGE compared to 1-D PAGE, with a loss of part of the hydrophobic proteins, a phenomenon we tried to control but which should nevertheless be considered when analyzing results. An interesting consequence of this feature is that the immunostaining of a polypeptide resolved by 2-D PAGE therefore indicates a strong immunogenic reaction.

#### **IMMUNOSTAINING-BASED PROTEOMIC ASSESSMENT**

**Figure 4** shows the immunostaining-based identification of a set of envelope protein markers resolved by 2-D PAGE. Proteins were extracted from chloroplasts treated in the absence or presence of thermolysin, prior to sub-organellar fractionation (**Figure 4A**). Antibodies raised against OEP24, an OEM integral protein, IEP37, an IEM quinone methyl transferase, and MGD1, the IEM MGDG synthase, reacted positively with envelope polypeptides (**Figures 4B–D**, left panels). The IEF focusing of IEP37 was broad, a feature we could not explain. The main immunostaining was focused at the level of spot 26 (**Figure 4C**), in the acidic part of the gradient, whereas the predicted IEP37 pI is 9.2 (Teyssier et al., 1996). Immunostaining at the level of spot 56, and to a lesser extent at the level of spot 57, possibly corresponds to cross-reactions of the antibody with minor proteins migrating at a slightly lower molecular weight than IEP37, with a higher pI and sensitive to thermolysin treatment. Immunostaining of spinach chloroplast envelope proteins with the anti-IEP37 antibody can, in some cases, detect a broad band in 1D-PAGE (**Figure 4H**), possibly corresponding to minor variations of the IEP37 sequence in the spinach leaf samples we collected for our experiments. Two additional antibodies, raised against OEP10 and FtsZ2, associated with the OEM and IEM, respectively, also decorated envelope polypeptides (**Table 2** and **Figure 4E**). Although OEP10 size is 6.7 kDa, the migration of this hydrophobic polypeptide containing one transmembrane segment in our 2-D PAGE was at the apparent molecular weight of 110, as assessed by both immunostaining and mass spectrometry determination (**Table 2**). As expected from their known localization in the OEM or IEM (**Figure 4G**), OEP24 and OEP10 were "shaved" away by the thermolysin treatment, whereas IEP37 and MGD1 were still detected in thermolysin-treated samples (**Figures 4B–D**, right panels).
These analyses therefore provide a control for the accuracy of the thermolysin treatment. Interestingly, FtsZ2 immunostaining was not observed in envelope membranes purified from thermolysin-treated chloroplasts, a phenomenon reported earlier (El-Kafafi et al., 2005) and consistent with: (i) the association of this peripheral protein of the IEM with a trans-envelope complex and (ii) the dependence of this association on the integrity of some of its OEM components.

We investigated whether the 2-D PAGE conditions we set up could accurately resolve basic membrane proteins, which are known to be particularly difficult to analyze by this technique. In pH 3–10 immobilized gradient conditions, polypeptides are detected after Coomassie staining in basic parts of the IEF pH gradient (**Figure 4F**). The ceQORH polypeptide, a basic IEM protein identified after organic solvent partition and 1-D SDS-PAGE (Ferro et al., 2000; Miras et al., 2002), was immunodetected (**Figure 4G**). Together these data validate the quality of the 2-D PAGE conditions we used, based on the presence of envelope protein markers, consistently sensitive to the thermolysin superficial proteolysis, over a wide range of the IEF pH gradient.

#### **MASS SPECTROMETRY-BASED PROTEOMIC ASSESSMENT**

Additional proteins were identified after mass spectrometry analyses. For this preliminary evaluation, we restricted our analysis to about 30 spots, for which a major protein could be identified following analysis. Indeed, although spinach chloroplast was the ideal starting material for this technical evaluation, the lack of genomic information on spinach clearly limited our ability to identify proteins by mass spectrometry at a large scale and with a high resolution. Numerous spots contained multiple proteins whose relative abundance could not be inferred; these spots were not analyzed further. After discrete excision of spots, polypeptides were digested by trypsin inside the polyacrylamide gel and tryptic fragments were subjected to MALDI-ToF to yield a peptide mass fingerprint for database searching. Sequence identification was further confirmed by MS/MS analyses of tryptic peptides. **Table 2** gives the list of proteins we assessed either by immunostaining or mass spectrometry analyses, with UniProt references of spinach proteins or corresponding homologs in *Arabidopsis* or pea in the absence of previously sequenced genes from spinach. Identified proteins include envelope membrane

proteins and expectedly soluble proteins, either in the cytosol or chloroplast stroma. Stromal subunits of the CF-1 ATP-synthase, i.e., α, β, and γ subunits were further inventoried. Among soluble cytosolic and stromal proteins, 11 were previously known as envelope associated (Tranel et al., 1995; Rolland et al., 2003; El-Kafafi et al., 2005; Pontier et al., 2007; Ferro et al., 2010).

#### **SELECTIVE BIOTINYLATION AND THERMOLYSIN-PROTEOLYSIS OF ENVELOPE PERIPHERAL PROTEINS**

**Table 2** collects information obtained from tag and shave experiments, including proteins unaffected by either of these treatments. This table includes proteins localized in the OEM or in other compartments of the chloroplast. It should not be considered as a comprehensive list, but rather as an evaluation of the tag and shave approach using the limited technique of 2D-PAGE. We noticed a few large spots (1, 23, 24) where the differential tagging or shaving was not homogeneous, indicating that in these regions of the 2-D PAGE, more than one polypeptide was resolved in an apparently single spot. **Figure 5** gives an example of a magnified area illustrating that an outer envelope protein, OEP24 (spot 73), was assessed by immunostaining (**Figure 5B**, solid arrow), was tagged (**Figure 5D**) and shaved (**Figure 5C**). As listed in **Table 1**, other patterns of differential tag or shave are noticed, such as spot 69 being neither tagged nor shaved (**Figure 5**, white arrow), as expected for a polypeptide that does not protrude at the surface of the chloroplast. In **Table 2**, cytosolic and superficial proteins are consistently tagged and/or shaved.
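The decision logic behind the tag and shave readout can be written down compactly. The sketch below is a reconstruction from the text, not a copy of Table 1 (which is not reproduced here): it assumes that biotin tagging reports accessible primary amines, that thermolysin shaving reports accessible cleavage sites, and that a negative result for both does not strictly exclude an OEM localization. The function name and the wording of the outcomes are hypothetical.

```python
def surface_prediction(tagged: bool, shaved: bool) -> str:
    """Simplest surface prediction from tag (biotin) and shave (thermolysin)."""
    if tagged and shaved:
        return "surface-exposed (accessible amines and thermolysin sites)"
    if tagged:
        return "surface-exposed (accessible amines only)"
    if shaved:
        return "surface-exposed (thermolysin sites only)"
    # Neither treatment affects the spot: probably internal, but an OEM
    # protein lacking accessible residues cannot be excluded.
    return "probably not surface-exposed; OEM localization not excluded"

# The two examples discussed for Figure 5.
print("spot 73 (OEP24):", surface_prediction(tagged=True, shaved=True))
print("spot 69:", surface_prediction(tagged=False, shaved=False))
```

This simple reading treats the two treatments as independent probes of exposure; it does not capture indirect effects, such as a protein being lost from the envelope fraction because a partner was cleaved.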

Surprisingly, although spots 2–4 (stromal protease ClpC) and spot 25 (translocon Tic40) are untagged by biotin, consistent with their association with the IEM, they disappear from the 2-D PAGE map of envelope prepared from thermolysin-treated chloroplasts. It has been previously shown that, after treatment with thermolysin or trypsin, Tic40 was not degraded in whole protein extracts from chloroplasts (Chou et al., 2003; Ko et al., 2005). The disappearance of spot 25 (Tic40) following thermolysin treatment is therefore puzzling. This phenomenon

#### **FIGURE 4 | Continued**

membrane, anti-OEP24 **(B)** or the inner envelope membrane, anti-IEP37 **(C)**, anti-MGD1 **(D)** and anti-FtsZ2 **(E)**. **(F,G)** Spinach chloroplast envelope proteins (200µg) were passively loaded into 7 cm pH 3–10 IPG strips. Second-dimension separation was in 13% polyacrylamide gels. Proteins were either stained with Coomassie blue **(F)** or transferred to nitrocellulose membrane for western-blotting using

could not be attributed to excessive proteolysis, based on IEM protein markers such as IEP37 and MGD1, which were still detected following thermolysin incubation. In pea (*Pisum sativum*), Tic40 has an apparent molecular weight of 44 kDa and has been previously detected in both IEM and OEM fractions (Ko et al., 1995), indicating a possible cohesive association with some OEM components; after incubation of pea chloroplasts with thermolysin and analysis of envelope proteins, the treatment gave rise to a form with an apparent molecular weight of 42 kDa (Ko et al., 1995), which might explain why, in our study, Tic40 might migrate to a position that differs from spot 25 of the 2-D PAGE. The cytosolic precursor of *Arabidopsis thaliana* Tic40 was also shown to be imported into chloroplasts in two steps, first as a soluble intermediate form with an apparent molecular weight of 44 kDa, and then as an IEM-associated form with an apparent molecular weight of 40.8 kDa (Li and Schnell, 2006). It is therefore also possible that the thermolysin treatment affects the balance between an IEM-associated form and a soluble form of Tic40. Future work might help to explain the difference we observed. The parallel results observed for spots 2–4 (ClpC) and spot 25 (Tic40) are consistent with the previously reported co-immunoprecipitation of both Tic40 and ClpC with actin (Jouhet and Gray, 2009a,b; Franssen et al., 2011).

As mentioned above, a similar differential pattern was observed for FtsZ2 (**Figure 4E**). In the three cases we point out here, the resolved spots match the mature Tic40, ClpC, or FtsZ2 proteins rather than their cytosolic precursors. These internal proteins are therefore inaccessible to thermolysin and their disappearance from the 2-D PAGE cannot be due to direct proteolysis. A possible common scenario could therefore be a disassembly from the IEM. Concerning FtsZ2, this division-ring component is indeed mostly present in the stroma of chloroplasts but is also associated with the envelope as part of a trans-envelope complex that protrudes on the cytosolic side of the envelope (El-Kafafi et al., 2005; Falconet, 2012). Based on these three examples, which should nevertheless be confirmed by detailed analyses, the tag and shave approach could, in addition to detecting OEM proteins, also highlight IEM proteins whose association with the IEM is strictly dependent on complexes protruding at the outer surface and is deeply destabilized when OEM components are cleaved by thermolysin.

## **DISCUSSION**

#### **TOPOLOGICAL AND STRUCTURAL INFORMATION BROUGHT BY THE "TAG AND SHAVE" STRATEGY AND 2-D PAGE BASED PROTEOMIC ANALYSIS**

polyclonal antibodies against the recombinant *Arabidopsis* protein ceQORH **(G)**. **(H)** Standard SDS-PAGE analysis of spinach chloroplast envelope proteins used in the present study and western blotting using anti-IEP37, anti-OEP24, and anti-FtsZ2 polyclonal antibodies. **(I)** Schematic representation of protein markers. OEM, outer envelope membrane; IEM, inner envelope membrane. Apparent molecular weights (Mr) in kDa are indicated on the left.

It is still not known if an ideal electrophoretic technique would provide the exhaustive separation of both integral and peripheral envelope membrane-associated proteins, but the present study explores an optimized technique that might help completing the

inventory initiated by 1-D PAGE based proteomic analyses. The proteins we identified after 2-D PAGE resolution were mainly peripheral or soluble. The confidence in results, concerning particularly the possible cross-contaminations, depends strongly on the purity of the treated material. For technical reasons, we used spinach chloroplasts as a working model, because of the high amount of starting material and the possibility to repeat experiments, but it is clearly not the ideal material since we lack some genomic information. Based on our evaluation, with molecular markers of spinach chloroplast sub-compartments and the exploration of the validity and limits of the method, this work can now serve as the basis for a well characterized model at the genomic and proteomic scales, such as *Arabidopsis thaliana*, pea (Franssen et al., 2011) or *Brassica rapa* (Cheng et al., 2011).

We thus paid attention to the sub-organellar fractionation methods and controlled the purity of the chloroplast subfractions using antibodies raised against stroma and thylakoid markers (**Figures 1**, **3**). Some stromal proteins were associated, at least partly, with biotinylated spots, such as fructose 1,6-bisphosphate aldolase or Cpn60-β (**Table 2**). The intensity of biotinylation in these two spots was low although the Coomassie staining was high. In the case of fructose 1,6-bisphosphate aldolase, both cytosolic and stromal isoforms could be detected by mass spectrometry analyses (with marker peptides MVDVLIEQGIVPGIK and TVVSIPNGPSALAVK for the chloroplastic isoform; VTPEVIAEYTVR and TADGKPFVDAMK for the cytosolic one, **Table 2**). Following this preliminary study, we need to determine whether polypeptides were tagged prior to import, which would explain the presence of biotinylated precursors of stromal proteins, or whether multiple envelope proteins co-migrated at the same position of the 2-D PAGE, mixing polypeptides of various sub-compartments including biotinylated OEM proteins. Perspectives therefore include the analysis of tagged/shaved proteins with a method that does not depend on 2-D PAGE. This is currently feasible using the high detection sensitivity of mass spectrometry applied to protein mixtures: searching for biotin signatures in peptides analyzed by mass spectrometry should resolve this question.
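The marker peptides quoted above are distinguished by their masses in a peptide mass fingerprint. As a minimal sketch of that calculation, the code below sums standard monoisotopic residue masses, adds one water for the termini, and adds a proton for the singly charged ion typically observed by MALDI. The function names are hypothetical, and no modification (for example methionine oxidation or a biotin tag) is considered.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # charge carrier for the [M+H]+ ion

def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    try:
        return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None

def mh_plus(peptide):
    """m/z of the singly protonated peptide."""
    return monoisotopic_mass(peptide) + PROTON

markers = {
    "chloroplastic": ["MVDVLIEQGIVPGIK", "TVVSIPNGPSALAVK"],
    "cytosolic": ["VTPEVIAEYTVR", "TADGKPFVDAMK"],
}
for isoform, peptides in markers.items():
    for pep in peptides:
        print(f"{isoform:14s} {pep:16s} [M+H]+ = {mh_plus(pep):.3f}")
```

A search for biotin signatures would follow the same logic, with the mass of the tag added to the lysine or N-terminal residue that carries it.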

Motivation for an additional chloroplast envelope proteomic analysis should be to provide information that is not available in other studies. The "tag and shave" strategy was therefore intended to bring topological information, i.e., exposure of peripheral proteins at the surface of the organelle (**Table 1**, **Figure 5**). Insights into membrane-associated proteins involved in the sorting of cytosolic protein precursors, such as chaperones and translocon components, in the maturation and assembly of proteins, particularly Rubisco, in carbon metabolism or in stromal RNA editing, could therefore be obtained (**Table 2**). The present study also highlighted the disappearance from the envelope fraction of well-characterized internal proteins

**Table 2 | Preliminary analysis of a selection of chloroplast-associated polypeptides, following biotinylation or thermolysin treatment and 2D-PAGE resolution.**


*(Continued)*

#### **Table 2 | Continued**


*The tag and shave intensity is given as a scale:* −*, no effect on the characterized spot;* +*,* ++*,* +++*, increasing effects of the tag or shave treatment. For biotinylation, the scale is estimated by the average size of the biotinylated spots being smaller,* +*; identical,* ++*; or bigger,* +++*; when compared to the size of the overlapping Coomassie-stained spots. (*+*), partial tag or shave of a large 2-D PAGE spot. UniProt accession numbers were given from <sup>a</sup>Spinacia oleracea; <sup>b</sup>Arabidopsis thaliana; and <sup>c</sup>Pisum sativum. Abbreviations: Env, envelope; IEM, inner envelope membrane; Mit, mitochondria; OEM, outer envelope membrane; Thyl, thylakoid; nd, not defined after Coomassie staining; Nucl, nucleus; Vac, vacuole. Previously determined localization of proteins was obtained from works by <sup>1</sup>Ferro et al. (2010), <sup>2</sup>Ko et al. (1992), <sup>3</sup>Heazlewood et al. (2007), and <sup>4</sup>El-Kafafi et al. (2005).*

(Translocon Tic40 component, ClpC protease, FtsZ2), after treatment of intact chloroplasts with the non-permeable thermolysin protease (**Table 2**). The "tag and shave" strategy therefore proved to be informative on how the stability of the association of these internal proteins with the IEM depends on the integrity of external superficial proteins associated with the OEM, shedding light on the importance of the stability of *trans-envelope* complexes. For example, FtsZ2, known to be involved in chloroplast division, has not been reported in proteomic analyses of pure envelope fractions from Arabidopsis (Ferro et al., 2010), although it binds to the IEM with a strength that depends on the integrity of some OEM proteins (El-Kafafi et al., 2005). FtsZ2 is therefore an obligate subunit of a stable trans-envelope complex. The results obtained with Tic40 would require a component of the inter-membrane space linking the IEM complex to the OEM, whose presence and function remain to be demonstrated. **Figure 6** summarizes all possible patterns after a "tag and shave" analysis. Trans-envelope complexes, whose stability depends on the integrity of OEM components, are illustrated by the schematic spots 6, 7, and 8. Below, we discuss the most striking biological processes that appear to occur in the close vicinity of the chloroplast envelope membranes, involving proteins that associate in strong or loose complexes.

#### **GENERAL PROCESSES OF CYTOSOLIC PROTEIN PRECURSOR IMPORT**

Most chaperones, co-chaperones, proteases, and translocon subunits we identified were previously reported as major envelope-associated proteins (Rolland et al., 2003; Ferro et al., 2010). They are known to contribute to the general sorting processes of chloroplast protein precursors encoded in the nucleus. On the cytosolic side of chloroplasts, HSP70/Com70 molecular chaperones play a role in the handling of nucleus-encoded protein precursors (Zhang and Glaser, 2002; Salvucci, 2008; Vitlin et al., 2013). Cytosolic HSP70 proteins can specifically interact with targeting sequences of chloroplast precursors in a similar manner as with mitochondrial precursors (Rial et al., 2000; Pontier et al., 2007). Chloroplast transit peptides differ from mitochondrial transit peptides by their phosphorylation on Ser and Thr residues in a phosphopeptide-binding motif for 14-3-3 proteins. The interaction of chloroplast precursors with the cytosolic HSP70 and 14-3-3 proteins was shown to enhance the translocation rate into chloroplasts (May and Soll, 2000). Together, they form a "guidance complex" escorting the precursor to a specific TOC component (Li and Chiu, 2010; Lee et al., 2013). In this scenario, the guidance complex must be stable until it reaches OEM components. To our knowledge, no 14-3-3 protein has been reported in envelope proteomic studies, possibly due to its very dynamic destabilization and turnover or to the involvement of other escorting partners that one might then find associated with the OEM (Flores-Perez and Jarvis, 2013). The presented strategy would then be the method of choice to identify these missing escorting proteins. The HSP70/Com70 protein we report here as a cytosolic protein partly associated with the chloroplast surface might be the key protein that disrupts the guidance complex once it reaches Toc34 (Lee et al., 2013). In the future, more sensitive analyses might also allow the detection of other components of the guidance complex.

The subsequent import of precursors involves the general translocon machinery, including several OEM (TOC) and IEM (TIC) subunits. Tic110 was shown to be the major component of the IEM import channel and is one of the most prominent proteins of the IEM (Soll, 2002). Tic110 is consistently found as a major protein in the 2-D PAGE analyses performed here. On the stromal side of the IEM, Tic110 was also reported to be the binding site for molecular chaperones, including ClpC, a member of the HSP93-HSP100 family, and Cpn60. Here, both ClpC and Cpn60 were detected as stromal proteins associated with the IEM (**Table 2**). The dissociation of ClpC from the IEM occurred following treatment of chloroplasts with thermolysin (**Table 2**). This result indicates that the binding of ClpC depends on the integrity of proteins from the OEM. Tic40 also dissociated from the IEM when OEM proteins were subjected to proteolysis (**Table 2**). Tic40 is an IEM protein with a large hydrophilic domain in the stroma (Chou et al., 2003; Flores-Perez and Jarvis, 2013; Jarvis and Lopez-Juez, 2013). Cross-linking experiments showed that Tic40 is associated with Tic110, Toc75 and ClpC. The presence of Cpn60 could not be detected in Tic40 immunoprecipitates (Chou et al., 2003). Our study suggests that the strength of the association of Tic40-ClpC on the stromal side of Tic110 might depend on the integrity of protein components exposed at the outer surface of the chloroplasts. By contrast, the association of Cpn60 with Tic110 is not destabilized and therefore involves distinct binding mechanisms.

The initial association of protein precursors on the stromal side of the envelope is a fundamental event of protein import, because it brings a driving force for precursors through the translocon, bridging the TOC and TIC moieties, and because it prepares the accurate folding, processing, maturation and assembly on the stromal side. These processes involve general and specific chaperones, co-chaperones, co-factors and proteases (Li and Chiu, 2010; Lee et al., 2013). Here, we detect the stromal HSP70-DnaK homolog protein, ClpC, the ClpP1 subunit of the ClpP protease complex, the beta-subunit of Cpn60-GroEL and its co-chaperone Cpn21-GroES. All these proteins were previously reported in large-scale proteomic studies of chloroplast envelope membranes (Rolland et al., 2003; Ferro et al., 2010). ClpC is also known to direct specific proteins for degradation by the ClpP serine peptidase complex (Peltier et al., 2001), involved in the degradation of mistargeted or misfolded stroma proteins. In the turnover of TIC components, it is not known if ClpC could be involved in the degradation of some TIC proteins, like Tic40. The occurrence of both ClpC and a ClpP subunit encoded in the nucleus, i.e., ClpP5, at the periphery of the IEM, facilitates the route of some polypeptides toward ClpP degradation *via* ClpC. The ClpP5 subunit is still bound to the IEM after thermolysin treatment of chloroplasts (**Table 2**), whereas ClpC dissociates from the IEM under these conditions. Thus, although the ClpC chaperone and ClpP complex are topologically close and ready to interact, their association to the IEM is regulated differently *via* conformational status of other protein components.

The stromal Cpn60 chaperonin can form two types of large tetradecamers, one containing the two stromal Cpn60 isoforms, i.e., Cpn60-α and Cpn60-β, and the other consisting solely of Cpn60-β subunits (Dickson et al., 2000). The Cpn60-α/Cpn60-β tetradecamer is considered the major Cpn60 chaperonin in the stroma. Here, Cpn60-β is the sole subunit characterized in high amounts in the vicinity of the envelope, and it is possible that a specific Cpn60-β tetradecamer might be associated with the IEM in spinach. The only subunit that was detected in proteomic analyses of pure thylakoid membranes is the other isoform, Cpn60-α (Peltier et al., 2000). By contrast, in *Arabidopsis*, only Cpn60-α could be found associated with the envelope (Ferro et al., 2010). The Cpn60-GroEL complex is known to form a central cavity that captures incompletely folded proteins. In that respect, Cpn21, a member of the GroES family, was found associated with the envelope membranes (**Table 2**). This is consistent with the presence of a Cpn60/Cpn21 system that is functionally active at the IEM, in association with Tic110. Cpn60/Cpn21 is therefore topologically close to the imported protein precursors.

The chloroplast Cpn60 was initially identified as an abundant oligomeric protein that transiently binds the nascent large subunits of Rubisco, prior to their assembly into the Rubisco holoenzyme. Cpn60 chaperones are therefore often functionally annotated as a "Rubisco-binding protein." The characterization of Rubisco SSU and LSU, of Cpn60 and its co-chaperonin Cpn21, and of Rubisco SSU N-methyltransferase in tight contact with the envelope membrane (**Table 2**) might be useful to better understand Rubisco SSU import, processing, methylation and assembly with LSU. The Rubisco SSU N-methyltransferase detected here (**Table 2**) has not been previously characterized in proteomic analyses of pure envelope membranes (Rolland et al., 2003; Ferro et al., 2010). An association with the IEM is supported by measurements of O- and N-methylation of Rubisco SSU in purified envelope fractions, although such modification is not ubiquitous in the plant kingdom, apparently not essential, and possibly minor in spinach (Mininno et al., 2012). Interestingly, the methyltransferase was also shown to be effective on another substrate, fructose 1,6-bisphosphate aldolase (Mininno et al., 2012), also detected here (**Table 2**). This preliminary analysis also provided information that might be useful to better understand the import and assembly of ATP synthase subunits and their possible association with envelope membranes (**Table 2**).

#### **CONCLUSION**

In the present paper, we describe a method for a differential proteomic analysis of chloroplast envelope membrane peripheral proteins after 2D-PAGE resolution and immunological and mass spectrometry-based protein assessments. A basic proteomic snapshot of the most abundant proteins detected after Coomassie staining was investigated after treatment of intact chloroplasts following a superficial protein "tagging" with biotin or a superficial protein "shaving" with thermolysin. This evaluation study supports that information can be collected on the exposure of some OEM proteins at the surface of the chloroplast, but also on internal protein components, whose association with the IEM relies on the stability of trans-envelope protein complexes and on the integrity of some OEM components. Future perspectives include an in-depth analysis of the envelope membrane proteome of "tagged or shaved" samples, using a more accurate plant model, such as Arabidopsis, with carefully purified chloroplasts. They also include the analysis of the chloroplast envelope proteome using more sensitive mass spectrometry analytical methods. The systematic analysis of biotinylated peptides by mass spectrometry (based on the mass shift introduced by biotin) will be a simple way to analyze the topology of OEM proteins, with possible cross contaminations by IEM or stromal precursors biotinylated in the course of their import. Following this evaluation of the method, the "tag and shave" strategy is therefore promising to bring refined topological information in large-scale analyses. It could also be implemented, once validated, in the characterization of other membrane-limited organelles such as mitochondria.

### **ACKNOWLEDGMENTS**

The authors wish to thank M. Neuburger for setting up biotin-tagging conditions. The authors were supported by Agence Nationale de la Recherche (ANR-10-BLAN-1524, ReGal; ANR-12-BIME-0005, DiaDomOil; ANR-12-JCJC, ChloroMitoLipid), Région Rhône-Alpes and the Labex GRAL (Grenoble Alliance for Integrated Structural Cell Biology).

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 January 2014; accepted: 25 April 2014; published online: 16 May 2014. Citation: Hardré H, Kuhn L, Albrieux C, Jouhet J, Michaud M, Seigneurin-Berny D, Falconet D, Block MA and Maréchal E (2014) The selective biotin tagging and thermolysin proteolysis of chloroplast outer envelope proteins reveals information on protein topology and association into complexes. Front. Plant Sci. 5:203. doi: 10.3389/fpls.2014.00203*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Hardré, Kuhn, Albrieux, Jouhet, Michaud, Seigneurin-Berny, Falconet, Block and Maréchal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### *Akari Tada<sup>1</sup>, Fumi Adachi<sup>1</sup>, Tomohiro Kakizaki<sup>2</sup> and Takehito Inaba<sup>1</sup>\**

<sup>1</sup> Department of Agricultural and Environmental Sciences, Faculty of Agriculture, University of Miyazaki, Miyazaki, Japan <sup>2</sup> National Institute of Vegetable and Tea Science, Tsu, Japan

#### *Edited by:*

Kentaro Inoue, University of California at Davis, USA

#### *Reviewed by:*

Toshiaki Mitsui, Niigata University, Japan Mark Findlay Belmonte, University of Manitoba, Canada Meshack Afitlhile, Western Illinois University, USA

#### *\*Correspondence:*

Takehito Inaba, Department of Agricultural and Environmental Sciences, Faculty of Agriculture, University of Miyazaki, 1-1 Gakuenkibanadai-nishi, Miyazaki 889-2192, Japan e-mail: tinaba@cc.miyazaki-u.ac.jp

Biogenesis of chloroplasts is essential for plant growth and development. A number of homozygous mutants lacking a chloroplast protein exhibit an albino phenotype. In general, it is challenging to grow albino Arabidopsis plants on soil until they set seeds. Homozygous albino mutants are usually obtained as progenies of heterozygous parents. Here, we describe a method of recovering seeds from the seedling lethal Arabidopsis mutant ppi2-2, which lacks the atToc159 protein import receptor at the outer envelope membrane of chloroplasts. Using plastic containers, we were able to grow homozygous ppi2-2 plants until they set seed. Although the germination rate of the harvested seeds was relatively low, it was still sufficient to allow us to further analyze the ppi2-2 progeny. Using ppi2-2 homozygous seeds, we were able to analyze the role of plastid protein import in the light-regulated induction of nuclear genes. We propose that this method be applied to other seedling lethal Arabidopsis mutants to obtain homozygous seeds, helping us further investigate the roles of plastid proteins in plant growth and development.

**Keywords: albino,** *Arabidopsis***, chloroplast,** *ppi2-2* **mutant, protein import, seed**

#### **INTRODUCTION**

Plastids such as chloroplasts in photosynthetic plant cells are believed to have evolved from a cyanobacterium-like ancestor (Dyall et al., 2004). During evolution, most of the genes encoded by the bacterial ancestor were transferred to the nuclear genome of the host. Therefore, the expression of nuclear genes encoding plastid proteins and the import of those proteins into plastids are essential for plastid biogenesis. The key player involved in delivering nuclear-encoded proteins into plastids is the translocon at the outer envelope membrane of chloroplasts (TOC) and the translocon at the inner envelope membrane of chloroplasts (TIC) complex (Inaba and Schnell, 2008; Li and Chiu, 2010; Jarvis and Lopez-Juez, 2013). The TOC–TIC complex was first isolated through biochemical purification (Kessler et al., 1994; Schnell et al., 1994). Molecular genetic analysis of identified components using *Arabidopsis* indicated that these were indeed indispensable for plastid biogenesis (Jarvis et al., 1998; Bauer et al., 2000; Chou et al., 2003; Constan et al., 2004a,b; Ivanova et al., 2004; Kubis et al., 2004; Inaba et al., 2005; Kovacheva et al., 2005; Teng et al., 2006; Kikuchi et al., 2013).

Because of their key roles in plastid protein import, a number of mutants defective in TOC or TIC proteins exhibit severe developmental arrest, resulting in embryo and seedling lethality. These lethal phenotypes have made it difficult to characterize in more detail the roles of the TOC–TIC complex in plant growth and development. For instance, the homozygous *plastid protein import 2* (*ppi2*) mutant that lacks the major protein import receptor of plastids, atToc159, exhibits seedling lethality due to its severe albino phenotype (Bauer et al., 2000; Kakizaki et al., 2009). Therefore, we can only obtain bulk seeds from heterozygous *ppi2* (*ppi2/*+) plants. When the progeny of *ppi2-2/*+ is grown in the dark, it is virtually impossible to discriminate between homozygous *ppi2-2* and the heterozygous *ppi2-2/*+. Hence, to further uncover the role of plastid protein import in plant growth and development, it is necessary to propagate seeds from seedling lethal, albino mutants such as the homozygous *ppi2*.

In this paper, we describe a method for generating viable seeds from the seedling lethal *Arabidopsis* mutant *ppi2-2*, which lacks the major protein import receptor of plastids (Bauer et al., 2000; Kakizaki et al., 2009). Using these seeds, we investigated the photomorphogenic response of the *ppi2-2* mutant and showed that the TOC–TIC pathway and the light-induced gene expression are tightly coordinated with each other. Our method also provides clues on how to obtain viable seeds from other albino *Arabidopsis* plants, allowing us to uncover the roles of plastid proteins in plant growth and development in more detail.

# **MATERIALS AND METHODS**

#### **PLANT MATERIALS**

All experiments were performed on *Arabidopsis thaliana* accession Columbia (Col-0). The *ppi2-2* mutant has been described elsewhere (Kakizaki et al., 2009). Wild-type and *ppi2-2/*+ seeds were obtained from plants grown on soil.

#### **GROWTH CONDITIONS FOR RECOVERING HOMOZYGOUS** *ppi2-2* **SEEDS**

An overview of the growth method is summarized in **Figure 1A**. The progeny of *ppi2-2/*+ plants were first grown on plates (150 mm in diameter) containing 0.5% agar, 1% sucrose, and 0.5× MS salts at pH 5.8. To synchronize germination, all seeds were maintained at 4°C for 2 days after sowing. Plants were grown under continuous white light (80 μmol m<sup>−2</sup> s<sup>−1</sup>, unless specified) at 22°C and 50% relative humidity in a growth chamber (LPH-350S, NK system). After 14–18 days, homozygous *ppi2-2* plants were transferred to small, round Ziploc® containers (width 108 mm × depth 108 mm × height 56 mm, 236 ml container size, Asahi Kasei Co. Ltd., Japan; see **Figures 1B** and **2**) containing 0.8% agar, 3% sucrose, and 0.5× MS salts at pH 5.8. We believe this container is most similar to the Ziploc® brand Container with the Smart Snap® Seal Extra Small Bowl (8 ounces, S.C. Johnson & Son, Inc., Howe St, Racine, WI, USA) sold in the United States. Typically, 14–18-day-old *ppi2-2* plants have four to six small true leaves. It is important to choose well-developed *ppi2-2* plants for subsequent cultivation in Ziploc® containers. To avoid excess humidity and facilitate air circulation in the pot, each pot had four holes, made with a knife, that were sealed with two layers of surgical tape (**Figure 1**). In most cases, we placed five to seven plants in each pot. At this point, the lid was tightly sealed and taped with surgical tape (**Figure 1A**, middle). We continued to grow the plants until they started bolting (**Figure 2A**, right). Once the plants had started bolting, the lid was partially opened and taped with two layers of surgical tape (**Figure 1B**). The plants were then harvested after they set seeds (**Figure 2B**). The harvested plants were dried in envelopes under laboratory conditions for at least 2 weeks and then the seeds were collected.

in MS plates (left) and then transferred into a Ziploc® container (middle). The Ziploc® container has four holes covered with double-layered surgical tape. At a later stage, the lid of the Ziploc® container was partially opened (right), and a gap between the lid and the container was sealed with double-layered surgical tape. **(B)** Ziploc® container used in this study. Bar = approximately 1 cm.

**FIGURE 2 | Phenotype of homozygous** *ppi2-2* **plants grown in Ziploc**® **containers. (A)** Representative phenotype of wild-type (left) and ppi2-2 (right) plants grown in Ziploc® containers. After transferring the plants from plates, they were grown in a Ziploc® container for 10 days. Bar = approximately 1 cm. **(B)** At a later stage, the ppi2-2 plants set seeds in the Ziploc® container. Bar = approximately 1 cm.
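The two media described in this section (plates: 0.5% agar, 1% sucrose, 0.5× MS salts; containers: 0.8% agar, 3% sucrose, 0.5× MS salts) can be turned into weighing amounts with a short calculation. The following is a minimal sketch, not part of the published protocol. It assumes that percentages are weight/volume and that full-strength MS basal salt mixture is used at 4.3 g per litre, a typical value that should be checked against the product label. The function name `medium_recipe` is hypothetical, and the pH adjustment to 5.8 is not modeled.

```python
# Assumption: typical 1x MS basal salt mixture; verify against the product in use.
MS_FULL_STRENGTH_G_PER_L = 4.3


def medium_recipe(volume_ml, agar_pct, sucrose_pct, ms_strength):
    """Return grams of each component needed for `volume_ml` of medium.

    Percentages are weight/volume (1% = 1 g per 100 ml).
    `ms_strength` is the fraction of full-strength MS salts (0.5 for 0.5x).
    """
    litres = volume_ml / 1000.0
    return {
        "agar_g": agar_pct * volume_ml / 100.0,
        "sucrose_g": sucrose_pct * volume_ml / 100.0,
        "ms_salts_g": ms_strength * MS_FULL_STRENGTH_G_PER_L * litres,
    }


# Medium for the initial plates and for the Ziploc containers, per litre.
plate_medium = medium_recipe(1000, agar_pct=0.5, sucrose_pct=1.0, ms_strength=0.5)
container_medium = medium_recipe(1000, agar_pct=0.8, sucrose_pct=3.0, ms_strength=0.5)

for name, recipe in (("plates", plate_medium), ("containers", container_medium)):
    parts = ", ".join(f"{key} = {value:.2f}" for key, value in recipe.items())
    print(f"{name}: {parts}")
```

The only differences between the two media are the higher agar and sucrose concentrations in the containers; the MS salt amount is the same in both.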

#### **IMPORTANT NOTES**


#### **GROWTH CONDITIONS FOR LIGHT EXPOSURE EXPERIMENTS**

Plants were grown on 0.5% agar medium containing 1% sucrose and 0.5× MS salts at pH 5.8. To synchronize germination, all seeds were maintained in the dark at 4°C for 3 days after sowing. After low temperature treatment, seeds were exposed to white light for 8 h at 22°C and then returned to the dark for 4 days. Dark-grown plants were harvested and frozen in liquid nitrogen under a dim green light. A fraction of the dark-grown plants was then exposed to continuous white light for 24 h. After exposure to continuous white light, the plants were harvested and ground in liquid nitrogen for subsequent analysis.
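The sequence of treatments above can be laid out as a simple timeline to see when each harvest falls relative to sowing. This is an illustrative sketch only: the step durations are taken from the text, while the `SCHEDULE` layout and the `cumulative_hours` helper are hypothetical names introduced here.

```python
from datetime import timedelta

# Steps as described in the text: (description, duration).
SCHEDULE = [
    ("cold treatment in the dark, 4 °C", timedelta(days=3)),
    ("white light, 22 °C", timedelta(hours=8)),
    ("growth in the dark", timedelta(days=4)),
    ("continuous white light (fraction of the plants)", timedelta(hours=24)),
]


def cumulative_hours(schedule):
    """Return (description, start_hour, end_hour) for each step, counted from sowing."""
    timeline = []
    elapsed = timedelta(0)
    for description, duration in schedule:
        start = elapsed
        elapsed += duration
        timeline.append(
            (description, start.total_seconds() / 3600, elapsed.total_seconds() / 3600)
        )
    return timeline


timeline = cumulative_hours(SCHEDULE)
for description, start, end in timeline:
    print(f"{start:6.0f} h to {end:6.0f} h  {description}")
```

Under this reading of the protocol, the dark-grown sample is harvested at the end of the third step and the light-exposed sample 24 h later.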

#### **RNA ISOLATION AND REAL-TIME PCR ANALYSIS**

Total RNA was extracted from aerial tissues of wild-type and mutant plants using an RNAiso plus reagent (Takara) as described elsewhere (Kakizaki et al., 2009). We prepared three independent RNA samples for each treatment. Each RNA sample was prepared from ∼30 plants (equivalent to three spots in **Figure 4A**). cDNA was then synthesized using the PrimeScript™ RT reagent kit (Takara) with random hexamer and oligo d(T) primers. Real-time PCR was performed on a Thermal Cycler Dice Real-Time System (Takara) using SYBR Premix Ex Taq II (Takara) as previously described (Kakizaki et al., 2009). The primers used for real-time PCR are listed in **Table 1**. The transcript level of each gene was normalized to that of *ACTIN2*.
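The normalization to *ACTIN2* can be illustrated with a short calculation. The sketch below assumes the widely used 2<sup>−ΔCt</sup> approach with roughly 100% amplification efficiency for both primer pairs, which the text does not specify, and it scales the result so that the light-illuminated wild type equals 1. All Ct values are hypothetical and the function names are introduced here for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Target transcript level relative to the reference gene (2^-dCt).

    Assumes ~100% amplification efficiency for both primer pairs.
    """
    return 2.0 ** -(ct_target - ct_reference)


def mean_and_se(values):
    """Mean and standard error of the mean for independent samples."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct pairs (target gene, ACTIN2) for three independent RNA samples.
wt_light = [(22.0, 18.0), (22.2, 18.1), (21.9, 18.0)]
mutant_light = [(25.1, 18.0), (25.0, 18.2), (25.3, 18.1)]

wt_rel = [relative_expression(target, ref) for target, ref in wt_light]
mut_rel = [relative_expression(target, ref) for target, ref in mutant_light]

# Scale so that the mean of the light-illuminated wild type equals 1.
calibrator = mean(wt_rel)
wt_scaled = [value / calibrator for value in wt_rel]
mut_scaled = [value / calibrator for value in mut_rel]

for label, values in (("WT 24-h light", wt_scaled), ("ppi2-2 24-h light", mut_scaled)):
    average, se = mean_and_se(values)
    print(f"{label}: mean = {average:.2f}, SE = {se:.2f}")
```

Each bar in such an analysis would then correspond to the mean of the three scaled values, with the standard error as the error bar.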

#### **RESULTS**

#### **CHARACTERIZATION OF SEEDS HARVESTED FROM HOMOZYGOUS** *ppi2-2* **MUTANTS**

We harvested the seeds from the 10 Ziploc® containers (∼50 plants). The yield of seeds in each experiment depended on the condition of the *ppi2-2* plants in the container. After four independent experiments, we obtained 0.26 g of homozygous *ppi2-2* seeds.

To determine whether the harvested seeds could be used for further analysis, we next examined the seeds harvested from wild-type, heterozygous *ppi2-2* (*ppi2-2/*+), and *ppi2-2* plants by microscopy. As shown in **Figure 3A**, the *ppi2-2* seeds were elongated in shape compared to those of the wild-type and *ppi2-2/*+ seeds. Nonetheless, most of these elongated seeds did not look like aborted seeds.

When the wild-type, *ppi2-2/*+, and *ppi2-2* seeds were sown on MS plates, at least 90% of the wild-type and *ppi2-2/*+ seeds germinated (**Figure 3B**). In contrast, only 60% of the *ppi2-2* seeds germinated 6 days after their transfer to the growth chamber (**Figure 3B**). The reason *ppi2-2* seeds exhibited a low germination rate remains unclear. Seed development is divided into two major phases, designated as the embryo and endosperm development phase and the seed maturation phase (West and Harada, 1993). Our previous observation suggested that embryo development of *ppi2-2* seeds was normal (Kakizaki et al., 2009). Consistent with this observation, the germination rate of *ppi2-2* seeds harvested from the *ppi2-2/*+ plants was normal (**Figure 3B**, 25% of the *ppi2-2/*+ progeny was *ppi2-2*). Hence, a possible explanation for the low germination rate of *ppi2-2* seeds is insufficient maturation due to high humidity in the pots. It is also possible that the growth retardation of *ppi2-2* plants affects the seed development on those plants.

We also investigated the phenotype of seedlings germinated from the wild-type, *ppi2-2/*+, and *ppi2-2* seeds. As shown in **Figures 3B,C**, all of the wild-type progeny exhibited a green phenotype, whereas ∼25% of the *ppi2-2/*+ progeny was albino. In contrast, 100% of the *ppi2-2* progeny exhibited an albino phenotype (**Figures 3B,C**).
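A segregation of about 25% albino seedlings among the *ppi2-2/*+ progeny is what a single recessive locus predicts (3 green : 1 albino). One way to check observed counts against this expectation is a chi-square goodness-of-fit test, sketched below. This test is not described in the text; the seedling counts are hypothetical and the function name is introduced here for illustration. The critical value 3.841 is the standard chi-square threshold for one degree of freedom at a significance level of 0.05.

```python
def chi_square_3_to_1(green, albino):
    """Chi-square statistic for a 3:1 (green:albino) Mendelian expectation."""
    total = green + albino
    expected_green = 0.75 * total
    expected_albino = 0.25 * total
    return (
        (green - expected_green) ** 2 / expected_green
        + (albino - expected_albino) ** 2 / expected_albino
    )


# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1_P05 = 3.841

# Hypothetical seedling counts from a ppi2-2/+ parent (not data from this study).
green, albino = 152, 48
stat = chi_square_3_to_1(green, albino)
consistent = stat < CRITICAL_DF1_P05
print(f"chi-square = {stat:.3f}; consistent with 3:1 segregation: {consistent}")
```

Seedlings scored as "not determined" would have to be excluded or handled separately before applying such a test.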

In conclusion, we were able to recover viable seeds from homozygous *ppi2-2* seedlings. Furthermore, all of the *ppi2-2* progeny exhibited an albino phenotype. Although the germination rate of *ppi2-2* was relatively low, it was still sufficient to further analyze its progeny. We conclude that our method allows us to harvest seeds from the seedling lethal mutant *ppi2-2*.

#### **PHOTOMORPHOGENIC RESPONSE IN** *ppi2-2* **MUTANTS: THE NEW METHOD HELPS US FURTHER UNDERSTAND THE ROLES OF PLASTID PROTEINS IN PLANT GROWTH AND DEVELOPMENT**

Obtaining viable seeds from seedling lethal albino plants helps us uncover the roles of plastid proteins in plant growth and development in more detail. For instance, the role of plastid protein import in a photomorphogenic response remains unclear. This is because we cannot discriminate between *ppi2-2/*+ and *ppi2-2* seedlings grown in the dark. When wild-type, *ppi2-2/*+, and *ppi2-2* seeds were germinated and grown in the dark for 4 days, all of the plants exhibited an etiolated phenotype regardless of their genotype (**Figure 4A**, upper panel). When wild-type plants were exposed to continuous white light for 24 h, their cotyledons turned green (**Figure 4A**, lower left panel). In contrast, *ppi2-2* plants opened their cotyledons after 24 h of light illumination, although they did not turn green (**Figure 4A**, lower right panel). The *ppi2-2/*+ progeny turned green upon light illumination, whereas the segregated *ppi2-2* plants did not exhibit greening (**Figure 4A**, lower middle panel).

#### **Table 1 | List of gene-specific primers used in real-time PCR analysis.**

derived from wild-type (WT), heterozygous ppi2-2 (ppi2-2/+), and homozygous ppi2-2 (ppi2-2) plants. Black bars = approximately 1 mm. **(B)** Analysis of germination rate and seedling phenotypes. Phenotypes were determined 6 days after the transfer of the plates to a growth chamber. Some of those seedlings could not be designated as green or albino due to growth retardation and thus were classified as "not determined (N.D.)." **(C)** Phenotype of seedlings germinated from wild-type, ppi2-2/+, and ppi2-2 seeds. Arrowheads indicate albino plants. White bars = approximately 1 cm.

We next investigated the expression of photosynthesis-related genes upon light illumination in *ppi2-2* plants. For this analysis, we chose nuclear-encoded genes involved in photosynthetic electron transport (*PsaF*, *PsbO1*, and *LHCB3.1*) and CO<sub>2</sub> fixation (*SSU1A*). In the wild-type, the expression of photosynthesis-related genes such as *PsaF*, *PsbO1*, *SSU1A*, and *LHCB3.1* was induced upon light illumination (**Figure 4B**, left panel). In contrast, light induction of these genes was compromised in the *ppi2-2* mutant (**Figure 4B**, left panel). We also examined the expression of non-photosynthetic genes encoding the pyruvate dehydrogenase E1α subunit (*PDH-E1*α) and a ubiquitin conjugating enzyme (*UBC*). These genes have been shown to be expressed constitutively in *Arabidopsis* (Ivanova et al., 2004; Czechowski et al., 2005). As shown in **Figure 4B** (right panel), they did not show strong induction upon illumination. These data indicate that functional TOC machinery is a prerequisite for the rapid induction of photosynthesis-related genes upon light illumination. This also suggests that a tight coordination between plastid protein import and light-regulated gene expression would help prevent the accumulation of non-imported precursor proteins in the cytosol.

*ppi2-2* **(***ppi2-2***) seeds. (A)** Phenotype of wild-type, ppi2-2/+, and ppi2-2 progenies grown in the dark (Dark). The plants were then exposed to continuous white light for 24 hr (Dark + 24 hr light). Arrowheads indicate homozygous ppi2-2 plants found in the ppi2-2/+ progeny.

real-time PCR and normalized to that of ACTIN2. The expression level in the light-illuminated wild-type (WT 24-h light) was set to 1. Each bar represents the mean of three independent samples. Error bars represent 1 SE.

#### **CONCLUSION AND POSSIBLE APPLICATIONS**

This report describes a method for obtaining viable seeds from the seedling lethal *ppi2-2* mutant using Ziploc® containers. Four independent experiments allowed us to obtain ∼10,000 (0.26 g) homozygous *ppi2-2* seeds. We also confirmed that at least 60% of the harvested seeds were able to germinate (**Figure 3B**). Establishment of this new growth method allowed us to analyze the roles of plastid protein import in light-induced gene expression. Characterization of the homozygous *ppi2-2* progeny revealed that *ppi2-2* mutants failed to induce photosynthesis-related genes upon light illumination (**Figure 4**). This indicates that the integrity of the protein import apparatus plays a critical role in the rapid induction of photosynthesis-related genes upon light illumination. In contrast, the *ppi2-2* mutation did not affect cotyledon opening upon light illumination (**Figure 4A**). These results demonstrate that investigations involving the progeny of homozygous albino plants will help us further understand the role of plastid protein import in plant growth and development.
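The yield figures quoted above imply two quantities that are useful when planning experiments with these seeds: the average mass of a single seed and the number of seedlings one can expect from the whole batch. The sketch below only restates the arithmetic from the numbers in the text (0.26 g, about 10,000 seeds, about 60% germination); the function names are hypothetical.

```python
TOTAL_SEED_MASS_G = 0.26    # from the text
APPROX_SEED_COUNT = 10_000  # from the text (approximate)
GERMINATION_RATE = 0.60     # from the text (about 60%)


def mass_per_seed_ug(total_mass_g, seed_count):
    """Average mass of one seed in micrograms."""
    return total_mass_g / seed_count * 1_000_000


def expected_seedlings(seed_count, germination_rate):
    """Number of seeds expected to germinate, rounded to a whole seedling."""
    return round(seed_count * germination_rate)


seed_mass = mass_per_seed_ug(TOTAL_SEED_MASS_G, APPROX_SEED_COUNT)
seedlings = expected_seedlings(APPROX_SEED_COUNT, GERMINATION_RATE)
print(f"about {seed_mass:.0f} µg per seed, about {seedlings} seedlings expected")
```

With these inputs the batch corresponds to roughly 26 µg per seed and roughly 6,000 germinating seeds.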

Although we used *ppi2-2* plants as the model in our experiments, we would like to propose that the method could be applied to other *Arabidopsis* mutants that are described as "seedling lethal." **Table 2** shows a comprehensive list of seedling lethal *Arabidopsis* mutants lacking either an outer or inner envelope membrane protein in chloroplasts. Because many mutants that are defective in an envelope membrane protein exhibit an embryo lethal phenotype (Hormann et al., 2004; Baldwin et al., 2005; Inaba et al., 2005; Patel et al., 2008; Hsu et al., 2010), the number of albino mutants that can be used for investigations is limited. Instead, we found that the method can be extended to other albino mutants. It has been suggested that a number of seedling lethal *Arabidopsis* mutants are associated with chloroplast dysfunction (Budziszewski et al., 2001). An exhaustive analysis of *Arabidopsis* mutants lacking a chloroplast protein revealed that more than 50 mutants exhibited albino, pale green, and other chloroplast-associated phenotypes (Myouga et al., 2010). Hence, we anticipate that seeds from some of these albino/seedling lethal mutants can be obtained using our method.

OM, outer envelope membrane; IM, inner envelope membrane; Env, envelope membrane; Thy, thylakoid membrane.

In summary, we have developed a method for recovering viable seeds from the seedling lethal *Arabidopsis* mutant *ppi2-2*. Our method can be applied to other albino *Arabidopsis* mutants, helping us further understand the roles of chloroplast proteins in plant growth and development.

#### **ACKNOWLEDGMENTS**

This work was supported by Grant-in-Aid for Young Scientists (B, no. 25850073) and Strategic Young Researcher Overseas Visits Program for Accelerating Brain Circulation from MEXT, and a grant for Scientific Research on Priority Areas from the University of Miyazaki.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 March 2014; accepted: 13 May 2014; published online: 04 June 2014. Citation: Tada A, Adachi F, Kakizaki T and Inaba T (2014) Production of viable seeds from the seedling lethal mutant ppi2-2 lacking the atToc159 chloroplast protein import receptor using plastic containers, and characterization of the homozygous mutant progeny. Front. Plant Sci. 5:243. doi: 10.3389/fpls.2014.00243*

*This article was submitted to Plant Cell Biology, a section of the journal Frontiers in Plant Science.*

*Copyright © 2014 Tada, Adachi, Kakizaki and Inaba. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
