# EMERGING ROLES OF LONG NONCODING RNAs IN NEUROLOGICAL DISEASES AND METABOLIC DISORDERS

EDITED BY: Yingqun Huang, Romano Regazzi and William Cho

PUBLISHED IN: Frontiers in Genetics

#### *Frontiers Copyright Statement*

*© Copyright 2007-2015 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-571-8 DOI 10.3389/978-2-88919-571-8

# About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

# Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

# Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

# What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# **EMERGING ROLES OF LONG NONCODING RNAs IN NEUROLOGICAL DISEASES AND METABOLIC DISORDERS**

Topic Editors: **Yingqun Huang,** Yale University School of Medicine, USA; **Romano Regazzi,** University of Lausanne, Switzerland; **William Cho,** Queen Elizabeth Hospital Kowloon, Hong Kong

Long noncoding RNAs (lncRNAs) are a new class of transcripts that are in general longer than 200 nucleotides and that have no protein-coding potential. The vast majority of vertebrate genomes encode diverse and complex lncRNAs that play regulatory roles at almost every step of gene expression. Recently, increasing evidence has implicated lncRNAs in the pathogenesis of various human diseases.

The purpose of the Research Topic, "Emerging roles of long noncoding RNAs in neurological diseases and metabolic disorders", is to bring together leading researchers in the field who, through contributing to an organized and comprehensive collection of peer-reviewed articles, provide a broad perspective on the latest advances in the field.

A number of interesting and cutting-edge areas will be covered as below, but this list is not exclusive:


**Citation:** Huang, Y., Regazzi, R., Cho, W., eds. (2015). Emerging Roles of Long Noncoding RNAs in Neurological Diseases and Metabolic Disorders. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-571-8



# Out of darkness: long non-coding RNAs come of age

#### *Yingqun Huang<sup>1</sup>\*, Romano Regazzi<sup>2</sup> and William C. Cho<sup>3</sup>*

*<sup>1</sup> Obstetrics, Gynecology, and Reproductive Sciences, Yale University School of Medicine, New Haven, CT, USA*

*<sup>2</sup> Department of Fundamental Neurosciences, University of Lausanne, Lausanne, Switzerland*

*<sup>3</sup> Department of Clinical Oncology, Queen Elizabeth Hospital, Kowloon, Hong Kong*

*\*Correspondence: yingqun.huang@yale.edu*

#### *Edited and reviewed by:*

*Florent Hubé, UMR7216 Epigenetics and Cell Fate, France*

**Keywords: lncRNAs, metabolism and obesity, neuronal disorder, beta-cell dysfunction, developmental cognitive neuroscience**

It has been known for a number of years that only about 2% of the human genome encodes proteins. However, numerous studies employing both tiling arrays and high-throughput sequencing found that the genome is pervasively transcribed, with most DNA copied, at least at some point in time, into RNA. Indeed, Birney et al. (2007) estimated that 93% of the human genome is transcribed. Because of a dearth of functional information about such transcripts, the widespread non-coding regions came to be called the "dark matter" of the genome (Johnson et al., 2005), and in recent years there has been an explosion of research in this area. Due to technical and theoretical considerations, transcripts longer than 200 nucleotides and lacking the potential to be translated have been termed "long non-coding RNAs" or lncRNAs. Because thousands of these new transcripts have been identified (the current GENCODE v20 estimates close to 15,000 independent lncRNAs in humans), much of the work so far has centered on their discovery and characterization. Although some lncRNAs are very abundant and have been studied for many years (e.g., Xist RNA and H19 RNA), many of the others are expressed at much lower levels. Do they represent transcriptional noise? Are they often artifacts of sequencing? We are now beginning to get answers to these important questions. The past several years have witnessed striking progress in the functional characterization of many lncRNAs, and a picture is now emerging of an enormously complex collection of transcripts, many of which are not at all inert, but rather play critical roles in cell function, gene regulation, and the development of disease (Morris and Mattick, 2014). Interestingly, lncRNAs can localize to the cytoplasm or nucleus and bind proteins and other RNA molecules to mediate important intracellular interactions. Thus, among other functions, some have been shown to act as chromatin regulators, some influence transcription as enhancer-associated RNAs, some are host genes for smaller RNAs such as miRNAs and snoRNAs, and some act to sequester and modulate the function of miRNAs.

While much remains to be learned, we are truly at the frontier of important discoveries in the lncRNA field. The articles in this special issue continue this exciting trend of connecting lncRNAs and cellular function, focusing particularly on their roles in development and metabolism and on their association with disease.

Several papers address the connection between lncRNAs and metabolism. Kameswaran and Kaestner (2014) discuss the growing evidence that lncRNAs can play an important role in the control of pancreatic beta-cell function and in diabetes manifestation. They particularly focus on lncRNAs generated from imprinted loci, where expression only occurs from either the maternal or paternal allele. Pullen and Rutter (2014) describe how genome-wide association studies have provided insights into ways in which lncRNAs can affect beta-cell identity and diabetes susceptibility. Esguerra and Eliasson (2014) describe the discovery and functional analysis of thousands of lncRNAs in the pancreatic islets of Langerhans and discuss how these transcripts might affect islet development and endocrine cell functions, and how understanding their biology might lead to therapeutic insights for the treatment of type 2 diabetes. In addition, Kornfeld and Bruning (2014) review the functional connection between lncRNAs, differentiation and homeostasis of metabolic tissues.

The roles of lncRNAs in the nervous system are also addressed. Clark and Blackshaw (2014) and Vucicevic et al. (2014) review the current state of research on the emerging roles of lncRNAs in nervous system development and provide insights into how some of these might contribute to neurological pathologies. Kadakkuzha et al. (2014) contribute an original research article on the molecular characterization and functional analysis of the expression, localization and action of a lncRNA from the marine snail *Aplysia californica*; this lncRNA is a natural antisense RNA from the sensorin gene and plays an important role in neuronal function and aging.

Finally, an opinion article by Kohtz (2014) underlines the importance of interpreting with caution the results of studies on lncRNA function gleaned from cell culture model systems, since these may not always accurately reflect the natural *in vivo* functions of lncRNAs.

# **REFERENCES**


microarray tiling experiments. *Trends Genet*. 21, 93–102. doi: 10.1016/j.tig.2004.12.009


Vucicevic, D., Schrewe, H., and Orom, U. A. (2014). Molecular mechanisms of long ncRNAs in neurological disorders. *Front. Genet.* 5:48. doi: 10.3389/fgene.2014.00048

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 September 2014; accepted: 22 October 2014; published online: 07 November 2014.*

*Citation: Huang Y, Regazzi R and Cho WC (2014) Out of darkness: long non-coding RNAs come of age. Front. Genet. 5:388. doi: 10.3389/fgene.2014.00388*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Huang, Regazzi and Cho. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Functional implications of long non-coding RNAs in the pancreatic islets of Langerhans

#### *Jonathan L. S. Esguerra\* and Lena Eliasson\**

Islet Cell Exocytosis, Department of Clinical Sciences-Malmö, Lund University Diabetes Centre, Lund University, Malmö, Sweden

#### *Edited by:*

Romano Regazzi, University of Lausanne, Switzerland

#### *Reviewed by:*

Romano Regazzi, University of Lausanne, Switzerland; Lori Sussel, Columbia University, USA

#### *\*Correspondence:*

Jonathan L. S. Esguerra and Lena Eliasson, Islet Cell Exocytosis, Department of Clinical Sciences-Malmö, Lund University Diabetes Centre, Lund University, Jan Waldenströms Gata 35, Malmö 205 02, Sweden e-mail: jonathan.esguerra@med.lu.se; lena.eliasson@med.lu.se

Type-2 diabetes (T2D) is a complex disease characterized by insulin resistance in target tissues and impaired insulin release from pancreatic beta cells. As a tissue central to glucose homeostasis, the pancreatic islet continues to be an important focus of research to understand the pathophysiology of the disease. Increased access to human pancreatic islets has improved knowledge of islet function and, together with advances in RNA sequencing and related technologies, has revealed the transcriptional and epigenetic landscape of human islet cells. The discovery of thousands of long non-coding RNA (lncRNA) transcripts that are highly enriched in the pancreatic islet and/or specifically expressed in the beta-cells points to yet another layer of gene regulation, with many hitherto unknown mechanistic principles governing islet cell functions. Here we review fundamental islet physiology and propose functional implications of the lncRNAs in islet development and endocrine cell functions. We also take into account important differences between rodent and human islets in terms of morphology and function, and suggest how species-specific lncRNAs may partly influence gene regulation to define the unique phenotypic identity of an organism and the functions of its constituent cells. The implications of primate-specific lncRNAs will be far-reaching in all aspects of diabetes research, but most importantly in the identification and development of novel targets to improve pancreatic islet cell functions as a therapeutic approach to treat T2D.

**Keywords: pancreatic islets, beta-cells, insulin, glucagon, long non-coding RNA, type-2 diabetes, primate-specific**

### **INTRODUCTION**

The field of pancreatic islet research has benefited greatly from the use of animal models, particularly rodents, in the investigation of molecular processes governing islet development and function. While many features of rodent islets reflect those found in humans, the increasing availability of human pancreatic islets for basic cell physiological and histological research has made it evident that important differences exist in terms of islet architecture (Brissova et al., 2005; Cabrera et al., 2006; Fujita et al., 2011; Wang et al., 2013) and islet cell functions (Ashcroft and Rorsman, 2012; Rorsman and Braun, 2013; Gylfe and Gilon, 2014). Moreover, advances in RNA sequencing (RNA-seq) and complementary "transcriptome annotation" technologies have enabled the exploration of the transcriptional and epigenetic landscape of human islet cells at unprecedented resolution, revealing both short and long non-coding RNAs (lncRNAs) important for islet function (Eliasson and Esguerra, 2014). Here we will focus on lncRNA transcripts in pancreatic islets. The presence of lncRNAs highly enriched in the pancreatic islet and/or specifically expressed in the beta-cells provides a rich source of novel modes of gene regulation governing islet functions. Moreover, recent studies suggest the presence of a large proportion of primate-specific lncRNAs (Derrien et al., 2012). While evolutionary conservation of gene loci across broad phyla strongly points to a functional role of the corresponding gene products, which are hence more likely to contribute to the phenotype, we would like to argue that species-specific transcripts play a significant role in defining the unique phenotypic identity. Currently, however, experimental data that link primate-specific lncRNAs to the human islet phenotype are limited. It will be a daunting task to evaluate the connection between primate-specific lncRNAs and human islet-specific protein expression. In this review we try to answer the questions: (1) what are the potential roles of lncRNAs in islet development and endocrine cell function, and (2) can we attribute many of the observed islet morphological and functional differences between species to lncRNAs that are not evolutionarily conserved?

### **PANCREATIC ISLETS OF LANGERHANS AND TYPE-2 DIABETES**

The islets of Langerhans in the pancreas are central to carbohydrate homeostasis in higher metazoans. In vertebrates, the islets are composed primarily of alpha and beta cells, which secrete glucagon and insulin, respectively. Glucagon triggers glycogenolysis in the liver, where glycogen reserves are converted into glucose prior to release into the blood stream. Glucagon also mediates control of glucose production by triggering the phosphorylation of key enzymes that either inhibit glucose-requiring glycolysis or stimulate gluconeogenesis (Gromada et al., 2007). Hence, glucagon is secreted when blood glucose levels are low, such as during starvation, fasting or exercise. Patients with T2D have increased glucagon secretion that exacerbates the disease state, and it is suggested that dysfunctional glucagon secretion is due to impaired intrinsic glucose regulation (Rorsman et al., 2008) and/or loss of the characteristic phasic relationship between insulin and glucagon secretion (Gylfe and Gilon, 2014).

The effect of insulin is counter-regulatory to that of glucagon. Hyperglycemic conditions stimulate the beta cells to release insulin into the blood stream. This promotes glucose uptake into target tissues, e.g., fat and muscle, resulting in decreased blood glucose levels. Insulin resistance therefore refers to a pathophysiological condition in which tissues fail to respond to normal insulin levels, thereby triggering an adaptive compensatory response from the beta cells to release more insulin (Prentki and Nolan, 2006). While such compensatory beta cell adaptations may provide short-term relief, the long-term consequence for the beta cells is deleterious, ranging from impaired insulin secretion capacity to outright beta cell failure as diabetes progresses.

Evidence so far points to the combined effects of reduced beta-cell mass and impaired beta-cell function as primary drivers of T2D development (Meier and Bonadonna, 2013). Previous studies using immunohistochemical methods showed reduced beta cell mass in human T2D islets due to apoptosis (Butler et al., 2003), although lineage tracing of FoxO1-deficient beta cells in mice suggests that such beta cell mass reduction could also be due to dedifferentiation of the beta cells into alpha cells (Talchai et al., 2012). Our recent ultrastructural analyses of electron microscopic images from human islet preparations, however, indicate that beta cell mass is not significantly reduced in human T2D islets, implying that the pathogenesis lies primarily in impaired beta cell function, e.g., defective insulin production and/or secretion (Dayeh et al., 2014), which could be partly attributed to reduced expression of key beta cell-specific transcriptional regulators including MAFA, NKX6.1, and PDX1 (Guo et al., 2013). Indeed, many of the genes in the vicinity of single nucleotide polymorphisms (SNPs) linked to predisposition to T2D are involved in pancreatic beta cell functions (Groop and Lyssenko, 2009; Rosengren et al., 2012). Moreover, epigenetic changes in islets from T2D patients correlated with expression of genes involved in insulin secretion (Dayeh et al., 2014).

#### **COMPARATIVE ANATOMY OF HUMAN AND RODENT ISLETS**

A striking feature of isolated islets, regardless of the species from which they are derived, is their coherence into a compact cluster of cells. The earliest investigations therefore focused on elucidating the types of endocrine cells constituting the islet, and whether the different islet cell types exhibit spatial organization. Here we concentrate on the main cytoarchitectural features of rodent (rat and mouse) and human islets. A comprehensive treatise on islet comparative anatomy among different taxonomic groups, from ancient fish to primates, may be found elsewhere (Heller, 2010).

Compared with rodent islets, there is considerable heterogeneity in human islets at two levels: (i) islet cellular composition, proportion and organization, and (ii) distinction between small and large human islets, including differences in islet composition depending on their regional location in the pancreas (Brissova et al., 2005; Cabrera et al., 2006; Fujita et al., 2011; Wang et al., 2013). Mouse islets comprise up to 75% beta cells, located mainly in the core and surrounded by a mantle of ∼20% alpha cells and ∼5% delta (somatostatin) cells (Brissova et al., 2005). In contrast, human islets have a more scattered, random-like arrangement of the different islet cell types, with many beta cells also prominently located on the outer periphery (**Figure 1A**). On average, human islets contain ∼55% beta cells, ∼35% alpha cells, and ∼10% delta cells (Brissova et al., 2005). There are also other hormone-producing cells in the islets, such as the PP (pancreatic polypeptide) cells and ghrelin-producing epsilon cells, identified in humans, rodents, and several other mammals. Interestingly, only the adult human islets harbor a substantial number of ghrelin-producing epsilon cells (Wierup et al., 2014). During embryonic development in certain mammalian species, the transient presence of serotonin-producing enterochromaffin cells (Alumets et al., 1983) and gastrin-producing G-cells has also been demonstrated in the pancreas (Suissa et al., 2013). Thus there are at least seven hormone-secreting endocrine cell types identified in the islets of different species at some point of pancreatic development (Wierup et al., 2014). It is worth mentioning that, of all the model animals studied, only the non-human primates have been found to have an islet cell distribution and organization very similar to those of humans (Brissova et al., 2005). This could reflect the high genetic similarity within the primate clade, as shown in a comprehensive review on comparative genomics of human and more than a dozen non-human primates (Rogers and Gibbs, 2014).

The parasympathetic and sympathetic innervation patterns have also been shown to be very different in the human islets, with considerable implications for the autonomic control of hormone secretion. The human endocrine cells, as opposed to mouse endocrine cells, have fewer contacts with autonomic axons, which means less direct autonomic control of endocrine cell functions (Rodriguez-Diaz et al., 2011a). Instead, smooth muscle cells of human islet blood vessels are found to be innervated with sympathetic fibers, with the implication that hormone secretion may be modulated by local blood flow (Rodriguez-Diaz et al., 2011a). Moreover, the sparse cholinergic innervation within human islets appears to be compensated by the ability of the human alpha cells to secrete acetylcholine, which provides a paracrine signal to the beta cells in response to an impending increase in glucose concentration (Rodriguez-Diaz et al., 2011b).

While the many similarities between human and rodent islets have allowed the dissection of important shared biological processes in islet development and function, emerging findings on species-specific differences in pancreatic islet organization and composition, specifically between primates and other mammalian clades, highlight fundamental differences in how glucose homeostasis may be controlled. Indeed, the clinical and pathological features in non-human primate models of T2D closely mirror those of human T2D, making the translation of novel therapeutic agents from non-human primates to humans highly predictive (Hansen, 2012; Harwood et al., 2012). As a corollary, utmost caution must be exercised when extrapolating findings in rodent models to man when it comes to control of islet development, assessment of islet quality, insulin secretion capacity, and other techniques which may be confounded by the three-dimensional arrangement of the different endocrine cell types in the islets (Brissova et al., 2005).

#### **FUNCTIONAL DIFFERENCES BETWEEN HUMAN AND RODENT ISLETS**

An obvious consequence of the cytoarchitectural/morphological differences between human and rodent islets is a difference in the relative accessibility of the different endocrine cell types to nutrient and paracrine signals. The unique innervation and vascularization patterns, combined with the apparently random organization of the various endocrine cells in the human islets, allow for more contact of the cells with the environment and closer interactions between the different islet cell types, resulting in enhanced paracrine signaling (Cabrera et al., 2006). The effect of endogenous hormones secreted by the different islet cell types on one another is not trivial; ghrelin predominantly inhibits insulin secretion (Wierup et al., 2014), somatostatin inhibits both insulin and glucagon release (Strowski et al., 2000), and both glucagon and insulin influence each other's secretion (Gromada et al., 2007; Bansal and Wang, 2008). Indeed, the significance of the interplay between functional alpha- and beta-cell regulation in the pathogenesis of diabetes was highlighted in a recent review (Unger and Cherrington, 2012), suggesting a major role for dysfunctional glucagon release in disease development.

Studies of whole islet physiology reveal subtle but important differences between murine and human islets. Even between mouse and rat islets, differences in insulin secretion in response to certain metabolites have been demonstrated. For instance, the absence of malic enzyme in mouse beta cells renders the cells unresponsive to dimethylsuccinate, in contrast to rat beta cells (MacDonald, 2002). While nutrient-induced insulin secretion was found to be globally similar between rodent and human islets, with the presence of both triggering and amplifying pathways, the concentration-response curve is shifted to the left in humans, compatible with the observation that humans have generally lower plasma glucose levels than rodents (Henquin et al., 2006).

The exact molecular mechanisms underlying discrepancies between species in the response of islet cells to nutrient stimulation are not entirely known. However, differences in the major components of the stimulus-secretion coupling (**Figure 1B**) may provide important clues. In human beta cells, the main glucose transporters GLUT-1 and GLUT-3 have considerably higher affinity for the substrate, as exemplified by their lower *K*<sub>m</sub> values (6 and 1 mM, respectively), than the main glucose transporter in rodents, Glut2, with a *K*<sub>m</sub> of 11 mM (McCulloch et al., 2011; Rorsman and Braun, 2013). Another aspect of stimulus-secretion coupling is the electrically excitable nature of beta cells, which highlights the importance of ion channels in generating action potentials leading to insulin secretion. The *K*<sub>ATP</sub> channel controls the electrical activity in both mouse and human beta cells, but the resting membrane conductance at 1 mM glucose is ten times lower in human beta cells, contributing to initiation of insulin secretion at low glucose levels (5–6 mM; Rorsman and Braun, 2013). In addition, human beta cells are equipped with T-type Ca<sup>2+</sup> channels that are absent in mice. Together with current flow through voltage-dependent Na<sup>+</sup> channels, these channels contribute to the upstroke of the action potential in human beta cells (Barnett et al., 1995; Braun et al., 2008). The final increase in Ca<sup>2+</sup> that triggers exocytosis of insulin granules arises from influx through L-type and P/Q-type Ca<sup>2+</sup> channels in rodent and human beta cells, respectively (Barg et al., 2001; Braun et al., 2009). The revelation of several electrophysiological differences between rodent and human beta cells (Rorsman and Braun, 2013), primarily due to differential expression and/or transcriptional regulation of functional ion channels by as yet unknown principles [alternative splicing, microRNA (miRNA), lncRNAs], underscores the importance of the species-specific genetic basis of regulation of insulin secretion. More in-depth characterization of human islets will further reveal such species differences in other aspects of molecular control of endocrine hormone secretion. Recently, analyses of anaplerotic products showed a much lower dependence of human islets on the activity of the key anaplerotic enzyme, pyruvate carboxylase (MacDonald et al., 2011). Finally, in the search for exocytotic genes differentially expressed in human T2D islets, synaptotagmin isoforms previously deemed unimportant in mouse beta cell exocytosis correlated with insulin secretion in humans (Andersson et al., 2012).
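As a rough illustration of what the transporter affinities quoted in this section imply, the sketch below computes the fractional saturation v/Vmax = [S]/(Km + [S]) for the three transporters at a few glucose concentrations. It assumes simple Michaelis-Menten behaviour, which is a simplification for facilitated glucose transport; the Km values are the ones given in the text, while the function names, dictionary keys, and chosen glucose concentrations are illustrative only.

```python
# Fractional saturation of a transporter under simple Michaelis-Menten kinetics:
#   v / Vmax = [S] / (Km + [S])
# Km values (mM) are those quoted in the text. Treating glucose transport as
# plain Michaelis-Menten is an illustrative assumption, not a claim of the review.

KM_MM = {
    "GLUT-1 (human)": 6.0,
    "GLUT-3 (human)": 1.0,
    "Glut2 (rodent)": 11.0,
}


def fractional_saturation(glucose_mm: float, km_mm: float) -> float:
    """Return v/Vmax for a glucose concentration and a Km, both in mM."""
    if glucose_mm < 0 or km_mm <= 0:
        raise ValueError("glucose must be >= 0 and Km must be > 0")
    return glucose_mm / (km_mm + glucose_mm)


def saturation_table(glucose_mm: float) -> dict:
    """Map each transporter name to its fractional saturation at glucose_mm."""
    return {
        name: fractional_saturation(glucose_mm, km)
        for name, km in KM_MM.items()
    }


if __name__ == "__main__":
    # Hypothetical glucose concentrations (mM) chosen only for illustration.
    for glucose in (1.0, 5.0, 11.0):
        row = ", ".join(
            f"{name}: {value:.2f}"
            for name, value in saturation_table(glucose).items()
        )
        print(f"{glucose:>4.1f} mM glucose -> {row}")
```

Under these assumptions, at 5 mM glucose GLUT-3 would be about 83% saturated (5/6), GLUT-1 about 45% (5/11), and Glut2 about 31% (5/16), which is consistent with the higher substrate affinity of the human transporters described in the text.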

To understand the origin of morphological and functional differences between rodent and human pancreatic islets, one has to consider the developmental aspect of endocrine cell differentiation (**Figure 1A**). Conventional wisdom dictates that, regardless of species, the path from endoderm progenitors to differentiated hormone-secreting cells of the pancreatic islets is a highly orchestrated process mediated by the precise spatio-temporal interplay of various transcription factors. However, significant differences exist in the transcriptional repertoire of endocrine cell differentiation between human and mouse. A survey of known regulators of mouse endocrine pancreatic cell fate in purified human beta and alpha cells reveals that *MAFB*, which is not expressed in adult mouse beta cells, is present at comparable levels in both human alpha and beta cells (Dorrell et al., 2011). In the same study, *IRX2*, previously shown to be expressed only in the developing mouse pancreas (Petri et al., 2006), was found to persist in adult human alpha cells. Recently, the pancreas-enriched miR-7 was also found to negatively regulate Pax6, which has a central role in endocrine cell differentiation and maintenance of identity (Kredo-Russo et al., 2012). Although miR-7 is a broadly conserved miRNA, it is possible that it may also target other non-conserved mRNAs, which may impart species-specific fine-tuning of regulatory circuits in the context of islet development. Indeed, both evolutionarily conserved and non-conserved targets for individual miRNAs have been predicted and demonstrated (Betel et al., 2010). The involvement of non-coding RNAs in pancreatic islet development adds another level of regulation of cell fate trajectories and distinct cell-type specific functions.

#### **LONG NON-CODING RNAs**

Long non-coding RNAs are transcripts without protein-coding potential, arbitrarily defined in size by a cut-off length of >200 nucleotides (HUGO Gene Nomenclature Committee; Seal et al., 2011). Most lncRNAs are transcribed by RNA polymerase II and share many properties of mRNAs, such as splicing, capping and polyadenylation (Derrien et al., 2012). Similar to protein-coding genes, the expression of lncRNAs is tightly regulated and displays spatio-temporal patterns, i.e., cell-type specific and/or developmental stage-specific expression (Dinger et al., 2008; Mercer et al., 2008; Cabili et al., 2011).

Integrative analysis of RNA-seq data with other complementary high-throughput "transcript annotation" technologies, e.g., transcription initiation mapping by cap-analysis of gene expression (CAGE; Kodzius et al., 2006) and identification of sites of 5′ and 3′ transcript termini (Ng et al., 2005), reveals that lncRNAs may generally be categorized with respect to their genomic position either as "intergenic" (between protein-coding genes), or "genic" (Derrien et al., 2012). Intergenic lncRNAs or "lincRNAs" (long intergenic non-coding RNAs) are encoded as distinct transcriptional units within genomic regions which used to be called "gene deserts." The "genic" lncRNAs may be exonic, intronic, or overlapping, and can be further classified as either in the sense or antisense strand relative to the protein-coding gene (Derrien et al., 2012). An in-depth investigation on expression dynamics of lncRNAs during differentiation of human neuroblastoma cells suggests 19 different genomic architecture classes of lncRNAs based on both their relative positions with protein-coding genes, and on the orientations of their transcription (Batagov et al., 2013).
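The positional categories named above can be made concrete with a small sketch. The code below is a deliberately simplified illustration, not the procedure used by GENCODE or by any of the cited studies: the `Transcript` class, the `classify` function, the decision rules, and the example coordinates are all hypothetical, and only the first overlapping protein-coding gene is considered.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) genomic coordinates


@dataclass
class Transcript:
    chrom: str
    strand: str            # "+" or "-"
    exons: List[Interval]  # exon intervals on chrom

    @property
    def span(self) -> Interval:
        """Full extent of the transcript, from first exon start to last exon end."""
        return (min(s for s, _ in self.exons), max(e for _, e in self.exons))


def overlaps(a: Interval, b: Interval) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def contains(outer: Interval, inner: Interval) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def classify(lnc: Transcript, genes: List[Transcript]) -> str:
    """Assign a coarse positional class relative to protein-coding genes.

    Assumed rules (a simplification for illustration):
      - no span overlap with any gene       -> "intergenic"
      - any lncRNA exon overlaps a gene exon -> "exonic"
      - lncRNA lies wholly inside the gene   -> "intronic"
      - any other span overlap               -> "overlapping"
    Genic classes are suffixed with sense/antisense by strand comparison.
    """
    for gene in genes:
        if gene.chrom != lnc.chrom or not overlaps(lnc.span, gene.span):
            continue
        orientation = "sense" if gene.strand == lnc.strand else "antisense"
        if any(overlaps(le, ge) for le in lnc.exons for ge in gene.exons):
            position = "exonic"
        elif contains(gene.span, lnc.span):
            position = "intronic"
        else:
            position = "overlapping"
        return f"genic/{position}/{orientation}"
    return "intergenic"


if __name__ == "__main__":
    # Hypothetical protein-coding gene with two exons.
    gene = Transcript("chr1", "+", [(100, 200), (400, 500)])
    examples = {
        "within intron, opposite strand": Transcript("chr1", "-", [(250, 300)]),
        "shares exon sequence": Transcript("chr1", "+", [(150, 180)]),
        "gene inside lncRNA intron": Transcript("chr1", "+", [(50, 80), (600, 700)]),
        "far from any gene": Transcript("chr1", "+", [(1000, 1200)]),
    }
    for label, lnc in examples.items():
        print(f"{label}: {classify(lnc, [gene])}")
```

Finer schemes, such as the 19 architecture classes mentioned above, would need additional rules (for example distance to the nearest gene and divergent versus convergent transcription) that this sketch does not attempt.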

The GENCODE (encyclopædia of genes and gene variants) project lists 13870 lncRNA genes in the human genome (Version 19, July 2013 freeze, GRCh37 – Ensembl 74) and 4074 lncRNA genes in the mouse genome (Version M2, July 2013 freeze, GRCm38 – Ensembl 74; http://www.gencodegenes.org/; Dunham et al., 2012). There are other independent efforts in annotating lncRNAs in the human genome, albeit with surprisingly low overlap with the GENCODE annotations. For example, only 39% of the 4662 human lincRNA loci cataloged in the Cabili et al. (2011) study intersected with those of GENCODE's human lncRNAs (Derrien et al., 2012). Thus, while there is an indisputable consensus about the widespread transcription of lncRNA genes in human and other mammalian cells, the field is still mostly in the exploratory stage, and both high-throughput biochemical data generation and *in silico* analyses warrant further development to aid in standardization of analytical procedures.

#### **ROLES OF lncRNA**

A wide variety of functions have been attributed to lncRNAs including roles in transcriptional regulation (Penny et al., 1996; Orom et al., 2010; Santoro et al., 2013), as architectural determinants of subcellular structures (Clemson et al., 2009; Batista and Chang, 2013), in epigenetic inheritance/chromatin dynamics (Khalil et al., 2009; Tsai et al., 2010), and most recently in higher-order chromosomal organization (Hacisuleyman et al., 2014). Here we will only briefly mention the general characteristics of lncRNA mechanisms of action. A more detailed description of some well-characterized lncRNAs is provided elsewhere in recent reviews (Kornfeld and Bruning, 2014; Yang et al., 2014a).

As entities of transcriptional control, it is emerging that lncRNAs may perform their functions in at least two ways: (i) as scaffolds in ribonucleoprotein complexes, e.g., with transcription or chromatin-modifying factors, acting in *cis* or in *trans* on the genome (Yang et al., 2014a), and (ii) as incidental by-products of a negative type of transcriptional regulation termed "transcriptional interference" (Kornienko et al., 2013). Prior knowledge about the mechanism of action of a few well-characterized lncRNAs such as HOTAIR (Rinn et al., 2007), and the observation that as many as 24% of human lincRNAs bind the polycomb repressive complex 2 (PRC2), responsible for transcriptional repression of specific genes by methylation of H3K27 (Khalil et al., 2009), strengthened the hypothesis that lncRNAs provide the specificity in guiding chromatin-modifying complexes into exact regions in the genome. Whether the guiding mechanism is based on sequence complementarity between the lncRNA and genomic DNA, or on another motif-recognition process, remains to be seen (Guttman and Rinn, 2012). In a separate attempt to functionally categorize lncRNAs *en masse*, it was shown that many lncRNAs have enhancer-like properties, activating and/or potentiating the expression of neighboring protein-coding genes (Orom et al., 2010).

Recently, it was shown that a lncRNA called "functional intergenic repeating RNA element" (*Firre*), facilitates transchromosomal interactions, in which genes involved in energy metabolism and adipogenesis are brought in close proximity, presumably to allow efficient co-regulation of genes in the same biological pathway (Hacisuleyman et al., 2014).

In summary, lncRNAs characterized thus far display distinct functions and modes of regulation, again reminiscent of protein-coding genes belonging to broad functional ontologies. Nevertheless, as exemplified by PRC2-binding lncRNAs, or the enhancer-like lncRNAs, consensus mechanisms of action are emerging for many lncRNAs which may permit their classification into specific functional categories.

#### **EVOLUTIONARY CONSERVATION OF lncRNAs**

Previous experience from large-scale, computational prediction of protein-coding genes as a crucial first step in de novo genome annotation showed that DNA sequence conservation across broad phyla is a good indicator of genuine coding potential, i.e., that the gene will produce an mRNA transcript and will eventually be translated, and hence has biologically meaningful functions. Although lncRNAs have been experimentally discovered from transcription data, it is still deemed necessary to use the criterion of evolutionary conservation to help distinguish functional RNA transcripts from transcriptional and experimental noise. However, sequence conservation alone is inherently problematic for non-coding RNA genes whose functional gene products act at the level of secondary or tertiary structural RNA features. Formation of secondary RNA structures is not evolutionarily constrained to maintain nucleotide sequences in the same way that protein-coding genes are constrained to maintain specific codon sequences to ensure functional proteins. For instance, RNA hairpin loops, which are a ubiquitous secondary structural feature of virtually all functional non-coding RNA molecules, may be formed irrespective of the nucleotide sequence, as long as energetically favorable base-pairings in the hairpin stem are maintained. It is perhaps not surprising that when standard sequence alignment procedures are used to assess the conservation of lncRNAs in various species, consistently modest sequence similarity is found. Indeed, compared to protein-coding sequences aligned between different species, much lower sequence identities are found for each of the 993 syntenically paired orthologous lincRNAs in mammals and other vertebrates (Cabili et al., 2011). However, the promoter regions of the lncRNAs are shown to be more conserved than the exonic regions, which implies similar regulation and potentially analogous roles of orthologous lncRNAs in different species (Guttman et al., 2009; Derrien et al., 2012).
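The point that a hairpin stem can be preserved while the primary sequence diverges can be made concrete with a small Python sketch. The two hairpin sequences below are invented for illustration and are not taken from any real lncRNA. The check only tests whether the two stem arms can pair (Watson-Crick or G-U wobble) and does not model folding energetics.

```python
# Allowed RNA base pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_is_paired(five_prime_arm: str, three_prime_arm: str) -> bool:
    """True if every base of the 5' arm can pair with the antiparallel 3' arm."""
    if len(five_prime_arm) != len(three_prime_arm):
        return False
    return all(
        (a, b) in PAIRS
        for a, b in zip(five_prime_arm, reversed(three_prime_arm))
    )


def identity(seq_a: str, seq_b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(seq_a, seq_b)) / len(seq_a)


# Two hypothetical hairpins: (5' arm, loop, 3' arm).
hairpin_a = ("GGCAC", "GAAA", "GUGCC")
hairpin_b = ("AUUGU", "GAAA", "ACAAU")

if __name__ == "__main__":
    for name, (five, loop, three) in (("A", hairpin_a), ("B", hairpin_b)):
        print(name, five + loop + three, "stem paired:", stem_is_paired(five, three))
    stem_a = hairpin_a[0] + hairpin_a[2]
    stem_b = hairpin_b[0] + hairpin_b[2]
    print("stem sequence identity:", identity(stem_a, stem_b))
```

Both invented stems are fully paired even though they share no identical stem position, which is why alignment-based identity alone can understate structural conservation.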

Remarkably, despite employing meticulous approaches in aligning genomes, one study finds only 12% (993 lincRNAs) of human lincRNAs with orthologous sequences in another vertebrate species (Cabili et al., 2011), while GENCODE v.7 reports 30% of all annotated lncRNAs, ∼4500 lncRNAs, to be clearly primate-specific (Derrien et al., 2012). The manually curated lncRNA database (www.lncrnadb.org) lists a number of primate-specific lncRNAs (Amaral et al., 2011), including the 482-nucleotide long HULC RNA found to be highly upregulated in human hepatocellular carcinoma (Panzitt et al., 2007). One recently investigated human-specific non-coding RNA, miR-941, is expressed in the brain and regulates genes involved in neurotransmitter signaling (Hu et al., 2012). Thus, conservation of gene locus, let alone gene sequence across broad phyla is not a strict requirement for biological function. After all, it is the evolution of genetic differences that ultimately drives speciation. The non-conserved lncRNAs must be bona fide elements of genetic programs which specify phenotypic/morphological differences between organisms.

# **EXPRESSION OF lncRNAs IN HUMAN PANCREATIC ISLET CELLS**

By integrating transcriptomics data and chromatin maps, 1128 lncRNAs were reliably identified in purified human pancreatic islets (Moran et al., 2012). Many of the lncRNAs were specifically expressed in the islets and beta cells, suggesting important roles in the developmental programming, proper functioning and/or maintenance of the pancreatic endocrine tissue. Indeed, the expression levels of a dozen lncRNAs were found to fluctuate during stage-specific embryonic stem cell differentiation relative to the final expression in *in vivo* functional endocrine cells, and at least two lncRNAs, HI-LNC78 and HI-LNC80, exhibited dynamic upregulation when the islets were exposed to high glucose concentrations (Moran et al., 2012). In a separate deep RNA-seq study of purified human beta cells, 148 lincRNAs were found to be overexpressed in beta cells compared to non-beta cells (Nica et al., 2013), while another study discovered 12 beta cell-specific and 5 alpha cell-specific lncRNAs (Bramswig et al., 2013). Taken together, these findings suggest the importance of cell-type- and/or condition-specific expression of lncRNAs in the human pancreatic islet.

All the aforementioned studies on lncRNAs in the human pancreatic islets are exploratory in nature, and no particular mechanism of action has so far been attributed to any of the identified lncRNAs in the islet cells. Elucidating the molecular functions of islet cell-specific or -enriched lncRNAs will be challenging because of the generally low conservation of lncRNAs in commonly used rodent and *in vitro* models. Indeed, RNA-seq of the mouse pancreatic beta cell transcriptome corroborates previous findings about the very weak conservation of lincRNAs in humans (Ku et al., 2012), although it was shown that for a considerable number of lincRNAs, short conserved stretches of sequences may be enough to guarantee conserved function in vertebrate embryonic development (Ulitsky et al., 2011). In view of species-specific expression of many lncRNAs, the recently developed human beta cell lines EndoC-βH1 (Ravassard et al., 2011) and EndoC-βH2 (Scharfmann et al., 2014) will be valuable in dissecting the molecular functions of lncRNAs in the beta cell.

# **POTENTIAL ROLES OF lncRNAs IN THE DEVELOPMENT, FUNCTION AND MAINTENANCE OF PANCREATIC ISLET CELLS: ACQUISITION OF SPECIES-SPECIFIC ISLET PHENOTYPE/MORPHOLOGY AND CELL-TYPE SPECIFIC FUNCTIONS**

In light of scarce experimental data on recently discovered lncRNAs in islet cells, we can only infer their potential molecular roles from findings in other cell types. The various mechanisms by which lncRNAs were shown to exert their functions will undoubtedly also influence pancreatic islet cells in terms of cellular differentiation and development, specifically in maintaining cellular identity and plasticity. They may also be important components of the islet cell stress response, such as in activating beta cell compensatory mechanisms in countering environmental stressors in T2D. The role of lncRNAs in the acquisition of species-specific islet phenotype/morphology, and maintenance of cellular phenotype of pancreatic islet cells, may be operational at two levels: (i) between islet cell types of different species, and (ii) among the islet cells of the same species.

Almost a third of human lncRNAs (∼330 lncRNAs) discovered in the human islets lack orthologous sequences in mice (Moran et al., 2012). Given the tendency of many lncRNAs to associate with components of chromatin-modifying complexes with roles in embryonic stem cell fates (Dinger et al., 2008; Yang et al., 2014b), the involvement of primate-specific lncRNA in the developmental program of pancreatic islets is possible, and may (i) have direct contribution to the origin of cytoarchitectural differences between islets in the different species (**Figure 1A**), and (ii) contribute to the differential expression of essential proteins in islet cell secretion (**Figure 1B**). In the same line of reasoning, a number of islet cell-type specific lncRNAs recently reported (12 beta cell-specific and 5 alpha cell-specific; Bramswig et al., 2013) could play essential roles in conferring cell-type specific functions in the pancreatic islets.

The islet cells are constantly subjected to fluctuating nutrient stimuli and are challenged to respond accordingly to maintain glucose homeostasis. The coordinated transcription of many genes required to overcome this challenge relies on transcription factors activating specific sets of genes. Indeed, rat islets subjected to different glucose concentrations showed distinct clusters of mRNA profiles, suggesting a highly coordinated response to varying nutrient stimuli (Bensellam et al., 2009). It will be interesting to determine whether *Firre*-like lncRNAs involved in trans-chromosomal interactions (Hacisuleyman et al., 2014) are present in endocrine cells to facilitate regulation of cell-type specific pathways by acting as scaffolds guiding transcription factors to target genes. Many of the identified human islet lncRNAs lie adjacent to islet-specific chromatin domains and protein-coding genes (Moran et al., 2012), a striking example being *HI-LNC25*, whose closest neighbor is *MAFB*, a regulator of islet-cell maturation (Artner et al., 2007) that, as discussed earlier, is absent in adult mouse beta cells but present in both human alpha and beta cells (Dorrell et al., 2011).

Enhancer elements are also key determinants of islet-specific gene activity (Pasquali et al., 2014). Notably, Cabili et al. (2011) report that 27% (∼1200) of human lincRNAs overlap with known enhancer regions in the genome. It will be interesting to examine the presence of lncRNAs in human pancreatic islet enhancer clusters reported to be enriched in T2D risk-associated variants (Pasquali et al., 2014).

Extensive mapping of the epigenome of whole human pancreatic islets (Bhandare et al., 2010; Gaulton et al., 2010; Stitzel et al., 2010; Dayeh et al., 2014), and of purified islet cells (Bramswig et al., 2013; Dayeh et al., 2014), reveals cell-type specific epigenetic landscapes delineating sites of active/inactive gene transcription. The majority of lncRNAs shown to associate with various chromatin-modifying factors are tantalizing candidate factors which could provide additional molecular specificity in targeting epigenetic markers in the pancreatic islet cells. It is tempting to hypothesize that primate-specific lncRNAs could be involved in specifying the expression of certain components of the stimulus-secretion coupling in the beta cells, which in turn potentially contributes to the species-specific response of beta cells to nutrient stimuli (**Figure 1B**).

# **CONCLUSION**

The pervasive nature of eukaryotic gene transcription revealed by next-generation sequencing and associated technologies for mapping transcriptional activity brought into the limelight a plethora of non-coding RNA classes of hitherto unknown functions. The existence of long functional RNA molecules provides tantalizing hypotheses on how chromatin-modifying and transcription factors may act upon their genomic targets in a highly locus-specific recognition process. This provides an important mechanistic insight into how specificity is achieved by various factors in pancreatic islet cells to coordinate the regulation of multiple genes responsible for cell-type specific phenotypes and functions.

The potential roles of lncRNAs in pancreatic islet development, specifically in endocrine cell-fate determination and subsequent maintenance of cellular identities and functions, will broaden our understanding of pancreatic islets, and hence open up new possibilities in identifying novel therapeutic targets in treating type-2 diabetes (T2D). However, the implication of numerous primate-specific lncRNAs will be a particular challenge in a field which heavily depends on rodent models when trying to elucidate the molecular basis of metabolic pathophysiologies. In particular, considering the complex ethical issues involved, the *in vivo* roles of primate-specific lncRNAs will pose nagging questions for many years to come.

#### **ACKNOWLEDGMENTS**

We thank our colleagues at LUDC (Lund University Diabetes Centre) and Exodiab (Excellence of Diabetes Research in Sweden) for the collaborative research atmosphere. Lena Eliasson is a senior researcher at the Swedish Research Council and Jonathan L. S. Esguerra is supported by an FP7 EU-grant to BetaBat and EFSD/Lilly Research fellowship. We are grateful for support to our research in this area from the Swedish Research Council, Diabetesfonden, Diabetic wellness foundation, Albert Påhlsson foundation, the Crafoord foundation, the Knut and Alice Wallenberg Foundation, and The Bo & Kerstin Hjelt Diabetes Foundation.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2014; accepted: 19 June 2014; published online: 07 July 2014. Citation: Esguerra JLS and Eliasson L (2014) Functional implications of long noncoding RNAs in the pancreatic islets of Langerhans. Front. Genet. 5:209. doi: 10.3389/fgene.2014.00209*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Esguerra and Eliasson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The Missing lnc(RNA) between the pancreatic **β**-cell and diabetes

# *Vasumathi Kameswaran and Klaus H. Kaestner\**

Department of Genetics and Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA

#### *Edited by:*

Romano Regazzi, University of Lausanne, Switzerland

#### *Reviewed by:*

Thomas Mandrup-Poulsen, University of Copenhagen, Denmark Anna Motterle, University of Lausanne, Switzerland

#### *\*Correspondence:*

Klaus H. Kaestner, Department of Genetics and Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Boulevard, Philadelphia, PA 19104, USA e-mail: kaestner@mail.med.upenn.edu

Diabetes mellitus represents a group of complex metabolic diseases that result in impaired glucose homeostasis, which includes destruction of β-cells or the failure of these insulin-secreting cells to compensate for increased metabolic demand. Despite a strong interest in characterizing the transcriptome of the different human islet cell types to understand the molecular basis of diabetes, very little attention has been paid to the role of long non-coding RNAs (lncRNAs) and their contribution to this disease. Here we summarize the growing evidence for the potential role of these lncRNAs in β-cell function and dysregulation in diabetes, with a focus on imprinted genomic loci.

**Keywords: lncRNA, β-cell biology, diabetes mellitus, imprinting control region (ICR),** *MEG3*

# **INTRODUCTION**

Recent technological advances in the field of genome sequencing have paved the way for a new appreciation of non-coding RNAs in gene regulation. Ultra high-throughput transcriptome analyses have revealed that the vast majority of the genome is transcribed, with two-thirds of the human genome covered by processed transcripts, of which only a small fraction (<2%) is translated into proteins (Djebali et al., 2012). The identification of several common genomic and functional features of these untranslated RNAs has led to their categorization into various classes of non-coding RNAs. One such class that has been the focus of extensive research is that of long non-coding RNAs (lncRNAs).

LncRNAs are defined as transcripts longer than 200 bp that lack protein-coding potential (Guttman et al., 2009; Derrien et al., 2012; Batista and Chang, 2013; Fatica and Bozzoni, 2014). Like messenger RNAs, lncRNAs typically have multiple exons, are processed using canonical splice sites, and may exist as several isoforms (Ponjavic et al., 2007; Cabili et al., 2011; Derrien et al., 2012). In contrast to mRNAs, lncRNAs preferentially display nuclear localization, consistent with their proposed function in chromatin organization and regulation of gene expression (Khalil et al., 2009; Zhao et al., 2010; Derrien et al., 2012; Guttman and Rinn, 2012; Rinn and Chang, 2012; Fatica and Bozzoni, 2014).

Similar to protein-coding genes, lncRNA-encoding genes are marked by chromatin signatures typical of active transcription in the cell types where they are expressed, consisting of H3K4me3 (trimethylated lysine 4 in histone H3) at the promoter, followed by H3K36me3 along the transcribed regions (so-called "K4–K36 domains"; Guttman et al., 2009; Khalil et al., 2009; Cabili et al., 2011; Guttman and Rinn, 2012; Rinn and Chang, 2012). While lncRNA exons display weaker evolutionary conservation than those of protein-coding genes, there is evidence of positive selection for a subset of lncRNAs, which may be driven by constraints to maintain secondary structure required for functional interactions with their targets (Ponjavic et al., 2007; Guttman et al., 2009; Cabili et al., 2011; Ulitsky et al., 2011; Derrien et al., 2012). In contrast, the promoters of lncRNAs are as highly conserved as those of protein-coding genes (Carninci et al., 2005; Ponjavic et al., 2007; Guttman et al., 2009; Derrien et al., 2012; Batista and Chang, 2013). Despite their overall lower expression levels, lncRNAs exhibit a higher degree of tissue specificity compared to average protein-coding genes (Mercer et al., 2008; Cabili et al., 2011; Derrien et al., 2012; Batista and Chang, 2013; Fatica and Bozzoni, 2014).

Through numerous studies, several general principles of lncRNA function have emerged. LncRNAs have been shown to function both in *cis*, i.e., locally close to the site of their production, and in *trans*, i.e., at sites on other chromosomes. LncRNAs have been proposed to act as scaffolds for chromatin modifiers, blockers of transcription, antisense RNAs, microRNA sponges, protein decoys, and enhancers (Cech and Steitz, 2014; Fatica and Bozzoni, 2014). In fact, the act of transcription of a lncRNA itself can interfere with the regulatory function of a regulatory DNA sequence, as exemplified in yeast (Martens et al., 2004) and in mammalian imprinting (Latos et al., 2012). As a result of their diverse functions in multiple tissues, mis-regulation of lncRNAs can lead to failure of normal development and, consequently, to disease. Mammalian chromatin modifiers such as the repressive *polycomb* complexes often lack their own specific DNA-binding domains but instead contain RNA-binding elements. LncRNAs can play critical roles in directing these repressive chromatin modifying complexes to their target regions (Bernstein and Allis, 2005; Rinn et al., 2007; Zhao et al., 2010). One such example is the Foxf1-adjacent, noncoding developmental regulatory RNA (*Fendrr*), a lncRNA that interacts with the polycomb repressive complex 2 (PRC2) and is critical for heart development and function (Grote et al., 2013). Similarly, the well-characterized *HOTAIR* lncRNA, which is transcribed from the *HOXC* locus, is highly upregulated in primary breast tumors and was shown to function through the silencing of tumor suppressor genes in a PRC2-dependent manner [Gupta et al., 2010; see Maass et al. (2014) for a list of lncRNAs currently implicated in human diseases]. Taken together, these features suggest that lncRNAs and other non-coding RNA species may play an essential role in defining organismal complexity (Mattick and Makunin, 2006; Taft et al., 2007).

These findings raise the possibility that lncRNAs and other non-coding RNAs may be exciting molecular candidates to account for the unresolved genetic risk in complex diseases such as diabetes (Medici et al., 1999; Hyttinen et al., 2003). Diabetes mellitus represents a group of metabolic diseases that result in impaired glucose homeostasis. In the case of type 1 diabetes (T1D), metabolic impairment is the result of autoimmune destruction of insulin-producing pancreatic β-cells. In type 2 diabetes (T2D), the most prevalent form of the disease, the defect in glucose metabolism is the result of decreased sensitivity of peripheral tissues to insulin action, accompanied by failure of β-cells to compensate for the increased metabolic demand (Zimmet et al., 2001). Together, these diseases affect over 25 million Americans and account for \$176 billion in healthcare costs per year in the US alone (American Diabetes Association, 2013), necessitating the pursuit of more effective and personalized treatments.

Significant efforts have been made to attain a better understanding of the causes of diabetes at the molecular level. Linkage analysis of affected families led to the successful identification of causal gene mutations in several rare, Mendelian forms of the disease (Fajans et al., 2001; O'Rahilly, 2009). However, large-scale efforts to identify DNA variants associated with more common forms of diabetes through genome-wide association studies (GWAS) have predominantly identified candidate variants that lie in non-coding regions and have as yet unknown functions (McCarthy, 2010). Thus, to improve our current understanding of the molecular basis of diabetes mellitus and to develop better treatment strategies, we need to carefully characterize the transcriptome of pancreatic β-cells, with a focus on elucidating the functions of non-coding transcripts. In this review, we present a summary of recent evidence for a role of lncRNAs in the regulation of β-cell function and their potential contribution to the pathogenesis of diabetes.

#### **β-CELL lncRNAs**

The most comprehensive catalog of human lncRNAs expressed in β-cells published thus far is that by Morán et al. (2012). In this study, the authors profiled whole islet and FACS-sorted β-cells and identified 1,128 distinct transcripts that displayed many of the typical properties of lncRNAs described above, including the "K4–K36" histone modification domains, lack of protein-coding potential, and non-uniform expression levels among tissues. Most notably, the lncRNAs identified were roughly five times more islet-specific compared to general protein-coding genes, and the vast majority had orthologous genes in the mouse genome. Ku et al. (2012) similarly characterized mouse islet- and β-cell-specific transcripts and identified 1,359 high-confidence lncRNAs with several of the aforementioned properties. Using high-throughput transcriptome analysis of sorted human islets, lncRNAs expressed in α-cells have also been identified (Bramswig et al., 2013).

Of particular interest was the fact that lncRNAs were often found in proximity to critical islet-specific transcription factors (Ku et al., 2012; Morán et al., 2012). Thus, protein-coding genes adjacent to islet-enriched lncRNAs were also more likely to be islet-specific than the average protein-coding gene (Morán et al., 2012). This correlation has led to the suggestion that lncRNAs and nearby protein-coding genes share common regulatory elements. Indeed, lncRNAs were often found in large regions of open chromatin that were uniquely associated with protein-coding genes expressed highly in islets (Gaulton et al., 2010).

The temporal expression of islet lncRNAs has also been studied by Morán et al. (2012) in human embryonic pancreases as well as in a stepwise *in vitro* β-cell differentiation model using human embryonic stem (ES) cells (developed by Kroon et al., 2008). Unlike some lncRNAs that are known to be critical to early stages of embryonic development (Guttman et al., 2011; Grote et al., 2013), the expression of a majority of islet lncRNAs identified in this study (Morán et al., 2012) is restricted to differentiated, mature endocrine cells. The orthologous mouse lncRNAs (e.g., *Mi-Lnc80*) exhibit similar cell- and stage-specific expression.

The characteristics of these islet lncRNAs imply a role for these RNAs in mature β-cell function. To test this hypothesis, Morán et al. (2012) used short hairpin RNAs (shRNAs) to suppress the activity of one such lncRNA transcript in the human EndoC-βH1 β-cell line (Ravassard et al., 2011). From a panel of known islet-specific transcripts, the authors identified *GLIS3* as a downstream target of *HI-LNC25*, a lncRNA that shares a regulatory domain with *MAFB*. Variants at the *GLIS3* locus are associated with different risks for T1D (Barrett et al., 2009), elevated fasting glucose levels (Dupuis et al., 2010), as well as T2D (Cho et al., 2012). Loss-of-function studies suggest that *GLIS3* encodes a transcription factor critical for regulating the expression of insulin and several key islet-transcription factors, and may confer risk for both T1D and T2D by resulting in diminished β-cell numbers and by promoting the formation of a pro-apoptotic splice variant of the protein *Bim* (Kang et al., 2009; Nogueira et al., 2013; ZeRuth et al., 2013). However, the shRNA-mediated decrease in *GLIS3* mRNA levels had no impact on glucose-stimulated insulin secretion or insulin transcript levels in the transduced EndoC-βH1 β-cell line, possibly because this cell line does not recapitulate all aspects of β-cell function *in vivo*. Additionally, only a minor fraction of β-cell expressed lncRNAs was responsive to elevated glucose levels in human islets.

As previously noted, several risk variants for common forms of diabetes identified by GWAS do not change the protein-coding potential of known genes, suggesting that they might affect as yet unidentified regulatory elements (McCarthy, 2010). Using a computational tool known as MAGENTA to search for enrichment of genetic associations in a predefined set of genes (Segrè et al., 2010), Morán et al. (2012) determined that the islet lncRNA genes identified in their study were in fact highly enriched for risk alleles associated with T2D and related phenotypes, further underscoring the need to interrogate the function of these RNAs in β-cell biology.
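The general idea of testing a predefined gene set for enrichment of disease-associated genes can be sketched with a simple over-representation test. This is explicitly not the MAGENTA procedure of Segrè et al. (2010), which is not described here. It is a generic hypergeometric test, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes: int, total_hits: int,
                           set_size: int, set_hits: int) -> float:
    """One-sided P(X >= set_hits) under the hypergeometric distribution.

    total_genes: number of genes in the background
    total_hits:  background genes flagged as disease-associated
    set_size:    number of genes in the set of interest
    set_hits:    flagged genes observed within that set
    """
    upper = min(set_size, total_hits)
    favorable = sum(
        comb(total_hits, k) * comb(total_genes - total_hits, set_size - k)
        for k in range(set_hits, upper + 1)
    )
    return favorable / comb(total_genes, set_size)


if __name__ == "__main__":
    # Made-up numbers: 20,000 background genes, 400 flagged by GWAS,
    # and a set of 1,000 genes containing 40 flagged genes.
    p = hypergeom_enrichment_p(20_000, 400, 1_000, 40)
    print(f"enrichment p-value: {p:.3g}")
```

A small p-value would indicate that the set contains more flagged genes than expected if it had been drawn at random from the background. A real analysis would also need to account for gene length, linkage disequilibrium, and multiple testing, which this sketch ignores.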

Overall, these studies highlighted lncRNAs as a major component of the β-cell transcriptome that is cell-type-specific, developmentally regulated, and evolutionarily conserved with strong associations to disease risk. However, it still remains to be determined how these lncRNAs may contribute to β-cell function, and if their mis-regulation may play a role in diabetes. Their expression in EndoC-βH1 cells and mouse islets provides additional platforms to evaluate their function in a systematic and comprehensive manner. Future studies will also need to address the question of whether the lncRNAs identified thus far act in *cis* (on neighboring islet protein-coding genes) or in *trans* to exert their function.

### **IMPRINTING**

Some of the best characterized lncRNAs to date were first uncovered in early studies of imprinting and dosage compensation of the X-chromosome (Brannan et al., 1990; Brown et al., 1991; Fatica and Bozzoni, 2014). Imprinting refers to the biased expression of genes depending on the parental origin of the chromosome. This process is tightly regulated, typically through epigenetic modifications such as DNA methylation at *cis*-acting elements known as "imprinting control regions" (ICRs), to establish and maintain mono-allelic expression of specific genes (Thorvaldsen and Bartolomei, 2007). Methylation at the ICRs is maintained despite active demethylation and dynamic reprogramming in the newly formed zygote, and is only altered during establishment of the methylation pattern in a sex-specific manner during primordial germ cell development (Bartolomei and Ferguson-Smith, 2011). Imprinted loci are generally found in large clusters, where both maternally and paternally expressed genes are interspersed. Frequently, the protein-coding genes are expressed from one parental allele, while non-coding genes are expressed from the other (Barlow, 2011). LncRNAs play an essential role in the regulation of mono-allelic expression, either by acting in *cis* as an antisense molecule to block the transcriptional machinery, or by directly recruiting repressive chromatin modifiers to silence reciprocally expressed genes (Lee and Bartolomei, 2013).

While imprinting is most extensively studied in the context of fetal development, tissue-specific regulation in adult tissues has also been observed (Barlow, 2011; Lee and Bartolomei, 2013). As a result, several imprinted genes are also implicated in human diseases that arise from somatic tissues. One such example is that of the maternally expressed adipose tissue transcription factor, *KLF14* (Parker-Katiraee et al., 2007), which is associated with risk for both T2D and high-density lipoprotein disorders (Teslovich et al., 2010; Voight et al., 2010; Small et al., 2011). Perhaps the functionally haploid nature of these loci results in their increased likelihood to be associated with susceptibility to disease, as mutations in these genes, when found on the maternal chromosome that is expressed, cannot be "covered" by the gene from the other, silenced paternal allele. This may be particularly true for metabolic disorders, as several imprinted genes encode dosage-sensitive proteins related to growth factors and energy metabolism. Interestingly, several risk variants for type 1 and type 2 diabetes identified through GWAS are located in imprinted loci including *KCNQ1*, *MEG3*, *PLAGL1*, and *GRB10*. A few of these are discussed below in the context of islet and β-cell function.

#### *DLK1–MEG3* **LOCUS**

Recently, we identified the maternally expressed non-coding RNAs of the imprinted *DLK1*–*MEG3* locus as down-regulated in human islets from T2D donors (Kameswaran et al., 2014). This gene cluster is located on human 14q32 (mouse chromosome 12) and contains three paternally expressed protein-coding genes, *DLK1*, *RTL1*, and *DIO3*. *DLK1* is a non-canonical Notch ligand that is expressed in many embryonic tissues (Falix et al., 2012) and is a well-established negative regulator of adipocyte differentiation (Smas and Sul, 1993; Mitterberger et al., 2012; Abdallah et al., 2013). *DLK1* is highly expressed in human and mouse β-cells (Tornehave et al., 1996; Dorrell et al., 2011; Appelbe et al., 2013). While *DLK1* was demonstrated to be stimulated by growth hormone and prolactin expression in rat islets, including during pregnancy, it is not directly responsible for the mitogenic effects of these hormones on islets (Carlsson et al., 1997; Friedrichsen et al., 2003). Additionally, loss of expression of *Dlk1* in unchallenged mouse β-cells does not cause any observable phenotype (Appelbe et al., 2013). *Rtl1* (*Retrotransposon-like 1*) is critical for normal placental development and its loss results in severe developmental defects and late-fetal lethality (Sekita et al., 2008).

The maternally expressed genes are all non-coding RNAs, consisting of the lncRNA, *Maternally Expressed Gene 3* (*MEG3*, known as *Gtl2* in mice), as well as a large cluster of microRNAs (miRNAs) and snoRNAs (Schmidt et al., 2000; Seitz et al., 2004; da Rocha et al., 2008). In several tissues, including human islets, the noncoding RNAs are all derived from a single transcript that initiates from the *MEG3* promoter (Tierling et al., 2006; da Rocha et al., 2008; Kameswaran et al., 2014).

Reciprocal imprinting is established by methylation of two differentially methylated regions (DMRs) on the paternal allele, one located ∼13 kb upstream of the *MEG3* transcription start site (IG-DMR), and the other overlapping with the promoter of the *MEG3* polycistronic transcript (*MEG3*-DMR; **Figure 1**). While the IG-DMR is the primary ICR for this imprinted cluster, the *MEG3*-DMR is also critical to regulating and maintaining imprinting at this region (Kagami et al., 2010). Failure to maintain imprinting at this locus can lead to either maternal or paternal uniparental disomy (UPD) of chromosome 14, which causes distinct and severe developmental disorders (Kagami et al., 2008).

Increased methylation of the *MEG3*-DMR and related loss of *MEG3* expression has been observed in several human cancers, such as pituitary and renal cell cancers and multiple myeloma (Zhao et al., 2005; Kawakami et al., 2006; Benetatos et al., 2008) to name a few (further reviewed by Benetatos et al., 2011). These studies, coupled with *in vitro* experiments, suggest that *MEG3* functions as a tumor suppressor by activating p53, in a manner dependent upon the secondary structure of the *MEG3* RNA (Zhou et al., 2007, 2012). Furthermore, decreased expression of *MEG3* and hypermethylation of the DMRs may single-handedly explain the subtle phenotypic differences between induced pluripotent stem cells (iPSCs) and ES cells, such as the decreased efficiency in generating chimeric mice from iPSCs (Stadtfeld et al., 2010).

Similar to the aforementioned examples, decreased expression of *MEG3* and the associated miRNAs in T2D islets strongly correlates with hypermethylation of the *MEG3*-DMR (Kameswaran et al., 2014). Additionally, a single nucleotide polymorphism (SNP) (rs941576) located in an intron of *MEG3* was found to be associated with T1D, with the risk allele being transmitted more frequently from the father than the mother of the affected offspring (Wallace et al., 2010). Overall, these examples provide compelling evidence for the importance of *MEG3* and the regulation of this imprinted region in several diseases. Despite the strong disease association of this lncRNA, and the fact that genes in this imprinted cluster are very highly expressed in human β-cells (Dorrell et al., 2011; Kameswaran et al., 2014), there are currently no postulated mechanisms for its potential role in β-cell function and diabetes pathogenesis.

Recent studies have suggested that, similar to other nuclear lncRNAs, *MEG3* also directly interacts with the PRC2 complex in ES cells to guide the repressive histone modification mark H3K27me3 to its target sites (Zhao et al., 2010; Kaneko et al., 2014). One study identified *Dlk1* as a direct target of the *Meg3*-PRC2 complex in mouse ES cells (**Figure 1**), although this finding could not be replicated in *MEG3*-expressing human iPSCs, where *MEG3* was found to function in *trans* (Zhao et al., 2010; Kaneko et al., 2014). A careful characterization of *MEG3*-PRC2 complex targets in adult pancreatic islets will provide better insights into the role of this lncRNA in β-cell function.

#### *KCNQ1* **LOCUS**

The *KCNQ1* gene, encoding a voltage-gated potassium channel, has been of great interest to the β-cell biology field due to its strong disease association. The gene is located in an imprinted locus on human 11p15.5, adjacent to another independently regulated imprinted locus, *H19–IGF2*. This region was implicated as a molecular candidate for Beckwith–Wiedemann syndrome (BWS), a disorder characterized by prenatal macrosomia, predisposition for tumor development and, frequently, hyperinsulinemic hypoglycemia (Lee et al., 1997, 1999; Hussain et al., 2005). This imprinted region consists of several conserved, maternally expressed protein-coding genes, such as the cell cycle inhibitor *CDKN1C*, and a paternally expressed antisense lncRNA, *KCNQ1* overlapping transcript 1 (*KCNQ1OT1*; Monk et al., 2006). Loss of imprinting in this locus can lead to the suppression of *CDKN1C*, which is sufficient to cause reentry of adult human β-cells into the cell cycle (Avrahami et al., 2014).

Imprinting of this region is maintained by a maternally methylated ICR, known as the KvDMR, which is also the promoter for *KCNQ1OT1* (**Figure 2**). To maintain appropriate monoallelic expression of imprinted genes in this locus, the KvDMR is hypomethylated on the paternal allele, leading to expression of the *KCNQ1OT1* lncRNA and subsequent repression, on that same paternal allele, of the protein-coding genes that are otherwise maternally expressed (Fitzpatrick et al., 2002; Ideraabdullah et al., 2008), possibly by facilitating intra-chromosomal looping to direct the repressive PRC2 complex to their promoters (**Figure 2**; Zhao et al., 2010; Zhang et al., 2014).
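The allele-specific logic described above can be summarized as a minimal boolean sketch. This is a deliberately simplified illustration of the proposed model, not a quantitative description: the function name `kcnq1_allele_state` is hypothetical, and *CDKN1C* stands in for the group of maternally expressed protein-coding genes.

```python
def kcnq1_allele_state(kvdmr_methylated: bool) -> dict:
    """Expression state of one parental allele under the proposed model.

    The KvDMR doubles as the KCNQ1OT1 promoter, so methylation silences
    the lncRNA; when the lncRNA is expressed, it represses the
    protein-coding genes in cis (represented here by CDKN1C).
    """
    kcnq1ot1_expressed = not kvdmr_methylated
    cdkn1c_expressed = not kcnq1ot1_expressed
    return {"KCNQ1OT1": kcnq1ot1_expressed, "CDKN1C": cdkn1c_expressed}


# Normal imprinting: the KvDMR is methylated on the maternal allele only.
maternal_allele = kcnq1_allele_state(kvdmr_methylated=True)
paternal_allele = kcnq1_allele_state(kvdmr_methylated=False)

# Loss of imprinting: the maternal KvDMR loses its methylation.
maternal_after_loss = kcnq1_allele_state(kvdmr_methylated=False)

print(maternal_allele, paternal_allele, maternal_after_loss)
```

Under these assumptions, loss of maternal KvDMR methylation switches on *KCNQ1OT1* and switches off *CDKN1C* on both alleles, consistent with the suppression of *CDKN1C* described for loss of imprinting at this locus.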

The *KCNQ1* locus harbors at least two independently identified and replicated GWAS signals at SNPs located in the intron of the *KCNQ1* gene (rs2237892), with one overlapping the *KCNQ1OT1* lncRNA (rs231362; Unoki et al., 2008; Yasuda et al., 2008; Kong et al., 2009; Voight et al., 2010). Additional SNPs in this gene, such as rs2237895, are also reported to be associated with T2D risk in specific ethnic populations (Unoki et al., 2008). While these SNPs are predicted to confer risk for diabetes only when maternally inherited (Kong et al., 2009), the risk alleles do not correlate with each other (Kong et al., 2009; Voight et al., 2010) and have opposing effects on docking of insulin granules (Rosengren et al., 2012).

#### **FIGURE 2 | Proposed model of imprinting at the** *KCNQ1* **locus:** the KCNQ1OT1 lncRNA is expressed from the paternally unmethylated KvDMR ICR, which is methylated on the maternal allele. Recent evidence suggests that KCNQ1OT1 can directly recruit the PRC2 complex and facilitate intra-chromosomal looping to the KCNQ1 promoter (Zhang et al., 2014).

To investigate how these T2D risk variants may affect allelic expression and imprinting of this region, Travers et al. (2013) correlated the risk SNP genotypes with DNA methylation and expression patterns of the imprinted genes in human fetal pancreas and adult islets. This study revealed that fetal samples homozygous for the rs2237895 risk allele had marginally increased methylation levels at the KvDMR region. As this was not observed in the adult, these results suggest that effects of the risk allele are likely to be established during early stages of islet development, as *KCNQ1* and *KCNQ1OT1* are only imprinted in fetal but not adult tissues (Monk et al., 2006; Travers et al., 2013). Overall, this study proposes a model whereby each risk allele for the rs2237895 SNP leads to increased methylation of the KvDMR and, consequently, decreased expression of *KCNQ1OT1*. However, there was no observable difference in *KCNQ1* or *KCNQ1OT1* expression in the samples used for this study. On the contrary, *KCNQ1OT1* transcript levels have been shown to be significantly elevated in T2D islets (where SNP genotype was not determined; Morán et al., 2012), which parallels an overall decrease in methylation at several tested CpGs near the *KCNQ1* gene (Dayeh et al., 2014). Thus, attempts to link variants at this region to disease pathology have produced contradictory results, and interpretation remains challenging. Nevertheless, the regulation of this locus and the lncRNA *KCNQ1OT1* remains relevant to β-cell biology and T2D pathogenesis.

#### *H19–IGF2* **LOCUS**

The *H19–IGF2* locus resides adjacent to the *KCNQ1* region on human 11p15.5. The region consists of the paternally expressed *insulin-like growth factor 2* (*IGF2*) gene and maternally expressed *H19* lncRNA (Brannan et al., 1990; DeChiara et al., 1990; Bartolomei et al., 1991). The *IGF2* protein functions as a growth factor essential for embryonic development (DeChiara et al., 1990), whereas *H19* may function as a tumor suppressor (Hao et al., 1993). Imprinting at this locus is maintained by an ICR, which is selectively methylated on the paternal allele. The insulator protein, CCCTC-binding factor (*CTCF*), binds to critical regulatory regions in the unmethylated ICR on the maternal allele, thus blocking access of downstream enhancers to the *IGF2* promoter (**Figure 3**; Stadnick et al., 1999; Bell and Felsenfeld, 2000; Engel et al., 2004).

Loss of methylation at the *H19/IGF2* ICR results in short body length and low birth weight, both in rodent models (DeChiara et al., 1990) and in humans, such as patients with Silver-Russell syndrome, a developmental disorder characterized by intrauterine and postnatal growth retardation (Gicquel et al., 2005). This has also been observed in humans who were periconceptually exposed to famine (Heijmans et al., 2008). There is growing evidence that intra-uterine exposure to malnutrition can predispose the offspring to metabolic complications including β-cell dysfunction and diabetes later in life (Ravelli et al., 1998; Roseboom et al., 2006). This theory is commonly referred to as the "thrifty phenotype hypothesis" (Hales, 2001) and is thought to be mediated primarily through environmentally induced epigenetic changes to key metabolic regulators (Park et al., 2008; Bramswig and Kaestner, 2012). However, first and second generation progeny of mice exposed to gestational diabetes were found to have impaired glucose tolerance with hypermethylation of the *H19* ICR in islets (Ding et al., 2012). These contradicting observations may be a result of different nutrient availability that the developing fetus was exposed to, as well as the varying lengths of exposure. The above studies suggest that the *H19–IGF2* locus is highly responsive to these changes in the intrauterine milieu and may represent a prognostic marker of metabolic complications later in life.

Hypermethylation of the *H19–IGF2* ICR has been observed in some cases of BWS (Ohlsson et al., 1993), as well as in focal congenital hyperinsulinism (FoCHI), a glucose metabolism disorder characterized by unbridled insulin secretion from hyperplastic islet cells and consequent hypoglycemia (de Lonlay et al., 1997). Increased methylation at this ICR would be predicted to result in decreased *H19* expression, loss of imprinting at this region and a concomitant increase in *IGF2* expression. Although overexpression of *IGF2* in mouse β-cells recapitulates the FoCHI phenotype (Devedjian et al., 2000), *IGF2* expression was variable in human FoCHI lesions (Fournet et al., 2001). On the contrary, *H19* transcript levels were consistently down-regulated in these cells, suggesting that *H19* may have an important regulatory role in restraining islet proliferation. This hyperproliferative phenotype, accompanied by suppression of *H19*, has also been reported in Wilms' tumor (Cui et al., 1997). Taken together, the *H19* lncRNA may function as a critical regulator of β-cell function and proliferation either on its own or indirectly through the regulation of *IGF2* levels.

#### *ZAC–HYMAI* **LOCUS**

Transient neonatal diabetes (TNDM) is a rare form of diabetes mellitus characterized by hyperglycemia and low insulin levels within the first year of birth (Temple et al., 2000). This form of diabetes is distinct from T1D as there is no evidence for autoimmunity (Abramowicz et al., 1994; Shield et al., 1997). Although it usually resolves by 2 years of age, children with TNDM are at a higher risk of developing T2D later in life (Temple et al., 2000). The molecular cause of this disease was identified to be abnormal imprinting of chromosome 6q24, which encompasses the cell cycle regulator, *ZAC/PLAGL1,* and the lncRNA, *HYMAI* (Abramowicz et al., 1994; Arima et al., 2000; Gardner et al., 2000; Kamiya et al., 2000; Mackay et al., 2002). Both *ZAC* and *HYMAI* share a common imprinted promoter (P1 in **Figure 4**), which also serves as the ICR, and are expressed from the paternal allele (Arima et al., 2000; Mackay et al., 2002). However, tissue-specific usage of an alternative promoter (P2 in **Figure 4**) that drives biallelic expression of *ZAC* has also been reported (Valleley et al., 2007).

*ZAC* encodes a zinc finger protein that regulates apoptosis and cell cycle arrest (Spengler et al., 1997). The protein is expressed at very high levels in insulin-producing cells in the human fetal pancreas, but not adult islets (Du et al., 2011). *ZAC* can also function as a transcriptional activator of *CDKN1C* and *KCNQ1OT1* (Arima et al., 2005). *ZAC* is believed to control the induction of the pituitary adenylate cyclase-activating polypeptide (*PACAP*), a strong activator of glucose-stimulated insulin secretion (Yada et al., 1994; Ciani et al., 1999). These features of the *ZAC* gene make this a strong candidate for the pathogenesis of TNDM. However, the mechanism of imprinting and the function of *HYMAI* in the context of TNDM have yet to be established.

ZAC/PLAGL1 and the lncRNA HYMAI are both paternally expressed from a common promoter that is also the ICR. However, in some tissues, ZAC is biallelically expressed from an upstream promoter.

# *MALAT1,* **AN ABUNDANT lncRNA**

The *metastasis-associated lung adenocarcinoma transcript 1 (MALAT1)* is a highly conserved lncRNA that is mis-regulated in several tumors (Ji et al., 2003; Gutschner et al., 2013). *MALAT1* is very abundantly expressed (higher than many housekeeping genes) in multiple cell types, including the pancreas (Ji et al., 2003) and in purified human α- and β-cells (Dorrell et al., 2011). Additionally, *MALAT1* is encoded within an active enhancer cluster with several binding sites for islet-transcription factors (Pasquali et al., 2014), making this an intriguing candidate for gene regulation in human islets.

*Metastasis-associated lung adenocarcinoma transcript 1* has several interacting partners through which it may mediate its function. One such interacting partner is DGCR8, a double-stranded RNA binding protein that together with Drosha mediates miRNA processing (Macias et al., 2012). *MALAT1* was found to be bound to Argonaute (Ago), the primary effector of miRNA function, in HeLa cells (Weinmann et al., 2009). *MALAT1* was also found to be associated with Ago in human islets, suggesting that this lncRNA may be regulated by miRNAs in human cells (Kameswaran et al., 2014). In fact, we discovered several sequences that consisted of miRNAs fused to *MALAT1* while assaying miRNAs and their targets that were bound to Ago in human islets. These chimeric reads were the result of ligation of two adjacent RNA species present in the RISC complex with Ago (Helwak et al., 2013), and indicate that *MALAT1* is regulated by several miRNAs in human islets (Kameswaran et al., 2014).

*Metastasis-associated lung adenocarcinoma transcript 1* can also regulate gene expression through its association with different nuclear sub-compartments (Hutchinson et al., 2007; Yang et al., 2011; Gutschner et al., 2013). One example of this is *MALAT1* localization in nuclear speckles, which are nuclear domains where splicing factors are stored and post-transcriptionally modified (Hutchinson et al., 2007; Mao et al., 2011). Through the modification of critical splicing factors, *MALAT1* has been shown to contribute to alternative splicing (Tripathi et al., 2010). However, despite the abundance of this lncRNA and the early suggestions of its function from *in vitro* studies, mice lacking *MALAT1* displayed no obvious phenotype in the absence of additional pathological stressors and exhibited largely normal nuclear speckle formation and alternative splicing patterns (Eißmann et al., 2012; Nakagawa et al., 2012; Zhang et al., 2012). Thus, the role of this lncRNA remains to be determined.

# **PERSPECTIVE**

The exciting discovery of lncRNAs and the growing recognition of their involvement in human pathogenesis have added a new level of complexity to our understanding of gene regulation. However, due to the range of sequencing and bioinformatic tools currently available, the rate of discovery of new lncRNAs has surpassed our ability to examine their function. This gap between lncRNA gene discovery and function currently holds true in the field of β-cell biology as well, necessitating the systematic analysis of mouse and human islet lncRNAs identified to date (Ku et al., 2012; Morán et al., 2012).

Factors such as overlap between the human and mouse α- and β-cell lncRNA complements (Ku et al., 2012; Morán et al., 2012; Bramswig et al., 2013), degree of conservation, expression, associated protein-coding genes, and relative distance from GWAS SNP variants may be good early predictors of important lncRNAs. However, these parameters alone may underestimate other essential candidates, as some lncRNAs exhibit low primary sequence conservation despite crucial function (Nesterova et al., 2001), or, conversely, a dispensable function despite high sequence conservation and expression (Zhang et al., 2012). These observations emphasize the need for careful loss-of-function experiments in appropriate model systems induced by metabolic and/or inflammatory challenges to clearly understand the function of these lncRNAs. Although many of the human β-cell lncRNAs are expressed in the EndoC-βH1 cell line that somewhat resembles human β-cells *in vitro* (Ravassard et al., 2011), targeted deletion or inhibition in mouse and human islets may be necessary in some cases to reveal their function, as seen in the example of *HI-LNC25* discussed above (Morán et al., 2012).

While the loss-of-function of even abundant lncRNAs such as *MALAT1* may sometimes result in a lack of phenotype (Eißmann et al., 2012; Nakagawa et al., 2012; Zhang et al., 2012), lessons from the miRNA field suggest that additional physiological and environmental stressors may be necessary to truly elucidate the function of these non-coding RNAs (Mendell and Olson, 2012). Additionally, in order to study the role of lncRNAs in the context of loss-of-function, a careful analysis of the genomic location of the lncRNAs may be required to evaluate the best method of gene silencing, as targeted recombination may result in disruption of overlapping protein-coding transcripts or their regulatory domains, further confounding data interpretation.

Given the broad range of human diseases that lncRNAs are now associated with, it is perhaps not surprising that there is growing evidence for their role in β-cell function and diabetes pathogenesis. Revealing their function will undoubtedly lead to a new wave of exciting targets to explore for therapeutic development.

#### **ACKNOWLEDGMENTS**

We apologize to colleagues whose work we could not cite due to the limited scope of this review. We thank Dr. John Le Lay and other members of the Kaestner lab for useful discussions and suggestions on this manuscript. Related work in the Kaestner lab was supported through NIDDK grant R01-DK088383.

#### **REFERENCES**


gene network that may play a role in Beckwith-Wiedemann syndrome. *Nucleic Acids Res.* 33, 2650–2660. doi: 10.1093/nar/gki555


variants linked to type 2 diabetes. *Diabetes* 61, 1726–1733. doi: 10.2337/db11-1516


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 May 2014; accepted: 15 June 2014; published online: 01 July 2014. Citation: Kameswaran V and Kaestner KH (2014) The Missing lnc(RNA) between the pancreatic* β*-cell and diabetes. Front. Genet. 5:200. doi: 10.3389/fgene.2014.00200 This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Kameswaran and Kaestner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Roles of lncRNAs in pancreatic beta cell identity and diabetes susceptibility

# *Timothy J. Pullen\* and Guy A. Rutter\**

Section of Cell Biology, Department of Medicine, Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK

#### *Edited by:*

Romano Regazzi, University of Lausanne, Switzerland

#### *Reviewed by:*

Scot J. Matkovich, Washington University School of Medicine, USA Lorenzo Pasquali, Institut d'Investigacions Biomèdiques August Pi i Sunyer, Spain

#### *\*Correspondence:*

Timothy J. Pullen and Guy A. Rutter, Section of Cell Biology, Department of Medicine, Imperial Centre for Translational and Experimental Medicine, Imperial College London, Du Cane Road, London W12 0NN, UK e-mail: t.pullen@imperial.ac.uk; g.rutter@imperial.ac.uk

Type 2 diabetes usually ensues from the inability of pancreatic beta cells to compensate for incipient insulin resistance. The loss of beta cell mass, function, and potentially beta cell identity contributes to this dysfunction to extents which are debated. In recent years, long non-coding RNAs (lncRNAs) have emerged as potentially providing a novel level of gene regulation implicating critical cellular processes such as pluripotency and differentiation. With over 1000 lncRNAs now identified in beta cells, there is growing evidence for their involvement in the above processes in these cells. While functional evidence on individual islet lncRNAs is still scarce, we discuss how lncRNAs could contribute to type 2 diabetes susceptibility, particularly at loci identified through genome-wide association studies as affecting disease risk.

**Keywords: type 2 diabetes, genome-wide association studies, lncRNAs, beta cell, islets of Langerhans, cell identity**

#### **INTRODUCTION**

Diabetes is a major and growing health problem affecting 347 million people worldwide (Danaei et al., 2011). Type 2 diabetes (T2D) accounts for 90% of affected individuals, and is a complex, progressive disease affected by a range of genetic and environmental risk factors. While insulin resistance plays a part in disease progression, pancreatic beta cell failure lies at the heart of T2D (Kahn, 2003).

Beta cells are the body's sole source of circulating insulin, and as such are critical for maintaining blood glucose within healthy limits. The beta cell's tightly regulated secretion of insulin in response to glucose is dependent on a highly specialized metabolic sensing system, underpinned by a specific pattern of gene expression. This includes specific genes required for glucose-sensing (e.g., *Gck* and *Glut2*), insulin production and processing (e.g., insulin, prohormone convertase 1/3 [PC1/3]) and regulated secretion. Equally important is the specific repression of genes which interfere with proper regulation of insulin secretion (e.g., *Slc16a1* and *Ldha*) (Zhao et al., 2001; Pullen et al., 2012). Indeed, we (Pullen et al., 2010) and others (Thorrez et al., 2011) have recently identified ∼40 "forbidden" or "disallowed" genes specifically repressed in islets despite widespread expression across other tissues. While a unique pattern of transcription factors (e.g., Pdx1, Pax6, MafA, Nkx2.2) likely underlies the sustained expression of beta cell selective genes, epigenetic mechanisms are vital for the mitotically stable maintenance of cellular identity and probably the suppression of forbidden genes.

The tipping point from pre-diabetes to overt T2D is usually reached when beta cells can no longer compensate sufficiently for incipient insulin resistance. The nature of the progressive loss of functional beta cell mass has been the source of some debate (Weir and Bonner-Weir, 2013), with both decreases in overall beta cell mass (Butler et al., 2003) and impaired glucose-stimulated insulin secretion (GSIS) from the remaining islets being reported in T2D subjects (Del Guerra et al., 2005). Indeed, recent work has suggested that decreases in beta cell mass are modest (Rahier et al., 2008) and may even have been over-estimated (Marselli et al., 2014).

There has also been growing interest in a third option: that a progressive loss of beta cell identity underlies the development of T2D. Two rat models (Tokuyama et al., 1995; Jonas, 1999) and the *db/db* mouse model (Kjorholt et al., 2005) of T2D produce similar results, showing a loss of differentiated beta cells accompanied by similar changes in gene expression: decreased expression of GSIS genes (e.g., *Gck*); increased expression of normally repressed genes (e.g., *Ldha* and *Hk1*); and decreases in islet transcription factors (e.g., *Isl1*, *Neurod1*, *Pdx1*, and *Pax6*) (Tokuyama et al., 1995; Jonas, 1999). The importance of these transcription factors in maintaining beta cell identity is underscored by the transformation of beta to alpha-like cells following *Pdx1* deletion (Gao et al., 2014). In *FoxO1* knockout mouse islets, beta cells dedifferentiate a step further, acquiring expression of immature endocrine cell markers (*Ngn3*, *Oct4*, *Nanog*, and *Mycl*) (Talchai et al., 2012). However, the relevance of this strain to human diabetic islets is as yet unestablished.

While the relative importance of these three factors (impaired beta cell mass, function and identity) in T2D is not fully understood, mechanisms which increase beta cell function, proliferation and identity are all sought to tackle the disease. Furthermore, increased understanding of the molecular defects underlying T2D will enable novel therapies to address the underlying causes of the disease, rather than just treat the symptoms.

A novel layer of gene regulation, acting partly through epigenetic mechanisms, was uncovered with the discovery that thousands of long non-coding RNAs (lncRNAs) are expressed from the genome (Guttman et al., 2009). lncRNAs are a diverse group of transcripts defined by a negative property: the lack of protein-coding potential. They are differentiated from short ncRNAs such as miRNAs (which also play an important role: see Poy et al., 2004; Guay et al., 2011; Pullen et al., 2011) by a minimum length threshold of 200 nt. lncRNAs are often capped, spliced and polyadenylated like mRNAs, although both unspliced and non-polyadenylated variants are also common. As a class, lncRNAs are enriched in the nucleus relative to protein-coding transcripts, although there remains a population of cytoplasmic-enriched lncRNAs, and non-nuclear functions have been demonstrated for a number of lncRNAs (Derrien et al., 2012). Indeed, lncRNAs have been demonstrated to regulate gene expression through a number of mechanisms at the transcriptional, post-transcriptional and translational level.
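The operational definition given above (no protein-coding potential, and at least 200 nt long) can be written as a two-step decision. The following is a minimal sketch: the function and constant names are hypothetical, and coding potential is taken as a pre-computed yes/no input because the text does not specify how it is assessed.

```python
LNCRNA_MIN_LENGTH_NT = 200  # minimum length threshold stated in the text


def classify_transcript(length_nt: int, has_coding_potential: bool) -> str:
    """Assign a transcript to a broad class using the definition above.

    has_coding_potential is assumed to come from an upstream analysis;
    how it is determined is outside the scope of this sketch.
    """
    if length_nt <= 0:
        raise ValueError("transcript length must be positive")
    if has_coding_potential:
        return "protein-coding"
    if length_nt >= LNCRNA_MIN_LENGTH_NT:
        return "lncRNA"
    return "short ncRNA"


# A mature miRNA is roughly 22 nt long and non-coding.
mirna_class = classify_transcript(22, has_coding_potential=False)

# A hypothetical 2300-nt spliced transcript without coding potential.
long_class = classify_transcript(2300, has_coding_potential=False)

print(mirna_class, long_class)
```

The sketch makes explicit that the class is defined by exclusion: any sufficiently long transcript that is not protein-coding is an lncRNA, which is one reason the group is so heterogeneous.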

The expression patterns of lncRNAs are significantly more cell-type specific than protein-coding genes (Guttman et al., 2009; Djebali et al., 2012), making them well placed to play highly cell-type specific roles. However, the fact that lncRNAs are less evolutionarily conserved at the primary sequence level than protein-coding genes and the fact that numerous lncRNAs appear to be species-specific raise questions over their functions. lncRNAs may depend more on secondary structure for their function, and thus be more tolerant of mutations than protein-coding genes. Indeed, lncRNA exons show evidence of low but clear sequence conservation compared to other intergenic regions (Guttman et al., 2009). The advent of next generation sequencing technologies has led to an explosion of novel lncRNA discovery in diverse tissues, including pancreatic islets and beta cells. The discovery of lncRNAs with the potential to regulate gene expression and cellular identity in T2D-relevant tissues offers the opportunity for greater understanding of disease etiology and novel targets for treatment.

#### **MECHANISMS OF lncRNA ACTION**

miRNAs are a major class of short non-coding RNA which primarily act via a single, well-characterized mechanism to downregulate gene expression through mRNA degradation and translational inhibition. In contrast, lncRNAs have been shown to regulate gene expression through a bewildering array of mechanisms (reviewed extensively in Wang and Chang, 2011; Kung et al., 2013). On average, lncRNAs are less abundant than protein-coding transcripts, although some (e.g., *H19*) are expressed at comparable levels (Djebali et al., 2012). lncRNAs are a heterogeneous group of transcripts, and greater functional characterization will hopefully allow clearer classification into distinct groups acting through distinct mechanisms.

One of the first identified modes of action of lncRNAs was via alterations in chromatin modifications through recruiting histone modifying complexes to particular genomic loci (Rinn et al., 2007). This plays a major part in epigenetic silencing during X-inactivation (through the lncRNA *Xist* (Zhao et al., 2008; Pontier and Gribnau, 2011)) and at imprinted loci (e.g., *Air* at the *Slc22a3* locus (Nagano et al., 2008)). Indeed, this appears to be a widespread function of lncRNAs, with around 20% of intergenic lncRNAs identified in one study interacting with the chromatin-modifying complex Polycomb Repressive Complex 2 (PRC2) (Khalil et al., 2009). In addition to affecting histone modifications, lncRNAs can also act on the other major branch of epigenetic regulation, DNA methylation, through interaction with DNA methylation machinery (Mohammad et al., 2010; Di Ruscio et al., 2013). lncRNAs can further regulate gene expression by mediating DNA looping (Lai et al., 2013). This affects the three-dimensional conformation of chromosomes, which regulates gene expression by bringing distal enhancers physically adjacent to gene promoters.

lncRNAs also act through a number of mechanisms to regulate gene expression at the post-transcriptional level. Natural antisense transcripts (NATs) are lncRNAs transcribed in the opposite orientation to mRNAs with overlapping exons. They have the potential for direct interaction with mRNAs through complementary base-pairing, which can increase mRNA stability (e.g., *BACE1-AS* and *BACE1*) (Faghihi et al., 2008). In contrast, lncRNAs can also lead to mRNA degradation through the recruitment of Staufen1 (STAU1) (Gong and Maquat, 2011). lncRNAs can also influence mRNAs indirectly by acting as "sponges" for miRNAs, preventing them from downregulating gene expression (Kallen et al., 2013). It is interesting to note that this last example refers to the highly expressed *H19* lncRNA, and these post-transcriptional mechanisms presumably require lncRNAs in comparable abundance to their mRNA or miRNA targets. Given the lower expression of most lncRNAs relative to mRNAs, these mechanisms requiring abundant lncRNAs are perhaps the exception rather than the norm.

## **lncRNAs AND BETA CELL IDENTITY**

As the expression of many lncRNAs is highly cell-type specific, early catalogs of lncRNAs, mainly from other cell lines, provided little information relevant to beta cells. However, a number of studies have since identified lncRNAs genome-wide in human and mouse islets and beta cells.

Morán et al. (2012) used a combination of ChIP-Seq (to identify sites of active transcription) and RNA-Seq (to directly detect transcripts) to produce a comprehensive catalog of >1100 lncRNAs expressed in human islets. In agreement with studies in other tissues, these lncRNAs proved to be significantly more cell-type specific than protein-coding genes. Indeed, 55% of intergenic lncRNAs and 40% of antisense lncRNAs identified in this study were specific to islets. Importantly, Morán et al. (2012) identified several lncRNAs that were dysregulated in islets from T2D subjects.

One challenge to the theory of widespread functions for lncRNAs is the low level of sequence conservation between species. To address this, orthologous mouse genomic regions were detected for 70% of human lncRNA loci, and RNA-Seq revealed that 47% of these were actively transcribed in mouse islets (Morán et al., 2012). Indeed, this may be an underestimate, as several lncRNAs were detected by qPCR despite falling below the threshold for RNA-Seq detection. An independent transcriptomic study of mouse islets identified a similar number (>1000) of intergenic lncRNAs (Ku et al., 2012). Further dissection of islet cell-types was performed by Bramswig et al. (2013), with the identification of five alpha cell-specific and twelve beta cell-specific lncRNAs. The scarcity of lncRNAs identified in this study may be due to highly stringent removal of any transcripts overlapping repeat regions.

Previous studies have identified both lncRNAs that regulate the maintenance of pluripotency, and lncRNAs required for neuronal differentiation (Ng et al., 2012). Indeed, the regulation of differentiation and cell identity appears to be a major function of lncRNAs. Supporting this role for islet lncRNAs, Morán et al. (2012) identified dynamic regulation of lncRNAs during *in vitro* differentiation of human embryonic stem cells towards pancreatic endocrine cells. As such protocols have thus far failed to produce fully functional beta cells without an *in vivo* maturation stage in mice, perhaps the most tantalizing discovery was six lncRNAs whose expression was only activated during this last stage. Understanding regulators of this critical maturation stage, and the ability to recapitulate it without passage through animals would be essential for any therapeutic generation of beta cells from human stem cells. However, islet lncRNAs are not restricted to developmental roles, as depletion of a beta cell-specific lncRNA was also shown to influence gene expression in mature beta cells. Depletion of the lncRNA *HI-LNC25* decreased expression of the transcription factor and monogenic diabetes gene, *GLIS3* (**Table 1**) (Morán et al., 2012). The location of *HI-LNC25* and *GLIS3* on separate chromosomes (20 and 9, respectively) indicates that the lncRNA regulates *GLIS3* in *trans*, although whether this is through a direct interaction with this locus has not been explored.

### **lncRNAs IMPLICATED IN T2D SUSCEPTIBILITY**

Within the last decade, genome-wide association studies (GWAS) have been a major focus of work to identify the genetic variants underlying susceptibility to T2D and related metabolic traits. Through identifying single nucleotide polymorphisms (SNPs) which correlate with diabetic phenotypes, investigators aim to single out genes which influence disease susceptibility. While much of the interpretation of GWAS hits has focussed on protein-coding genes, there are good reasons to think that lncRNAs are responsible for the effects of some of these SNPs.

It is striking how few SNPs identified through GWAS for T2D, indeed for most diseases, result in changes to protein sequences. *SLC30A8* is one of the few examples from early studies where a SNP (rs13266634) resulted in an amino acid substitution (R325W) affecting the zinc transporter located on insulin granules (Sladek et al., 2007). Interestingly, expression of *SLC30A8* is largely restricted to pancreatic islets, so the effects of any mutations are expected to be limited to the endocrine pancreas. In contrast, genes at most other GWAS loci are more widely expressed. Missense mutations at these widely expressed loci are far more likely to cause defects in multiple tissues and possibly embryonic lethality, which could account for absence from GWAS specific for T2D and related traits. In a more recent large-scale meta-analysis which aimed to finely map the causal SNPs, only two of the 65 T2D susceptibility loci examined had a lead SNP resulting in a missense mutation (*PPARG* [rs1801282] and *KCNJ11* [rs5215]) (Morris et al., 2012). In both cases, rare severe mutations have previously been identified which cause monogenic forms of diabetes (Barroso et al., 1999; Gloyn et al., 2004). Most SNPs instead map to intronic or intergenic regions and thus likely act through altering gene expression and/or splicing.

While the effects of changes to protein sequences are relatively straightforward to investigate, it is often unclear which genes are affected by SNPs in non-coding regions. Whereas proximal promoters are directly adjacent to the genes they regulate, enhancer elements can influence the expression of genes hundreds of kilobases away, and can be interspersed between and even within other genes (Ilnytska et al., 2009). Interestingly, SNPs associated with T2D and fasting glycaemia are enriched in these distal regulatory regions (Pasquali et al., 2014). The gene affected by SNPs falling outside proximal promoters may not be clear, although the closest gene is often used as a starting point for further investigation. With the identification and annotation of increasing numbers of lncRNAs it has become apparent that a number of the GWAS hits fall close to, or within lncRNAs.

Mapping SNPs at these 65 loci to current Ensembl annotations shows five lead SNPs are within the exons or introns of lncRNAs (**Table 2**). Two of these loci (*KCNQ1* and *CDKN2A/CDKN2B*) contain well characterized lncRNAs for which studies have started to reveal the mechanisms through which they function. At the *PROX1* and *PSMD6* loci, in both cases the lead SNP (rs2075423 and rs12497268, respectively) falls within the exon of an annotated antisense-oriented lncRNA. At the *CCND2* locus, the lead SNP (rs11063069) is approximately 9 kb upstream of the protein-coding gene, yet within the intron of two antisense lncRNAs (*CCND2-AS1* and *CCND2-AS2*). Finally, as further islet-expressed lncRNAs are identified, more of these SNPs are found to fall closer to lncRNAs than to protein-coding genes (e.g., *WFS1* locus (Morán et al., 2012)).
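The mapping step described above (asking whether a SNP position falls inside an annotated lncRNA) can be illustrated with a minimal sketch. This is not the procedure used to build Table 2. The feature names, rsIDs, and coordinates below are hypothetical placeholders chosen only for illustration, and a real analysis would read coordinates from an annotation file for a specific genome assembly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Interval:
    """An annotated feature (e.g., a lncRNA gene span), 1-based inclusive."""
    name: str
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int


def overlapping_features(snp: Snp, features: List[Interval]) -> List[str]:
    """Return names of all features whose span contains the SNP position."""
    return [
        f.name
        for f in features
        if f.chrom == snp.chrom and f.start <= snp.pos <= f.end
    ]


# Hypothetical placeholder annotations and SNPs (NOT real coordinates).
lncrnas = [
    Interval("lncRNA_A", "chr1", 1000, 5000),
    Interval("lncRNA_B", "chr1", 4000, 9000),
    Interval("lncRNA_C", "chr2", 100, 200),
]
snps = [
    Snp("rs_demo1", "chr1", 4500),  # inside both lncRNA_A and lncRNA_B
    Snp("rs_demo2", "chr1", 9500),  # outside every annotated lncRNA
    Snp("rs_demo3", "chr2", 150),   # inside lncRNA_C
]

hits: Dict[str, List[str]] = {s.rsid: overlapping_features(s, lncrnas) for s in snps}
```

A linear scan is adequate for a few dozen lead SNPs; genome-wide SNP sets would normally call for an interval tree or a dedicated genomics toolkit.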

#### *KCNQ1*

The *KCNQ1* locus contains a number of genes with the potential to affect beta cell function and proliferation. Within this locus *KCNQ1*, *KCNQ1OT1*, *CDKN1C*, *PHLDA2*, *SLC22A18*, and *SLC22A18AS* are all expressed in both fetal pancreas and adult islets (Travers et al., 2013). *KCNQ1* encodes a voltage-gated potassium channel subunit which is expressed in pancreatic beta cells, although its effect on the regulation of insulin secretion is somewhat unclear. siRNA knockdown of *KCNQ1* in human islets enhanced depolarization-induced exocytosis (Rosengren et al., 2012), whereas pharmacological inhibition in INS-1 cells did not affect basal insulin secretion, tolbutamide-stimulated secretion or GSIS (Ullrich et al., 2005). *CDKN1C* encodes the p57<sup>KIP2</sup> cyclin-dependent kinase inhibitor, which is a negative regulator of cell proliferation. It is expressed in islets, and loss of p57<sup>KIP2</sup> expression is associated with increased beta cell proliferation in focal congenital hyperinsulinism (Kassem et al., 2001; Henquin et al., 2011). *PHLDA2* also exerts a negative effect on cell proliferation, with increased expression in the placenta being associated with intrauterine growth retardation through uteroplacental insufficiency (McMinn et al., 2006). In contrast, decreased expression of *PHLDA2* was detected in neuroendocrine tumors relative to normal islet controls, as a downstream effect of losing the tumor suppressor gene *MEN1* resulting in Multiple Endocrine Neoplasia type 1 (Dilley et al., 2005).

**Table 1 | The main lncRNAs discussed in this paper along with their tissue and species specificity.**

The lead SNP at this locus is within the intron of the *KCNQ1* protein-coding gene, yet the lead SNP for a putative secondary association signal is in the exon of an antisense lncRNA (*KCNQ1OT1*) (Morris et al., 2012). *KCNQ1OT1* is a 91 kb long transcript transcribed by RNA polymerase II and localized exclusively in the nucleus (Pandey et al., 2008). In early development, *KCNQ1OT1* is expressed in a monoallelic fashion from the paternal allele and has been linked to silencing of nearby genes on the same chromosome, resulting in these genes being expressed exclusively from the maternal allele. This is mediated through the interaction of *KCNQ1OT1* with PRC2 components and G9a histone methyltransferase, resulting in enrichment of repressive histone modifications H3K27me3 and H3K9me3 at this locus (Pandey et al., 2008). However, this pattern is complicated by the lineage-specific loss of imprinting at some genes. Specifically, *KCNQ1* and *KCNQ1OT1* showed developmental loss of imprinting with biallelic expression in adult islets (Travers et al., 2013). Interestingly, in the developing mouse heart, *Kcnq1ot1* is required for paternal silencing of *Cdkn1c* and *Slc22a18*, but not *Kcnq1*, indicating that *Kcnq1ot1* may be more directly involved in the regulation of these two genes (Korostowski et al., 2012).

One proposed model to explain the effect of SNPs at this locus is that risk alleles reduce *KCNQ1OT1* expression, thereby decreasing repressive histone modifications, increasing *CDKN1C* expression and thereby impairing islet proliferation or development (Travers et al., 2013). In contrast to this model, *KCNQ1OT1* was reported to be upregulated in T2D islets (Morán et al., 2012). However, this apparent contradiction may be explained by risk SNPs having an effect on imprinting early in development, whereas the *KCNQ1OT1* upregulation in T2D adult islets may be part of the compensatory proliferative response of islets to hyperglycaemia.

#### *ANRIL*

*CDKN2A*/*CDKN2B* is another T2D GWAS locus containing a well-characterized lncRNA. This locus encodes three tumor suppressors: p16<sup>INK4A</sup>, p14<sup>ARF</sup>, and p15<sup>INK4B</sup>. The expression of these genes is regulated by polycomb-mediated silencing through H3K27me3. Of particular relevance to diabetes, p16<sup>INK4A</sup> upregulation has been shown to be responsible for the age-dependent decline in beta cell proliferative capacity (Krishnamurthy et al., 2006). The histone methyltransferase and PRC2 component, Ezh2, represses p16<sup>INK4A</sup> expression in young beta cells, permitting proliferation. However, declining Ezh2 levels in aging beta cells lead to derepression of p16<sup>INK4A</sup> expression, and consequent inhibition of proliferation (Chen et al., 2009).

The lead SNP (rs10811661) at this locus is downstream of, and closest to the lncRNA *ANRIL* (*CDKN2B-AS1*). The lead SNP for a putative secondary association signal (rs944801) is located within an intron of *ANRIL* (Morris et al., 2012). This region (chromosome 9p21) has been strongly associated with susceptibility to a number of diseases. SNPs associated with coronary disease, stroke, melanoma and glioma are all located close to *ANRIL* and correlated with *ANRIL* expression (Cunnington et al., 2010; Pasmant et al., 2011). Although no direct mouse ortholog of *ANRIL* has been reported, a lncRNA of unknown function has been detected at the orthologous region of the mouse genome. Deletion of part of the "9p21" orthologous region in mice (including a section of this lncRNA) increased mortality and affected expression of neighboring genes (Visel et al., 2010), indicating that gene regulatory sequences and possibly this lncRNA perform roles conserved between human and mouse.

*ANRIL* interacts with both PRC1 (CBX7) and PRC2 (SUZ12) components to affect the epigenetic regulation of gene expression at this locus (Yap et al., 2010; Kotake et al., 2011). Competitive inhibition of *ANRIL* was reported to increase *p16<sup>INK4A</sup>* expression, resulting in decreased proliferation in fibroblasts (Yap et al., 2010). A separate study depleting *ANRIL* in lung fibroblasts also reported decreased proliferation, although in this case primarily through upregulation of *p15<sup>INK4B</sup>*. The direction of this effect fits with reports of T2D SNPs being associated with decreased *ANRIL* expression, although not in all populations (Cunnington et al., 2010). It therefore appears that *ANRIL* recruits both PRC1 and PRC2 to this locus in a coordinated manner to repress these proliferation inhibitors. Disruption of *ANRIL* expression or function by SNPs could impair silencing at this locus, decreasing the proliferative capacity of beta cells.

SNPs reported in a meta-analysis of GWA studies (Morris et al., 2012) overlapping lncRNA annotations in Ensembl (assembly GRCh38). \*SNP also falls within an intron of the protein-coding gene.

One should caution that *ANRIL* may not be limited to *cis*-acting effects, as overexpression of *ANRIL* in HeLa cells has also been reported to produce genome-wide effects on gene expression (Sato et al., 2010). An interesting direction for future research may thus be to determine how regulation of this locus by *ANRIL* integrates with the PDGF pathway responsible for age-dependent decline in beta cell proliferative capacity (Chen et al., 2011). Whereas ectopic expression of *Ezh2* was sufficient to increase beta cell proliferation in young mice, it was unable to repress *p16<sup>INK4A</sup>* in older mice without combined inhibition of trithorax group proteins (Zhou et al., 2013). It would be of particular interest to discover whether manipulation of *ANRIL* in adult beta cells can reverse this block on proliferation, or whether its effects are primarily in developing beta cells.

It is interesting to note that *ANRIL* and *KCNQ1OT1* are both expressed in numerous tissues, yet certain SNPs within them are associated specifically with T2D susceptibility. It may be that beta cell replication during compensation for insulin resistance is particularly sensitive to *ANRIL* function, meaning that these cells are most severely affected by *ANRIL* variants. Alternatively, *ANRIL* may be involved in beta cell-specific interactions which are uniquely affected by T2D SNPs. While the former proposition appears more likely, functional studies from one cell-type may not be directly applicable to others. The lack of a clear mouse ortholog of *ANRIL* has also prevented the use of mouse transgenics for investigating its function. Further investigation into the relationship between lncRNAs expressed in human and mouse at this locus would provide valuable insight into whether mouse experiments could be used to study *ANRIL* function.

# **lncRNAs AS THERAPEUTIC TARGETS**

The discovery of lncRNAs implicated in susceptibility to, and the progression of, T2D raises the question of whether lncRNAs can be therapeutically manipulated to ameliorate this disease. lncRNAs offer a number of advantages as therapeutic targets. Firstly, the highly cell-type specific expression pattern of many lncRNAs is likely because they are involved in the cell-type specific regulation of genes which themselves are more widely expressed. Manipulating lncRNAs could therefore allow the effects to be specifically targeted to a single cell-type, with few side effects.

Furthermore, lncRNAs are involved in the epigenetic regulation of fundamental cellular processes such as pluripotency and cell identity. Much of the protein machinery regulating the epigenetic landscape is widely expressed, making targeting any intervention difficult. By acting as scaffolds to bring together particular combinations of chromatin-modifying complexes, transcription factors, etc. and target them to specific genomic loci, lncRNAs can provide specificity to this system. As such, lncRNAs may also provide a specific target to influence the epigenetic landscape underlying beta cell identity, and the potential to reinforce beta cell identity when this is challenged by T2D.

In contrast to lncRNAs, the more clearly defined mechanisms of miRNA processing and function have allowed greater progress to be made towards therapeutic targeting of these short non-coding RNAs. The most successful route has been to inhibit miRNAs *in vivo* by administering modified oligonucleotides systemically (Montgomery and van Rooij, 2011). Although the short length of miRNAs makes them easy to target using short complementary oligonucleotides, a similar approach termed "antagoNATs" uses short antisense oligonucleotides to target lncRNAs. Brain-derived neurotrophic factor antisense (*BDNF-AS*) is a brain-expressed NAT which inhibits the *BDNF* gene. Continuous *in vivo* delivery of an antagoNAT against *BDNF-AS* to mouse brain via an osmotic minipump stably increased BDNF expression in a highly locus-specific effect, with a corresponding increase in neurite outgrowth and differentiation (Modarresi et al., 2012). While this form of delivery may be of limited use in clinics, advances in delivery of other modified oligonucleotides, such as siRNAs, will likely increase the feasibility of using this approach more widely (Gavrilov and Saltzman, 2012). Whether similar approaches could be used to target diabetic beta cells relies on the identification and characterization of islet-specific lncRNAs with negative impacts on beta cell identity, function or proliferation.

Therapeutic increases in miRNA function have been possible using chemically synthesized miRNA mimics. While the mimics themselves are highly effective, targeting their delivery to a particular cell type is not. The use of adeno-associated viral (AAV) vectors offers the possibility to target miRNA expression to particular tissues through both viral tropism and the use of tissue-specific promoters (Montgomery and van Rooij, 2011). Synthetic lncRNA mimics would not be practical due to their greater length, but viral expression approaches used for miRNAs could well be adapted for lncRNA delivery. However, as some models propose that lncRNAs function through recruiting proteins to the nascent transcript, the site of expression may be critical for their functions. Ectopic expression may not be able to recapitulate the full effects of endogenous lncRNAs. Another concern is the stage of development at which lncRNAs act. T2D susceptibility at the *KCNQ1OT1* locus involves imprinting, which is lost in adult beta cells, suggesting that this lncRNA plays a role in beta cell development. It is therefore possible that manipulation of *KCNQ1OT1* in adult beta cells would be unable to ameliorate any developmental defects. To counter this view, there are a large number of lncRNAs expressed in adult beta cells, including some only found in mature beta cells. Furthermore, the regulation of *GLIS3* by one islet lncRNA demonstrates that significant diabetes genes are regulated in adult beta cells by lncRNAs (Morán et al., 2012).

# **CONCLUSION**

The large number of highly cell-type specific lncRNAs identified in recent years provides the potential for significant regulation of gene expression and cell phenotype in pancreatic beta cells. However, the functional characterization of these lncRNAs has inevitably lagged behind their discovery. So far there is evidence for the epigenetic regulation of two highly significant T2D loci by lncRNAs. In addition, there is more circumstantial evidence that a number of further lncRNAs are associated with beta cell development and function. Whether lncRNA expression changes during beta cell compensation for insulin resistance, potentially playing a role in beta cell expansion under these conditions, is also an important area for investigation, and may reveal the potential for therapeutic targeting. Further detailed investigation of the specific lncRNAs involved in this process is required to reveal the true extent to which lncRNAs regulate beta cell identity, proliferation and function.

### **ACKNOWLEDGMENTS**

Supported by a Diabetes Research and Wellness Fund Non-clinical Fellowship to Timothy J. Pullen and Wellcome Trust Senior Investigator (WT098424AIA), MRC Programme (MR/J0003042/1), Diabetes UK Project Grant (11/0004210) and Royal Society Wolfson Research Merit Awards to Guy A. Rutter. The work leading to this publication has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement no. 155005 (IMIDIA), resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies' in kind contribution (Guy A. Rutter). We thank Prof. Jorge Ferrer for useful discussion.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 May 2014; accepted: 12 June 2014; published online: 01 July 2014.*

*Citation: Pullen TJ and Rutter GA (2014) Roles of lncRNAs in pancreatic beta cell identity and diabetes susceptibility. Front. Genet. 5:193. doi: 10.3389/fgene.2014.00193*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Pullen and Rutter. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Long non-coding RNA-dependent transcriptional regulation in neuronal development and disease

# *Brian S. Clark<sup>1</sup> and Seth Blackshaw<sup>1,2,3,4,5</sup>\**

<sup>1</sup> Solomon Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA

<sup>2</sup> Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD, USA

<sup>3</sup> Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA

<sup>4</sup> Center for High-Throughput Biology, Johns Hopkins University School of Medicine, Baltimore, MD, USA

<sup>5</sup> Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA

#### *Edited by:*

Yingqun Huang, Yale University School of Medicine, USA

*Reviewed by:*

Antonio Sorrentino, Exiqon A/S, Denmark

Amelia Cimmino, Consiglio Nazionale delle Ricerche, Italy

#### *\*Correspondence:*

Seth Blackshaw, Solomon Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Miller Research Building Room 329, 733 North Broadway, Baltimore, MD 21205, USA e-mail: sblack@jhmi.edu

Comprehensive analysis of the mammalian transcriptome has revealed that long noncoding RNAs (lncRNAs) may make up a large fraction of cellular transcripts. Recent years have seen a surge of studies aimed at functionally characterizing the role of lncRNAs in development and disease. In this review, we discuss new findings implicating lncRNAs in controlling development of the central nervous system (CNS). The evolution of the higher vertebrate brain has been accompanied by an increase in the levels and complexities of lncRNAs expressed within the developing nervous system. Although a limited number of CNS-expressed lncRNAs are now known to modulate the activity of proteins important for neuronal differentiation, the function of the vast majority of neuronal-expressed lncRNAs is still unknown. Topics of intense current interest include the mechanism by which CNS-expressed lncRNAs might function in epigenetic and transcriptional regulation during neuronal development, and how gain and loss of function of individual lncRNAs contribute to neurological diseases.

**Keywords: cell fate, neurogenesis, embryonic stem cells, neural stem cells, transcription factors, epigenetics, long noncoding RNA, molecular scaffold**

# **IDENTIFICATION, CONSERVATION, AND DIVERSITY OF lncRNAs**

Annotation and high-throughput deep sequencing of the transcriptomes of multiple species have led to the belief that much of the genome is transcribed; however, only a minority of transcribed sequences contain evolutionarily conserved open reading frames (Okazaki et al., 2002; Maeda et al., 2006; Kapranov et al., 2007, 2010; Derrien et al., 2012; Dunham et al., 2012). Many of the transcribed sequences are thus unlikely to encode proteins. Among all human non-coding transcripts, at least 10,000 are estimated to be >200 nucleotides, and are accordingly designated as long non-coding RNAs (lncRNAs; Derrien et al., 2012). Based on transcriptome analysis of protein coding genes (Okazaki et al., 2002), transcripts are typically classified as lncRNAs when they do not contain any open reading frame >100 amino acids in length. Although few lncRNAs contain ORFs longer than predicted by pure chance (Dinger et al., 2008, 2011), they also show relatively low levels of evolutionary conservation overall, suggesting that they may encode short, evolutionarily divergent proteins similar to those observed in *Drosophila* (Kondo et al., 2010). Recently, researchers detected a number of evolutionarily conserved sequences that do encode small proteins through both ribosome profiling and mass spectrometry (Bazzini et al., 2014). However, analysis of other mass spectrometry experiments reveals that lncRNAs rarely produce detectable protein products (Banfai et al., 2012; Slavoff et al., 2013). Furthermore, ribosome profiling experiments have indicated that while lncRNAs can associate with ribosomes, ribosome occupancy of lncRNAs displays features more congruent with untranslated regions (5′ UTRs) and other classical ncRNAs, such as small nucleolar RNAs (snoRNAs) and microRNAs (miRNAs; Guttman et al., 2013). Combined with data showing that a large fraction of lncRNA transcripts are retained in the nucleus (Derrien et al., 2012), this suggests that lncRNAs impart functions as RNA transcripts.
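The two operational criteria mentioned above (length >200 nucleotides and no open reading frame >100 amino acids) can be expressed as a simple filter. The following is a minimal sketch under stated assumptions, not any published annotation pipeline: it scans only the three forward frames of the transcript, counts an ORF from an ATG to the first in-frame stop codon, ignores ORFs lacking a stop codon, and uses function names (`longest_orf_aa`, `is_candidate_lncrna`) that are purely illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Length (in amino acids, start Met included, stop excluded) of the
    longest ATG-to-stop ORF in the three forward frames of a transcript."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        current = None  # None means we are not inside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if current is None:
                if codon == "ATG":
                    current = 1
            elif codon in STOP_CODONS:
                best = max(best, current)
                current = None
            else:
                current += 1
    return best


def is_candidate_lncrna(seq: str, min_nt: int = 200, max_orf_aa: int = 100) -> bool:
    """Apply the two thresholds from the text: transcript longer than
    `min_nt` nucleotides and no ORF longer than `max_orf_aa` amino acids."""
    return len(seq) > min_nt and longest_orf_aa(seq) <= max_orf_aa


# Synthetic toy sequences (not real transcripts).
coding_like = "ATG" + "GCT" * 150 + "TAA"   # 456 nt with a 151-aa ORF
noncoding_like = "GCT" * 100                # 300 nt, no ATG in any frame
too_short = "GCT" * 50                      # 150 nt
```

As the surrounding text notes, ORF length alone is an imperfect criterion, so a filter of this kind only nominates candidates; conservation, ribosome profiling, and proteomic evidence are needed to assess coding potential.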

LncRNAs are distinguished from other ncRNA subtypes by several different features. Inherent to the name, lncRNAs are classified as such based on a length of >200 nucleotides, distinguishing them from many ncRNAs including miRNAs, snoRNAs, and others. They are also distinct from transfer RNAs (tRNAs) as they are typically transcribed by RNA polymerase II (RNA Pol II), as opposed to RNA Pol III. Moreover, lncRNAs share many features with protein-coding messenger RNAs (mRNAs): they are capped and polyadenylated. Many lncRNAs also contain multiple exons and are subjected to alternative splicing. However, in comparison to protein-coding transcripts, lncRNAs are roughly one-third as long, contain fewer exons (∼2.8 exons in lncRNAs compared to 11 exons for protein coding genes), and are expressed at 10-fold lower levels on average (Guttman et al., 2010; Cabili et al., 2011; Pauli et al., 2012). In addition, lncRNAs show a higher degree of tissue-specific expression than do protein-coding genes (Cabili et al., 2011). Compared to protein-coding genes, retrotransposon sequences and tandem repeat elements are more frequently included in lncRNA sequences (Ulitsky et al., 2011; Kelley and Rinn, 2012). These elements have been proposed to facilitate lncRNA function through either base pairing with other RNAs with similar repeat sequences, or through as yet unidentified mechanisms (Gong and Maquat, 2011; Carrieri et al., 2012).

The discovery that much of the genome is transcribed bidirectionally has led to a diverse and still not fully standardized categorization of lncRNAs based on genomic localization. Included in the class of lncRNAs are enhancer-related lncRNAs (eRNAs) or transcribed ultra-conserved region lncRNAs (**Figure 1A**), intronic lncRNAs (**Figure 1B**), large/long intergenic or intervening non-coding RNAs (lincRNAs; **Figure 1C**), promoter associated lncRNAs (**Figure 1D**), and natural antisense transcripts (NATs; **Figure 1E**). LincRNAs have been identified through examination of sequencing reads that map expressed transcripts without clearly defined ORFs to intergenic regions. These lincRNAs usually also possess signatures of active transcription including H3K4me3, polyadenylation signals, and RNA polymerase II occupancy (Guttman et al., 2009). LncRNAs not localized to intergenic regions have been less readily identified and were originally described as transcription "noise" due to overlap with protein-coding transcripts or known DNA-regulatory elements such as enhancers.
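A simplified version of this localization-based categorization can be sketched in code. The example below is an illustrative assumption rather than a standard classifier: it distinguishes only intergenic, intronic, and antisense (exon-overlapping, opposite strand) transcripts relative to protein-coding genes, it does not attempt to identify eRNAs or promoter-associated lncRNAs (which would require enhancer and promoter annotations), and the `Gene` type, `classify` function, and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    """A protein-coding gene with exon coordinates (1-based, inclusive)."""
    chrom: str
    strand: str
    exons: Tuple[Tuple[int, int], ...]

    @property
    def start(self) -> int:
        return min(s for s, _ in self.exons)

    @property
    def end(self) -> int:
        return max(e for _, e in self.exons)


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    return a_start <= b_end and b_start <= a_end


def classify(chrom: str, start: int, end: int, strand: str, genes: List[Gene]) -> str:
    """Label a transcript by its position relative to protein-coding genes."""
    labels = set()
    for g in genes:
        if g.chrom != chrom or not overlaps(start, end, g.start, g.end):
            continue
        if any(overlaps(start, end, s, e) for s, e in g.exons):
            labels.add("antisense" if g.strand != strand else "sense_overlapping")
        else:
            # Overlaps the gene span but no exon, so it lies within an intron.
            labels.add("intronic")
    for label in ("antisense", "sense_overlapping", "intronic"):
        if label in labels:
            return label
    return "intergenic"


# Hypothetical two-exon gene on the plus strand (placeholder coordinates).
genes = [Gene("chr1", "+", ((100, 200), (500, 600)))]

intronic = classify("chr1", 250, 400, "-", genes)
antisense = classify("chr1", 150, 300, "-", genes)
intergenic = classify("chr1", 1000, 2000, "+", genes)
```

When a transcript overlaps several genes, this sketch reports the highest-priority label (antisense first), which is one arbitrary way of resolving the ambiguity that the text describes as not fully standardized.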

Reports suggest that intronic lncRNAs, which comprise up to 35% of non-coding transcripts, form the largest single class of lncRNAs (Birney et al., 2007; St Laurent et al., 2012). Although intronic lncRNAs were originally thought to be unprocessed pre-mRNAs of protein coding genes, current estimates suggest that up to 80% of protein coding loci have transcriptionally active introns that are expressed independently from the protein coding pre-mRNA (Dermitzakis et al., 2005; Louro et al., 2008; St Laurent et al., 2012). Further confirmation of the presence of intronic lncRNAs comes from reports that find many intronic lncRNAs localized to the cytoplasm, excluding the possibilities that intronic lncRNAs are the result of genomic DNA or unspliced pre-mRNA contamination in deep sequencing studies (Kampa et al., 2004; Kapranov et al., 2007; Mercer et al., 2008). Intronic lncRNAs are transcribed from either the sense or antisense strand of the protein-coding gene in which they are encoded, further supporting independent transcriptional regulation (Rinn et al., 2003; Bertone et al., 2004; Kampa et al., 2004).

A relatively new class of ncRNAs, eRNAs, result from bidirectional transcription of enhancers. These sequences display H3K4me1 and H3K27ac modifications, and p300/CBP and RNA polymerase II occupancy, and thus show signatures of open or poised chromatin (Heintzman et al., 2007, 2009; Visel et al., 2009; De Santa et al., 2010; Kim et al., 2010; Orom et al., 2010). While many eRNAs are short (<200 nt), there are a considerable number of lncRNAs that display lincRNA-like chromatin signatures and overlap known enhancer sequences. Because of the shared enhancer sequence, these have been further classified as transcribed ultra-conserved region-associated lncRNAs to distinguish them from shorter eRNAs.

NATs, on the other hand, were identified in bacteria and eukaryotes in the 1990s (Wagner and Simons, 1994; Vanhee-Brossollet and Vaquero, 1998). More recent studies indicate that 50–70% of protein coding genes are also transcribed in the antisense direction, with half of these antisense transcripts being non-coding (Carninci et al., 2005; Katayama et al., 2005; Galante et al., 2007). Studies have shown that many NATs display localized expression patterns that correspond inversely with their sense transcript counterparts, suggesting possible negative regulation of sense transcripts by NATs (Vanhee-Brossollet and Vaquero, 1998; Alfano et al., 2005). In contrast, many lncRNAs without overlapping sequence display expression patterns that correlate with nearby protein-coding transcripts (Luo et al., 2013).

#### **FIGURE 1 | Classification of lncRNAs based on genomic localization.** Schematic examples of the classification of lncRNAs based on genomic localization. **(A)** Enhancer-associated RNAs result from direct, bi-directional transcription of enhancer elements. Ultra-conserved enhancer elements are frequently transcribed as part of lncRNA sequences. **(B)** Intronic lncRNAs localize to the introns of protein-coding genes and are transcribed from the anti-sense (pictured) or sense strand (not shown). **(C)** LincRNAs are localized to gene deserts, far removed from proximal promoter elements of neighboring protein-coding genes. **(D)** Promoter-associated lncRNAs are transcribed from segments within proximal promoters of the associated protein-coding gene on the anti-sense (opposite strand lncRNAs; shown) or sense (not shown) strand relative to the protein-coding gene. **(E)** Natural antisense lncRNAs are transcribed from the antisense strand of protein-coding genes and contain complementary sequences to segments of the mature mRNA. Protein-coding exons shown in yellow; lncRNA exons shown in red; overlapping sequence shown in purple.
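The positional categories of Figure 1 can be expressed as a small rule set. The following is a minimal sketch, not a published annotation pipeline: the `Interval` type, the function names, the 1 kb promoter window, and the order in which the rules are tested are all assumptions made for illustration, and real annotation (for example GENCODE biotypes) uses transcript-level evidence rather than simple span overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+", "-", or "." for unstranded features such as enhancers


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base (strand is ignored)."""
    return a.start < b.end and b.start < a.end


def classify_lncrna(lnc, gene, exons, enhancers, promoter_bp=1000):
    """Assign one Figure 1 style label to a lncRNA relative to one protein-coding gene.

    promoter_bp is a hypothetical proximal-promoter size, not a value from the text.
    """
    # Proximal promoter: a window immediately upstream of the gene's 5' end.
    if gene.strand == "+":
        promoter = Interval(max(0, gene.start - promoter_bp), gene.start, gene.strand)
    else:
        promoter = Interval(gene.end, gene.end + promoter_bp, gene.strand)

    antisense = lnc.strand != gene.strand
    hits_exon = any(overlaps(lnc, ex) for ex in exons)

    if any(overlaps(lnc, enh) for enh in enhancers):
        return "enhancer-associated"
    if hits_exon:
        # Sense-strand exon overlap is not one of the Figure 1 classes.
        return "natural antisense" if antisense else "unclassified"
    if overlaps(lnc, gene):
        return "intronic"
    if overlaps(lnc, promoter):
        return "promoter-associated"
    return "lincRNA"


# Toy locus: a plus-strand gene with three exons and one distal enhancer.
gene = Interval(10_000, 20_000, "+")
exons = [Interval(10_000, 10_500, "+"), Interval(15_000, 15_400, "+"),
         Interval(19_000, 20_000, "+")]
enhancers = [Interval(2_000, 2_600, ".")]
print(classify_lncrna(Interval(12_000, 13_000, "-"), gene, exons, enhancers))
```

Because a transcript can satisfy more than one rule, the precedence chosen here (enhancer, then exon overlap, then intron, then promoter) is only one of several defensible orderings.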

Despite the tremendous diversity of lncRNAs, their functional importance has been underappreciated and relatively understudied, in part because they often fail to show clear evolutionary conservation (Ulitsky et al., 2011; Basu et al., 2013). However, previous comparative genomic analyses have identified thousands of non-coding intergenic and intronic ultra-conserved sequence elements (UCEs) in the human genome (Bejerano et al., 2004; Sandelin et al., 2004). Analysis of the genomic localization of UCEs shows that they are preferentially localized to loci encoding DNA-binding proteins (Sandelin et al., 2004). A recent study that incorporated transcriptome data from many different vertebrate species revealed that 4–11% of lncRNAs are conserved across the vertebrate lineage, and many of these map to UCE loci (Ulitsky et al., 2011; Basu et al., 2013). Additionally, although the primary sequence of lincRNAs that are localized in close proximity to protein-coding genes often shows little sequence conservation, synteny between vertebrate lincRNAs and protein-coding genes is often conserved during vertebrate evolution (Ulitsky et al., 2011; Qu and Adelson, 2012a,b). Combined, this suggests that the synteny and evolutionary conservation of these non-coding elements help facilitate the regulated expression of transcription factors through enhancer activity, functional ncRNA transcripts, or both.

Analysis of sequence conservation within transcribed and regulatory regions of individual lncRNAs suggests that the proximal promoters display the highest levels of evolutionary conservation (Carninci, 2007; Ponjavic et al., 2007; Marques and Ponting, 2009; Chodroff et al., 2010). Peak conservation is observed ∼43 bp upstream of the transcription start site, similar to the level of conservation seen across mouse and human protein-coding genes (Taylor et al., 2006; Chodroff et al., 2010). Furthermore, exonic sequence of lncRNAs is more highly conserved than intronic sequence, with exon splice sites showing the highest evolutionary constraint (Chodroff et al., 2010). Short sequences within the lncRNAs are also frequently conserved. A stringent search for miRNAs localized within lncRNA sequence identified 97 lncRNAs that function as potential precursors to miRNA clusters (He et al., 2008). These miRNA sequences display a minimum of 98% homology between rat and mouse, a far greater sequence conservation than observed for lncRNAs as a whole (He et al., 2008).

Quantification of the number of lncRNAs present across multiple species has yielded a wide range of estimates for the number of vertebrate lncRNAs. Stringent estimates suggest that 1133 lncRNAs are expressed during zebrafish development (Pauli et al., 2012). Consistent with an evolutionary increase in number, size, and divergence of regulatory elements as species become more complex (Mazumder et al., 2003; Frith et al., 2005; Taft et al., 2007), conservative estimates from recent GENCODE sequencing builds in mouse and humans (July, 2013) indicate the presence of 4074 and 13,870 lncRNAs, respectively (Derrien et al., 2012; Harrow et al., 2012). Estimates from mouse suggest that 849 of the 1328 lncRNAs examined by *in situ* hybridization show specific expression patterns in the adult brain (Mercer et al., 2008). More comprehensive analysis using RNA deep-sequencing technologies will help further elucidate and identify the exact number of lncRNAs expressed during neuronal development.

Of the 13,870 identified human lncRNAs, approximately one-third are unique to the primate lineage (Derrien et al., 2012), suggesting that ncRNA-dependent regulation of brain development may have contributed to the evolution of higher cognitive functions (Barry and Mattick, 2012; Barry, 2014). Consistent with this idea, 47 of 49 evolutionarily conserved sequences displayed substitution rates between human and chimpanzee that were statistically higher than those observed for other sequences across amniote evolution (Pollard et al., 2006). Of the non-coding human accelerated regions (HARs), a quarter mapped to locations adjacent to genes that regulate neural development (Pollard et al., 2006). HAR1F, the most rapidly evolving sequence of all, encodes a lncRNA that is prominently expressed in the developing and adult brain. Although the function of HAR1F is still unknown, this presents a tantalizing link between lncRNAs and the formation of the proportionally larger and more complex human brain (Pollard et al., 2006).

The large number of lncRNAs that display neuronal-specific expression suggests an important role of lncRNAs in the neuronal diversification seen in higher vertebrates (Cao et al., 2006; Amaral et al., 2008; Chodroff et al., 2010; Qureshi et al., 2010). Additionally, the spatially and temporally restricted expression patterns of many lncRNAs indicate that their expression is tightly regulated, suggesting that lncRNAs may control the specification and function of individual neuronal subtypes (Mercer et al., 2008). While functional characterization of neuronal-enriched lncRNAs is still limited, broader studies of lncRNA function have implicated lncRNAs as regulators of transcription through both epigenetic regulation of chromatin structure and RNA–transcription factor interactions. Here we focus on reviewing recent advances in the identification and functional analysis of lncRNAs implicated in the transcriptional control of neural development.

# **MECHANISMS OF lncRNA-DEPENDENT TRANSCRIPTIONAL REGULATION**

In general, lncRNAs function either in *cis*, within the same genomic locus, or in *trans*, affecting gene transcription in a different locus or even on different chromosomes. Many lncRNAs, including the intensely studied *Xist* and *HOTAIR* ncRNAs, function through recruitment of the Polycomb repressive complex 2 (PRC2) by binding to PRC2 component histone-lysine N-methyltransferase Ezh2, leading to a local increase in H3K27me3 content and subsequent transcriptional repression (Zhao et al., 2008, 2010; Tsai et al., 2010; Guil and Esteller, 2012). However, other lncRNAs, like the *BORDERLINE* lncRNA, are shown to inhibit repressive histone modifications either solely through their transcription or by binding to and removing the heterochromatin protein 1 (HP1/Swi6) from the locus (Keller et al., 2013). The diverse functions observed for the handful of characterized lncRNAs studied so far underscore the importance of analyzing lncRNA function on an individual basis.

### **NATURAL ANTISENSE TRANSCRIPTS**

Natural antisense transcripts (NATs) are lncRNAs that are transcribed from the opposite strand (OS) of protein-coding genes and therefore share sequence complementarity with them. The degree of complementarity of NATs with corresponding sense transcripts varies greatly; however, genome-wide analysis suggests that localization of antisense transcription is generally confined to 250 bp upstream of the sense transcript's transcription start site and 1.5 kb downstream of the sense gene (Sun et al., 2005; Core et al., 2008; Seila et al., 2008). As previously reviewed, NATs mediate their function through transcriptional and epigenetic regulation, RNA–DNA interactions, and RNA–RNA interactions (Faghihi and Wahlestedt, 2009; Magistri et al., 2012). While there are clear examples of antisense transcripts that directly inhibit protein-coding gene expression (Werner et al., 2014), the inhibition is probably not mediated by complementary base pairing of sense–antisense transcripts. Since most lncRNAs are expressed at much lower levels than neighboring protein-coding genes, the stoichiometry between sense–antisense pairs is insufficient to simply block splicing or translation of protein-coding genes.
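The positional bounds quoted above (250 bp upstream of the sense transcription start site, 1.5 kb downstream of the sense gene) can be turned into a strand-aware genomic window. This is a minimal sketch under one reading of those figures, namely that they delimit a single contiguous span covering the gene; that interpretation, the function name, and the coordinate convention are assumptions for illustration rather than the method of the cited studies.

```python
def antisense_window(tss, gene_end, strand, upstream_bp=250, downstream_bp=1500):
    """Return (start, end) of the genomic span in which antisense transcription is expected.

    tss      -- genomic coordinate of the sense gene's transcription start site
    gene_end -- genomic coordinate of the sense gene's 3' end
    strand   -- "+" or "-" for the sense gene
    The default sizes are the figures quoted in the text.
    """
    if strand == "+":
        # Upstream lies at lower coordinates, downstream at higher ones.
        return (max(0, tss - upstream_bp), gene_end + downstream_bp)
    if strand == "-":
        # On the minus strand the directions are mirrored.
        return (max(0, gene_end - downstream_bp), tss + upstream_bp)
    raise ValueError("strand must be '+' or '-'")


# Hypothetical 10 kb gene on either strand.
print(antisense_window(10_000, 20_000, "+"))
print(antisense_window(20_000, 10_000, "-"))
```

Handling the strand explicitly matters because "upstream" and "downstream" are defined relative to the direction of sense transcription, not to increasing genomic coordinate.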

One topic of particular current interest is the role of NATs that work in conjunction with epigenetic modifiers. Many imprinted genes are found in genomic clusters and have NATs located within the same locus (Verona et al., 2003; Katayama et al., 2005; Wan and Bartolomei, 2008; Mohammad et al., 2009). Imprinting of the locus is facilitated through allele-specific expression of NATs and corresponding interactions with epigenetic modifiers. For example, the NAT *Air* interacts with HMT G9a while *Kcnq1ot1* interacts with PRC2 components and HMT G9a (Nagano et al., 2008; Pandey et al., 2008; Terranova et al., 2008). Through complementary base pairing and RNA–protein interactions, the NAT transcript allows sequence-specific recruitment of chromatin modifiers to the locus. For both *Air* and *Kcnq1ot1*, NAT expression from the paternal allele corresponds to paternal allele silencing through chromatin condensation and bidirectional spreading of epigenetic marks (Nagano et al., 2008; Kanduri, 2011). Epigenetic control of protein-coding genes by NATs is also observed in non-imprinted loci. For example, brain-derived neurotrophic factor (BDNF) is regulated by the NAT *BDNF-AS*. Loss of *BDNF-AS* is accompanied by increased *BDNF* transcript abundance, facilitated through an altered chromatin state (Modarresi et al., 2012).

### **INTRONIC ncRNAs**

While reports suggest that up to 35% of lncRNAs localize to intronic sequences, little is known about the function of these sequences (St Laurent et al., 2012). Surprisingly, intronic ncRNAs are predominantly associated with the sense strand of the unprocessed mRNA, but often show expression patterns that are inversely correlated with the processed mRNA (Katayama et al., 2005; Nakaya et al., 2007; Dinger et al., 2008; Mercer et al., 2008). This suggests a complex regulatory relationship in which intronic ncRNA transcription may be independent of transcription of the protein-coding pre-mRNA. In some cases, these intronic ncRNAs are precursor transcripts to miRNAs. Recent work has also suggested that many intron-derived RNAs bind to Ezh2 of the PRC2 complex, thus recruiting chromatin structure modifiers to the locus to silence transcription (Guil and Esteller, 2012; Guil et al., 2012).

### **NON-CODING OPPOSITE-STRAND TRANSCRIPTS (ncOSTs), PROMOTER-ASSOCIATED lncRNAs, ENHANCER-ASSOCIATED RNAs (eRNAs), ULTRACONSERVED ELEMENT-ASSOCIATED lncRNAs, AND CIRCULAR RNAs**

Some lncRNAs are transcribed from the proximal promoters in the opposite direction of protein coding genes, and have been termed "opposite strand" transcripts. Conservative estimates suggest that one-third of brain-enriched transcription factors express corresponding *OS* transcripts and that many of these may act in *cis* to regulate protein-coding gene transcription (Alfano et al., 2005; Rapicavoli et al., 2010). Many *OS* transcripts display correlated expression patterns with neighboring protein-coding genes as a result of bi-directional promoters initiating transcription of both the lncRNA and protein-coding gene (Uesaka et al., 2014). Recent reports analyzing the function of *Six3OS* and *Vax2OS*, however, indicate that some *OS* transcripts function in *trans*, and not by regulating expression of their neighboring protein-coding gene (Rapicavoli et al., 2011; Meola et al., 2012).

Other promoter-associated lncRNAs overlap proximal promoter sequences but are transcribed from the sense strand relative to the protein-coding gene. The transcription of the lncRNA itself can positively impact transcription in *cis* of the protein-coding gene, by changing chromatin conformation to permit transcription factor recruitment, leading to initiation of protein-coding gene transcription. Alternatively, promoter-associated lncRNAs can inhibit protein-coding gene transcription through one of two different proposed mechanisms. Chromatin de-condensation that occurs as a result of transcription of a lncRNA within the promoter region of a protein-coding gene may inhibit transcription of nearby genes by altering DNA supercoiling. Conversely, it was recently shown that transcription of the *CCND1* promoter-associated lncRNA (*CCND1-pncRNA*) recruits the TLS protein to the promoter of *CCND1* during DNA damage. The recruitment of TLS reduces transcription of *CCND1* by inhibiting the histone acetyltransferase activity of CBP/p300 at the gene's promoter (Wang et al., 2008; Kurokawa, 2011). This further suggests that some promoter-associated lncRNAs may regulate transcription of neighboring protein-coding genes through recruitment of chromatin-modifying complexes.

Past efforts in comparative genetics have identified thousands of sequences that display high sequence constraints across evolution (Bejerano et al., 2004; Woolfe et al., 2005; Pennacchio et al., 2006). These ultra-conserved regions (UCRs) identified from *Fugu rubripes* and human (Woolfe et al., 2005; Pennacchio et al., 2006) or human, mouse, and rat (Bejerano et al., 2004) are at least 200 bp and display >90% sequence conservation. The UCRs tend to cluster around genes pertinent to the regulation of organism development. Therefore, the preferential localization and high degree of sequence conservation have led to the hypothesis that these UCRs are vital to the regulation of development. Further studies analyzing these sequences have identified that many function as enhancer sequences (Woolfe et al., 2005; Pennacchio et al., 2006). However, in these studies, many UCRs also overlapped known expressed sequence tag (EST) transcripts that were rationalized as genomic contamination or incompletely spliced pre-mRNA (Bejerano et al., 2004; Woolfe et al., 2005). Roughly 240 UCRs (50%) and 84 UCRs (6%) showed evidence for transcription in the Bejerano et al. (2004) and Woolfe et al. (2005) studies, respectively. Additional work on UCRs has since confirmed that these enhancers/UCRs can indeed be transcribed into non-coding sequence.

The discovery that many enhancers or ultra-conserved elements are not only platforms for transcription factor binding but also are transcribed themselves has stimulated studies of the role played by eRNA transcription in the regulation of neighboring genes (De Santa et al., 2010; Kim et al., 2010; Licastro et al., 2010; Wang et al., 2011). Most eRNAs are short sequences, resulting from bi-directional transcription of enhancer sequences. They exhibit H3K4me1-enriched sequences and lack poly-adenylation signals. One proposed mechanism of eRNA function in transcriptional regulation is a ripple effect, a process where growth factor-induced immediate-early gene transcription triggers initiation of transcription at nearby promoters (Ebisuya et al., 2008). One could postulate that eRNAs may function in a similar manner, with transcription factors binding to enhancers, recruiting the transcriptional machinery to the enhancer to induce eRNA transcription and chromatin modifications, leading to activation of neighboring genes. To date, however, no evidence is available to suggest such a mechanism exists for eRNA-dependent regulation of transcription. The expression of eRNAs generally correlates with activation of the neighboring gene(s) (Lee, 2012; Li et al., 2013b; Melo et al., 2013; Memczak et al., 2013).

Examples exist, however, in which lncRNA sequence overlaps the ultraconserved enhancer sequence of neighboring genes (Feng et al., 2006). Transcription of lncRNA sequences from ultraconserved sequences, in sharp contrast to eRNAs, may actually inhibit antisense gene transcription of neighboring targets (Bond et al., 2009). LncRNAs that overlap ultraconserved element sequences have been shown to possess characteristics more similar to lincRNAs, and therefore are not typically classified as eRNAs. These signatures include H3K4me3 and modification by 3′-polyadenylation. Although many of the transcribed ultraconserved elements overlap known enhancer sequences, only a minority (∼4%) are transcribed bi-directionally and are unlikely to encode short RNAs (Licastro et al., 2010). Together, this provides further indication that these lncRNAs are more similar to lincRNAs than eRNAs. One possible mechanism of function for lncRNAs that overlap ultraconserved enhancer regions comes from recent studies of the lncRNA *CCAT1-L*. *CCAT1-L* is expressed from a super enhancer region 515 kb upstream of the *MYC* locus and positively regulates *MYC* transcription by facilitating chromatin interactions between the *MYC* proximal promoter and enhancer elements (Xiang et al., 2014). This suggests that lncRNAs may facilitate transcription factor recruitment to specific DNA sequences, a potential mechanism discussed in further detail below.

Circular RNAs (circRNAs) define a more unconventional and less well understood class of functional ncRNAs. These unique transcripts were originally identified in plants where they function to encode subviral components (Sanger et al., 1976). In animal species, these transcripts are thought to arise from joining of 5′ and 3′ splice sites within a single exon to form the circular transcript (Nigro et al., 1991; Capel et al., 1993; Cocquerelle et al., 1993; Chao et al., 1998; Burd et al., 2010; Hansen et al., 2011; Salzman et al., 2012). Recent profiling of mouse, human, and *Caenorhabditis elegans* identified thousands of conserved circRNAs (Memczak et al., 2013). The identified circRNAs are often highly conserved, leading the authors to hypothesize that the circRNA transcripts function as molecular decoys for RNA-binding proteins and miRNAs (Memczak et al., 2013).

### **LincRNAs**

Other lncRNAs do not overlap with either protein-coding genes or promoter or enhancer sequences. These are collectively termed long intergenic ncRNAs (lincRNAs). Analyses of the correlated expression patterns of lincRNAs and transcripts of neighboring protein-coding genes imply that lincRNAs participate in similar biological processes to neighboring protein-coding genes (Luo et al., 2013). This has been interpreted to mean that many lincRNAs may function in *cis* to regulate expression of nearby genes (Luo et al., 2013). However, this finding also raises the possibility that lincRNAs might act in *trans* to directly or indirectly regulate the activity of co-expressed protein-coding genes through RNA–protein interactions.

One classical example of lincRNA function comes from studies of *HOTAIR*. *HOTAIR* is transcribed from an intergenic region in the *HOXC* locus and is involved in recruitment of chromatin modifiers to hundreds of genomic loci (Rinn et al., 2007; Tsai et al., 2010; Chu et al., 2011). Through interactions with the PRC2 and LSD1 complexes, *HOTAIR* promotes H3K27 methylation and H3K4 demethylation, respectively, leading to gene silencing (Rinn et al., 2007; Tsai et al., 2010; Chu et al., 2011). More specifically, *HOTAIR* expression silences expression of genes from the *HOXD* locus, thereby facilitating *HOXC* locus gene expression specifying positional identity of the *HOTAIR*-expressing cells (Rinn et al., 2007). Knockout of *Hotair* in mice causes skeletal defects including homeotic transformation of vertebrae resulting from de-repression of multiple *HoxD* cluster genes, increased expression of ∼30 genes from imprinted loci, and loss of vertebral boundary specification during development (Li et al., 2013a).

# **COMMON FUNCTIONAL PROPERTIES OF lncRNA-DEPENDENT TRANSCRIPTIONAL REGULATION**

As previously stated, many lncRNAs have been proposed to function through interactions with chromatin modifiers. In fact, it is estimated that ∼30% of all lincRNAs expressed in mouse ES cells interact with one or more of 11 particular chromatin modifiers (Khalil et al., 2009; Guttman et al., 2011). This has been extrapolated to suggest that interaction with chromatin regulators is the major mechanism by which lncRNAs regulate transcription. However, many of these lncRNAs display predominantly cytoplasmic expression, suggesting instead that they may have additional cellular functions. Furthermore, there is reason to suspect that the selectivity of lncRNA–chromatin modifier interactions may have been overestimated. Experiments using overexpressed, tagged lncRNAs followed by mass spectrometry do not take into account the low transcript abundance levels seen for most lncRNAs. Chromatin-modifying enzymes are likewise abundantly expressed in virtually all cell types, particularly in comparison to transcription factors. It is thus possible that weak and possibly non-physiological interactions between lncRNAs and chromatin-modifying proteins may be detected using mass spectrometry. This may include weak interactions of highly expressed proteins that have known RNA binding potential, such as PRC2 complex proteins. Furthermore, recent reports suggest that the PRC2 protein complex is quite promiscuous in its RNA binding specificity (Davidovich et al., 2013). A more systematic interrogation of potential lncRNA–protein interactions using techniques that control for the abundance of both lncRNA and protein, such as protein microarrays, will help clarify this issue.

# **LncRNAs AS MOLECULAR SCAFFOLDS FOR ORGANIZING TRANSCRIPTION AND SIGNALING**

The characterization of individual lncRNAs suggests that lncRNAs may serve as molecular scaffolds (**Figure 2**). Aptamer selection experiments reveal that it is relatively easy to evolve RNAs that show moderate binding affinity to a broad range of substrates, including proteins and small molecules, and demonstrate that aptamer–protein interactions show far less constraint at the level of primary sequence than do protein–protein interactions (Wilson and Szostak, 1999; Kang and Lee, 2013). In combination with homologous Watson–Crick base pairing, which provides a ready means by which RNA can selectively interact with other nucleic acid targets, this allows lncRNAs to act as molecular hubs that facilitate assembly of macromolecular complexes that can include proteins, DNA, and other RNAs.

If secondary structure primarily underlies lncRNA–protein interactions, as implied by aptamer studies, conventional sequence alignment software may not be optimal for identifying functional lncRNAs. Indeed, recent reports suggest that >20% of human RNAs display evolutionarily conserved secondary structures independent of primary sequences (Smith et al., 2013). Reports analyzing interactions of the lncRNA *Xist*, *RepA*, or other short ncRNAs suggest that a double stem–loop structure is sufficient for PRC2 binding (Zhao et al., 2008; Kanhere et al., 2010). The presence of short repeats within lncRNAs that display conserved secondary structure can then facilitate protein recruitment to the regions where the lncRNA is localized. This has been recently exemplified by the lncRNA *Firre*, which contains repeat domains for nuclear matrix factor hnRNPU binding (Hacisuleyman et al., 2014). In serving as a scaffold, *Firre* is thought to mediate intra-chromosomal bridging and focalized transcription of *Firre*-regulated targets (Hacisuleyman et al., 2014).

#### **FIGURE 2 | LncRNA regulation of transcription and translation by acting as scaffolds to facilitate interactions between macromolecules.** Schematic examples of how lncRNAs participate in RNA–DNA, RNA–RNA, and RNA–protein interactions to facilitate the regulated expression of protein-coding genes. **(A)** LncRNAs like Six3OS interact with chromatin-modifying complexes to regulate gene transcription. Additionally, lncRNAs can interact with transcription factors to facilitate target gene expression. **(B)** Complementary sequence on lncRNAs with enhancer sequences is proposed to enable chromatin looping to regulate gene transcription. **(C)** Expression of lncRNAs that contain repeat sequences for protein binding helps facilitate co-regulated transcription of multiple targets, including transcription across different chromosomes. **(D)** LncRNAs are implicated in the formation and maintenance of nuclear paraspeckles that facilitate alternative splicing events of nascent transcripts. **(E)** Through homologous base-pairing with mRNA transcripts and interactions with ribosomal proteins and/or RNAs, lncRNAs are able to target mRNAs to the ribosomes. **(F)** Containing complementary target sequences, lncRNAs also serve as miRNA decoys to prevent interactions of miRNAs with protein-coding transcripts.
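The point that secondary structure can be conserved while primary sequence diverges can be made concrete with a toy check. The sketch below is purely illustrative: the two 12-nt sequences and the single hairpin in dot-bracket notation are invented for this example and are not taken from *Xist*, *RepA*, or any real lncRNA, and the pairing test only verifies base complementarity at paired positions rather than predicting a thermodynamically stable fold.

```python
# Watson-Crick pairs plus the G-U wobble pair, as commonly allowed in RNA helices.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(dot_bracket):
    """Return the (i, j) index pairs encoded by a dot-bracket structure string."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def fits_structure(seq, dot_bracket):
    """True if every paired position in the structure holds complementary bases."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    return all((seq[i], seq[j]) in PAIRS for i, j in paired_positions(dot_bracket))


def identity(a, b):
    """Fraction of aligned positions with identical bases (equal-length inputs)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


hairpin = "((((....))))"      # 4-bp stem closing a 4-nt loop (hypothetical)
seq_a = "GGCAUUCGUGCC"        # invented sequence
seq_b = "CAUGGAAACAUG"        # invented sequence with compensatory changes

print(fits_structure(seq_a, hairpin), fits_structure(seq_b, hairpin))
print(identity(seq_a, seq_b))
```

Both toy sequences are compatible with the same stem even though they share no aligned bases, which is the situation in which sequence-alignment tools would miss a structurally conserved element.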

Further evidence of lncRNAs serving as molecular scaffolds comes from studies analyzing lncRNA co-localization with the nuclear paraspeckles, domains that are thought to be locations of retained RNAs where alternative splicing events are regulated (reviewed in Spector and Lamond, 2011). The highly expressed nuclear lncRNAs *Neat1* and *Malat1* both localize to these nuclear subdomains (Clemson et al., 2009; Tripathi et al., 2010). *Neat1* induces paraspeckle formation and *Malat1* recruits splicing factors to these domains (Clemson et al., 2009; Tripathi et al., 2010). Through both RNA–RNA interactions and RNA–protein interactions, these lncRNAs are thus implicated in regulating splicing.

Analysis of the lncRNA *Hotair* suggests that lncRNAs can also regulate post-transcriptional processes. *Hotair* associates with the RNA-binding and ubiquitin ligase proteins Dzip3 and Mex3b (Yoon et al., 2013). Additionally, *Hotair* binds the ubiquitin ligase substrates Ataxin-1 and Snurportin-1, thereby facilitating interaction of the proteins and ubiquitin-dependent degradation of Ataxin-1 and Snurportin-1 (Yoon et al., 2013). Additional studies like these are required to address the functions of the multitude of lncRNAs that are expressed in the cytoplasm and that do not directly regulate chromatin modifications and gene transcription (van Heesch et al., 2014).

# **LncRNAs IN THE DEVELOPING NERVOUS SYSTEM**

Transcript expression analyses within the nervous system have shown an abundance of lncRNAs that display spatially restricted and temporally dynamic expression (Blackshaw et al., 2004; Mehler and Mattick, 2007; Mercer et al., 2008; Aprea et al., 2013; Luo et al., 2013; Lv et al., 2013). In fact, lncRNAs generally display more tissue specificity than protein-coding genes (Luo et al., 2013). The spatial and temporal regulation of lncRNAs is therefore hypothesized to promote neuronal diversification and specification. Indeed, comparative analyses of sequences from human and chimpanzee brains identified non-coding HARs that display fast evolution and are correlated with human-specific brain functions (Pollard et al., 2006). The HARs and many other lncRNAs display preferential genomic localization near protein-coding genes involved in neurodevelopment and are proposed to function through *cis*-regulation of the locus (Dinger et al., 2008; Luo et al., 2013), further implicating the requirement of lncRNA function in neurodevelopment. In addition, the biological significance of lncRNAs in the developing nervous system is beginning to be understood through both loss- and gain-of-function experiments analyzing individual lncRNAs. Information regarding the identity and function of lncRNAs expressed in the developing central nervous system (CNS) is summarized in **Table 1**.

#### **INSIGHTS FROM CONTROLLED DIFFERENTIATION OF ES CELLS**

Recent studies have focused on the identification of lncRNAs expressed during neuronal differentiation, either in stem cells or *in vivo*. The rationale behind these studies is that lncRNAs displaying dynamic expression across developmental stages are likely to participate in differentiation. For example, expression profiling of embryoid body (EB) differentiation of mouse embryonic stem (ES) cells revealed 174 lncRNAs that displayed differential expression patterns (Dinger et al., 2008). Consistent with previous reports on protein-coding gene expression in pluripotent cells (Ivanova et al., 2002; Ramalho-Santos et al., 2002; Bruce et al., 2007; Dinger et al., 2008), twice as many lncRNAs were expressed during pluripotent stages versus more committed lineages (Dinger et al., 2008). Overall, 12, 7, and 31 lncRNAs displayed dynamic expression patterns consistent with pluripotency, primitive streak formation/gastrulation, and hematopoiesis, respectively, with many lncRNAs displaying expression patterns that correlated positively with those of neighboring protein-coding genes (Dinger et al., 2008). Further reports have identified 226 lncRNAs expressed in pluripotent ES cells (Guttman et al., 2009, 2010), 137 of which showed a significant impact on ES cell gene expression when knocked down (Guttman et al., 2011). Importantly, loss-of-function studies indicated that 26 of these lncRNAs function to maintain ES cell pluripotency (Guttman et al., 2011). In both studies, many identified lncRNAs were proposed to regulate gene transcription through identified RNA–protein interactions between the lncRNA and protein components of chromatin-modifying complexes (Dinger et al., 2008; Guttman et al., 2011). The importance of lncRNAs in pluripotency was further confirmed by the observation that two lncRNAs, themselves transcriptional targets of Oct4 and Nanog, regulate pluripotency through a feedback loop regulating *Oct4* and *Nanog* transcript expression (Sheik Mohamed et al., 2010).

Additional studies have more specifically characterized the requirement of lncRNAs in neural and oligodendrocyte induction (Mercer et al., 2010; Ng et al., 2012). In comparing neural progenitor cells differentiated from human ES cells, Ng et al. (2012) observed 934 of 6671 lncRNAs that displayed differential expression by microarray analysis. Similar to previous reports in mouse ES cells, 36 lncRNAs displayed expression patterns consistent with regulation of pluripotency, three of which were experimentally shown to regulate pluripotency through knockdown studies and contained OCT4- and NANOG-binding sites in their proximal promoter. Further characterization through RNA immunoprecipitation (RIP) indicated that two lncRNAs interacted directly with the pluripotency transcription factor SOX2 and the PRC2 chromatin-modifying complex component SUZ12 (Ng et al., 2012). In these studies, 35 lncRNAs displayed expression patterns consistent with a role in neural induction, four of which were studied and shown to be required for proper neural differentiation. Of these four lncRNAs, one (AK055040) was shown to interact with SUZ12, indicating a functional role in chromatin modifications in the regulation of neurogenesis. An additional lncRNA (AK124684) was found to interact with the transcription factor REST (Ng et al., 2012), a master negative regulator of neurogenesis that binds to the promoters of neurogenic genes to inhibit gene transcription (Ballas et al., 2005; Abrajano et al., 2009; Gao et al., 2011). A third lncRNA (AK091713) was subsequently shown to contain miRNAs miR-125b and let-7a within its intronic sequence, thereby driving neurogenesis through the expression of neurogenic miRNAs (Rybak et al., 2008; Le et al., 2009; Ng et al., 2012). Other studies identified that the lncRNAs *Six3OS* and *Dlx1AS* are required for directed differentiation of pluripotent stem cells towards a neuronal precursor identity (Ramos et al., 2013).

#### **Table 1 | LncRNAs in neurodevelopment and neurodevelopmental disorders.**

Among lncRNAs found to regulate neurogenesis, the lincRNA *RMST* was targeted for additional follow-up studies. *RMST* in humans is located ∼150 kb away from the closest annotated protein-coding gene (Ng et al., 2012). The promoter of *RMST* contains REST binding sites and is occupied by REST, suggesting that *RMST* is activated during neurogenesis through dissociation of REST from the promoter (Ng et al., 2013). Analysis of *RMST* revealed that *RMST* promotes neurogenesis through inhibition of glial fates (Ng et al., 2013). RNA pull-down experiments indicated that *RMST* interacts with the RNA-binding protein hnRNPA2/B1 and SOX2, both of which are also required for neuronal differentiation (Ng et al., 2013). Ultimately, it was observed that *RMST* regulates neuronal differentiation by directing SOX2 to the promoters of neurogenic transcription factors, to promote neurogenic gene expression and neural fate commitment (Ng et al., 2013). *RMST* does not bind REST or the PRC2 chromatin-modifying complex protein SUZ12 (Ng et al., 2012). Using both RIP and chromatin isolation by RNA purification (ChIRP) to identify DNA-binding sites of lncRNAs (Chu et al., 2011), the researchers provided evidence that *RMST* binds to promoters of Sox2 target genes, and activates transcription of these genes by recruiting Sox2 (Ng et al., 2013). The mechanism by which *RMST* is recruited to Sox2 consensus binding sites is unclear, but is postulated to occur through homologous base pairing that leads to the formation of RNA–DNA hybrids (Ng et al., 2013). If this is the case, this may turn out to be a more general mechanism by which *trans*-acting lncRNAs regulate gene expression.

Similar to *RMST*, *utNgn1* and *Sox2dot* display expression profiles that positively correlate with differentiation of neural progenitors (Amaral et al., 2009; Onoguchi et al., 2012). Importantly, both of these lncRNAs overlap sequences of ultraconserved elements implicated in neuronal development (Amaral et al., 2009; Onoguchi et al., 2012). *UtNgn1* is required for *Neurogenin1* (*Neurog1*) transcription, and PRC2-mediated repressive signals at the *utNgn1* locus are associated with decreases in both *utNgn1* and *Neurog1* transcript abundance (Onoguchi et al., 2012). Inhibition of *utNgn1* expression during mouse cortical progenitor differentiation resulted in decreased expression of neurogenic markers, consistent with a role of *utNgn1* in promoting neurogenesis through activation of *Neurog1* transcription (Onoguchi et al., 2012). The exact mechanism by which transcription of *utNgn1* at the *Neurog1* enhancer mediates *Neurog1* transcription remains elusive. Similarly, expression of *Sox2dot* in the neurogenic regions of the brain suggests that it functions to regulate neural development (Amaral et al., 2009). The function of *Sox2dot* in neural development remains to be investigated.

More recent experiments have identified 20 additional lncRNAs that regulate pluripotency. In particular, the lncRNA *TUNA* was shown to be highly conserved among vertebrates and to be expressed within the developing nervous system (Lin et al., 2014). *TUNA* was shown to regulate pluripotency through the binding of three RNA binding proteins and co-occupancy of the RNA–protein complex at the promoters of *Sox2*, *Nanog*, and *Fgf4*. Inhibition of *TUNA* resulted in a decreased capacity of mESCs to differentiate to neural lineages (Lin et al., 2014). Consistent with a role in regulating neural development, *TUNA* expression is correlated with Huntington's disease (HD) prognosis, and inhibition of *TUNA* in zebrafish results in locomotor defects (Lin et al., 2014).

### **CONTROL OF NEURAL DEVELOPMENT** *IN VIVO* **BY LncRNAs**

#### *LncRNAs in retinal development*

While the identification and validation of lncRNA function during neuroectodermal differentiation of cultured ES cells have provided a wealth of information regarding which lncRNAs to target, relatively few studies have begun to examine the role of individual lncRNAs *in vivo* during neurodevelopment. To date, many examples of lncRNA function *in vivo* come from studies analyzing the role of ncRNAs during retinal development (reviewed in Rapicavoli and Blackshaw, 2009). Specifically, four lncRNAs have been implicated in regulating cell fate decisions during retinal development. *Tug1* was identified in a screen to characterize genes that display enhanced expression in response to taurine, which promotes rod photoreceptor differentiation (Altshuler et al., 1993; Young et al., 2005). *Tug1* knockdown resulted in abnormal morphology of the inner and outer segments of photoreceptors, accompanied by increased cell death and an increase in the percentage of electroporated cells expressing the cone photoreceptor marker peanut agglutinin (PNA; Young et al., 2005). Studies analyzing the interactions of lincRNAs with chromatin-modifying complexes identified an association between *TUG1* and the PRC2 complex (Khalil et al., 2009). Further characterization of *Tug1* revealed that it is activated in a p53-dependent manner, and loss of *Tug1* results in the up-regulation of ∼120 genes, most of which are involved in cell-cycle regulation (Guttman et al., 2009; Khalil et al., 2009). Combined, these results indicated that *Tug1* functions to promote rod genesis by inhibiting cone photoreceptor cell fate through its interactions with repressed chromatin (Young et al., 2005; Khalil et al., 2009). However, only a subset of cellular *Tug1* RNA is localized to the nucleus, suggesting that other mechanisms of *TUG1* function may exist (Khalil et al., 2009).

The lncRNA *Vax2os* has been shown to display predominantly retinal expression, specifically at post-natal periods during mouse development (Alfano et al., 2005). *Vax2os1* was also found to regulate mouse photoreceptor differentiation (Meola et al., 2012). *Vax2os1* is endogenously expressed in the ventral retina of mice, primarily localizing to the outer neuroblastic layer of the developing retina. Overexpression of *Vax2os1* increases the proportion of proliferating cells in the dorsal retina (where endogenous expression of *Vax2os1* is low) through perturbation of cell cycle progression in neural progenitors (Meola et al., 2012). The increase in proliferative progenitors and increased apoptosis in *Vax2os1*-overexpressing cells resulted in a decrease in photoreceptor differentiation (Meola et al., 2012).

The lncOST *Six3os* is co-expressed with the homeodomain transcription factor *Six3* in retinal progenitor cells (Blackshaw et al., 2004). *Six3os* is juxtaposed to *Six3* and transcribed in the opposite direction of *Six3* in both mouse and human (Alfano et al., 2005; Rapicavoli et al., 2011). *Six3os*, however, does not regulate *Six3* transcription. The *Six3os* transcript forms an RNA–protein complex with transcriptional co-regulators of Six3 such as Eya1, but not with Six3 itself, suggesting that *Six3os* controls expression of Six3 target genes (Alfano et al., 2005; Rapicavoli et al., 2011). Furthermore, *Six3os* interacts with the Ezh2 component of the PRC2 complex (Zhao et al., 2010; Rapicavoli et al., 2011), suggesting that *Six3os* may function to repress Six3 targets by triggering H3K27me3 modification. This is further supported by experiments in which *Six3os* overexpression blocked changes in retinal cell fate induced by *Six3* overexpression (Rapicavoli et al., 2011). Inhibition of *Six3os* expression resulted in a decrease in rod bipolar cells with a concomitant increase in Müller glial cell number (Rapicavoli et al., 2011). This phenotype is similar to loss of function of *Six3* alone (Zhu et al., 2002; Rapicavoli et al., 2011).

Another lncRNA that is prominently expressed in the retina and has recently been functionally characterized is *Gomafu* (also known as *RNCR2* and *Miat*). *Gomafu* is one of the most abundant polyadenylated RNAs found in the neonatal retina (Blackshaw et al., 2004), and is expressed widely throughout the nervous system, displaying nuclear localization to a novel nuclear domain within neural precursors (Ishii et al., 2006; Sone et al., 2007; Chen and Carmichael, 2010). Overexpression of *Gomafu* in the developing retina had no effect on retinal development, presumably due to the already high abundance of *Gomafu* transcript (Rapicavoli et al., 2010). Inhibition of *Gomafu* expression/function resulted in an increase in amacrine and Müller glial cells in the developing retina, suggesting that *Gomafu* negatively regulates amacrine cell fate specification and delays Müller glial cell specification (Rapicavoli et al., 2010). Additional studies on *Gomafu* revealed that it selectively binds splicing regulators such as SF1 and Qk, and that its loss of function disrupts splicing of a subset of neuronal pre-mRNAs (Tsuiji et al., 2011; Barry et al., 2013). However, the mechanism by which *Gomafu*-dependent mRNA splicing affects amacrine and Müller glial cell specification remains elusive. Since many other lncRNAs are prominently expressed in the developing retina (Blackshaw et al., 2004), further studies will undoubtedly identify additional instances in which lncRNAs regulate the expression and/or activity of protein-coding genes essential for retinal development.

#### *LncRNAs that regulate development of other CNS regions*

Although the study of lncRNAs in other regions of the developing CNS lagged behind studies in the retina until recently, this is now rapidly changing. At least a half-dozen lncRNAs have now been functionally characterized in the developing brain. Several examples of functional lncRNAs have been identified through analysis of the transcriptional control of GABAergic interneuron specification. During development, multipotent progenitors that give rise to both GABAergic interneurons and oligodendrocytes are generated from the medial and caudal ganglionic eminences of the ventral telencephalon (Anderson et al., 1997; Panganiban and Rubenstein, 2002; Yung et al., 2002). *In vitro* differentiation of embryonic forebrain-derived neural stem cells identified a host of additional lncRNAs dynamically expressed during GABAergic interneuron specification (Mercer et al., 2010), including two lncRNAs that overlap ultraconserved enhancers of the DLX family of proteins, *Dlx1AS* and *Evf2*.

*Evf2* is partially transcribed from an ultraconserved enhancer sequence (ei) located between the convergently transcribed *Dlx5* and *Dlx6* genes (Feng et al., 2006). *Evf2* is transcribed in the antisense direction to *Dlx6*, with the entire sequence for *Dlx6* localized within intron 2 of *Evf2* (Feng et al., 2006). Transcription of *Evf2* results in the recruitment of Dlx1/2 and MECP2 transcription factors to the *Dlx5/6* enhancers to regulate *Dlx5/6* transcription (Feng et al., 2006; Bond et al., 2009). Loss of *Evf2* results in an increase in *Dlx6* transcript abundance, a phenotype that cannot be rescued with *Evf2* overexpression, suggesting that transcription of *Evf2* inhibits activation of *Dlx6* transcription in *cis* through opposite-strand inhibition (Bond et al., 2009). Further studies indicate that *Evf2* activity in *trans* inhibits methylation of the ei enhancer (Berghoff et al., 2013). Altogether, *Evf2* functions in both *cis* and *trans* to regulate transcription of *Dlx5/6* and the chromatin status of the ei ultraconserved enhancer.

*Dlx1AS* is localized in the *Dlx1/2* locus similar to *Evf2* in the *Dlx5/6* locus, such that *Dlx1AS* overlaps the conserved enhancer between the convergently transcribed *Dlx1/2* genes (Dinger et al., 2008). In contrast to the genomic architecture of *Evf2*, exon 2 of *Dlx1AS* overlaps the *Dlx1* coding sequence in the antisense orientation, suggesting *Dlx1AS* may also function as a NAT (Kraus et al., 2013). Genetic loss of *Dlx1AS* results in increased *Dlx1* expression, suggesting a negative regulation of *Dlx1* by *Dlx1AS*, potentially through antisense inhibition (Kraus et al., 2013). These reports suggest that lncRNAs transcribed from ultraconserved sequences can function through molecular mechanisms shared with other classes of lncRNAs. They may also control activity and/or recruitment of transcription factors at enhancers through dosage or allelic differences in lncRNA abundance, adding an additional layer of complexity to enhancer-mediated gene regulation (Amaral et al., 2009).

In order to study loss of function of *Dlx1AS* and *Evf2 in vivo*, homologous recombination was used to insert premature polyadenylation sequences in both lncRNAs, as genomic deletion of either *Dlx1AS* or *Evf2* would alter expression or affect primary sequence of neighboring protein-coding genes (Bond et al., 2009; Kraus et al., 2013).

Insertion of the transcriptional terminator sequence in the *Evf2* locus results in a significant but incomplete loss of lncRNA transcript expression, likely resulting in a hypomorphic phenotype. Loss of *Evf2* results in an early decrease in GABAergic neurons in the hippocampus and dentate gyrus in juvenile mice (Bond et al., 2009). Although the deficit in GABAergic neuron number is recovered in adult mice, loss of *Evf2* results in reduced inhibition of CA1 pyramidal neurons, likely the result of synaptic defects from reduced Gad1 levels (Bond et al., 2009).

Similarly, in addition to mild defects resulting in craniofacial anomalies, loss of *Dlx1AS* also affects the number of hippocampal interneurons (Kraus et al., 2013). Loss of *Dlx1AS* results in increased interneuron number, likely due to an increase in *Dlx1* expression that triggers a corresponding increase in expression of *Mash1* (Kraus et al., 2013). Similar to *Evf2* studies, early changes in interneuron number are not maintained into adulthood in *Dlx1AS* mice, suggesting compensatory mechanisms regulating the proper number of neurons (Kraus et al., 2013). Combined with the observations of decreased *Olig2* expression in *Dlx1AS* mutant mice, *Evf2* and *Dlx1AS* may be functioning to control levels of the Dlx protein family to generate the proper proportion of oligodendrocytes and GABAergic neurons from the bipotent precursor (Bond et al., 2009; Kraus et al., 2013). Other studies have indicated that lncRNAs can play a pivotal role in controlling neural versus oligodendrocyte fate decisions. This includes studies in which *Nkx2.2AS* was overexpressed in ventral telencephalic progenitors and was observed to drive oligodendrocyte specification, possibly by increasing Nkx2.2 levels (Tochitani and Hayashizaki, 2008).

Studies in zebrafish have examined conserved lincRNAs that display short sequences of high homology across evolution and syntenic genomic localization (Ulitsky et al., 2011). The lncRNAs *cyrano* and *megamind* are highly expressed throughout the developing nervous system. Morpholino knockdown of *cyrano* and *megamind* results in zebrafish with reduced brain and eye size (Ulitsky et al., 2011). Additional phenotypes include neural tube closure defects and reduced accumulation of NeuroD-GFP positive neurons in the developing eyes and brain (Ulitsky et al., 2011). In examining the evolutionary conservation of lncRNA function, the researchers showed that the syntenic mouse and human lncRNAs could partially rescue the observed phenotypes from *megamind* inhibition. Additionally, the rescue using mouse and human orthologs was dependent on expression of the evolutionarily conserved sequence (Ulitsky et al., 2011). Interestingly, the conserved sequence of *cyrano* was not sufficient to rescue decreased *cyrano* expression (Ulitsky et al., 2011). The conserved sequence of *cyrano*, however, matched the consensus binding sequence of miR-7, suggesting regulation of *cyrano* by miR-7, or conversely, *cyrano* functioning as a miRNA decoy (Ulitsky et al., 2011).

Similar to *cyrano*, the circRNA *CDR1as* also serves as a miR-7 decoy. *CDR1as* is highly conserved amongst mammals and contains 63 consensus miR-7 binding sites conserved among two or more species (Memczak et al., 2013). *CDR1as* is an antisense transcript to the *CDR1* coding sequence and shares a similar expression pattern to miR-7 during brain development. Overexpression of human *CDR1as* in zebrafish, which have lost the entire *CDR1* locus, results in a decreased size of the midbrain, similar to miR-7 loss-of-function (Memczak et al., 2013). Together, these data suggest that *CDR1as* acts as an endogenous "sponge" that attenuates the action of miR-7 on protein-coding mRNAs through competitive binding.
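The notion of a transcript carrying many consensus miRNA binding sites can be illustrated with a minimal sketch that scans a sequence for canonical seed matches (the reverse complement of miRNA nucleotides 2–8). This is not the pipeline used in the cited studies; the function names and both sequences below are hypothetical placeholders, and a real analysis would substitute the annotated miR-7 and *CDR1as* sequences and would typically also weigh site type and conservation.

```python
# Illustrative only: count canonical seed-match sites for a miRNA in a transcript.
# Both sequences are toy placeholders, not the real miR-7 or CDR1as sequences.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def seed_match_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Target-site motif complementary to miRNA nucleotides start..end (1-based)."""
    seed = mirna.upper()[start - 1:end]
    return reverse_complement_rna(seed)


def find_seed_matches(transcript: str, mirna: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) seed match."""
    site = seed_match_site(mirna)
    rna = transcript.upper().replace("T", "U")  # accept DNA or RNA input
    positions = []
    pos = rna.find(site)
    while pos != -1:
        positions.append(pos)
        pos = rna.find(site, pos + 1)
    return positions


toy_mirna = "UACGGAUCCAGUAACGUUAGC"      # hypothetical miRNA
toy_transcript = "AAGAUCCGUCCGAUCCGUAA"   # hypothetical sponge-like transcript
sites = find_seed_matches(toy_transcript, toy_mirna)
print(len(sites), "seed-match sites at positions", sites)
```

A transcript with dozens of such sites, expressed in the same cells as the miRNA, is the kind of candidate that the sponge model describes; site counts alone do not demonstrate competitive binding.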

Like *Six3os* in retina, recent experiments examining the lncRNA *Paupar* have uncovered another instance of a lncRNA that cooperates with the neighboring protein-coding gene to regulate transcription (Vance et al., 2014). *Paupar* is localized ∼8.5 kb upstream of the homeodomain factor *Pax6*, which regulates many different aspects of CNS development. Interestingly, *Paupar* is localized within the first intron of the ncRNA *Pax6os1*, and is generally coexpressed with *Pax6* mRNA. However, *Paupar* inhibition results in an increase in *Pax6* expression. Comparing changes in gene expression seen following knockdown of *Paupar* and *Pax6* revealed many genes that showed similar changes in expression, indicating that while *Paupar* regulates expression of *Pax6* itself, *Paupar* is also likely to participate in the regulation of Pax6 target genes (Vance et al., 2014). Using capture hybridization analysis of RNA targets (CHART), the researchers found that *Paupar* occupied >2500 genomic sites, localizing to the promoters of many genes involved in stem cell maintenance and neuronal development (Vance et al., 2014). Further characterization indicated that *Paupar* and Pax6 co-occupy 71 different genomic loci, suggesting that both directly co-regulate transcription of these genes (Vance et al., 2014). It remains to be determined, however, whether *Paupar* and Pax6 physically associate to regulate target genes. It will also be important to examine the Pax6-independent functions of *Paupar*, as a majority of the genomic binding sites of *Paupar* are not co-occupied by Pax6.
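Co-occupancy of the kind described above is usually assessed by intersecting two sets of genomic intervals, for example lncRNA binding sites from CHART and transcription factor peaks from ChIP. The following is a minimal sketch under stated assumptions: intervals are half-open `(chrom, start, end)` tuples as in BED files, the function name `co_occupied` is hypothetical, and the coordinates are invented for illustration rather than taken from Vance et al. (2014).

```python
# Illustrative only: find lncRNA binding sites that overlap transcription factor peaks.
# Coordinates are made up; intervals are half-open (start inclusive, end exclusive).
from collections import defaultdict

Interval = tuple[str, int, int]  # (chromosome, start, end)


def co_occupied(lncrna_sites: list[Interval], tf_peaks: list[Interval]) -> list[Interval]:
    """Return the lncRNA sites that overlap at least one TF peak on the same chromosome."""
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end in tf_peaks:
        peaks_by_chrom[chrom].append((start, end))

    shared = []
    for chrom, start, end in lncrna_sites:
        # Two half-open intervals overlap when each begins before the other ends.
        if any(start < p_end and p_start < end
               for p_start, p_end in peaks_by_chrom.get(chrom, [])):
            shared.append((chrom, start, end))
    return shared


lncrna_sites = [("chr2", 100, 200), ("chr2", 500, 600), ("chr7", 50, 80)]
tf_peaks = [("chr2", 150, 250), ("chr7", 80, 120)]

shared = co_occupied(lncrna_sites, tf_peaks)
print(f"{len(shared)} of {len(lncrna_sites)} lncRNA sites are co-occupied: {shared}")
```

The quadratic scan is adequate for a few thousand sites; for genome-scale peak sets, a sorted sweep or an interval tree would be the more usual choice. As the text notes, overlap of binding sites indicates co-occupancy but not physical association.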

Recently, a small consortium has targeted multiple lincRNAs for genetic deletion and begun reporting phenotypic analyses (Sauvageau et al., 2013). In their studies, seven of the 18 lincRNAs targeted for knockout were shown to have human orthologs that were dynamically expressed during directed neuronal differentiation of ES cells (Sauvageau et al., 2013). In particular, the deletion of the lincRNA *linc-Brn1b* was analyzed. *Linc-Brn1b* is localized less than 10 kb downstream of the *Brn1* gene, and is transcribed from the OS of *Brn1* (Sauvageau et al., 2013). Deletion of *linc-Brn1b* results in mice with reduced *Brn1* transcript abundance. These mutants display features similar to Brn1/Brn2 double knockouts, including reduced proliferation of intermediate progenitors within the sub-ventricular zone (SVZ) of the dorsal telencephalon, reduced production of upper layer cortical neurons, and a reduction in total size of the barrel cortex (Sauvageau et al., 2013). As *linc-Brn1b* was completely excised in the knockout studies and the phenotypes mimic some features of Brn1 knockouts, the possibility exists that the observed phenotypes are partially the result of decreased *Brn1* expression due to a lost enhancer sequence within *linc-Brn1b* (Sauvageau et al., 2013). Further characterization of *linc-Brn1b* and other lincRNA knockout lines generated in these studies will continue to elucidate the importance of lncRNAs in neuronal development.

### **LncRNAs IN DISORDERS OF THE NERVOUS SYSTEM**

Many groups are taking advantage of RNA-Seq and lncRNA microarray technologies to identify altered transcript expression levels between control and diseased states within various human neurological and psychiatric disorders (Dharap et al., 2012, 2013; Petazzi et al., 2013; Ziats and Rennert, 2013). While useful, with few exceptions, these studies have not functionally implicated these lncRNAs in disease progression (Petazzi et al., 2013; Ziats and Rennert, 2013). Here we review the limited number of studies that directly link altered lncRNA function to the development and progression of neurological disease.

One of the better studied lncRNAs associated with human disease is *ANRIL* (also known as *CDKN2B-AS*). Genome-wide association studies have identified numerous polymorphisms on human chromosome 9p21 that segregate with diseases including cardiovascular disease, Type-2 diabetes, Alzheimer's disease (AD), primary open angle glaucoma, endometriosis, periodontitis, and several cancers (reviewed in Congrains et al., 2013). Polymorphisms map to both the promoter and transcribed region of *ANRIL*, including many transcription factor-binding sites located throughout the locus. *ANRIL* has been shown to bind CBX7 and SUZ12 of the PRC1 and PRC2 complexes, respectively, to regulate the histone modification status of the nearby *CDKN2A* and *CDKN2B* genes (Yap et al., 2010; Aguilo et al., 2011; Kotake et al., 2011). As both increased and decreased *ANRIL* expression levels correlate with disease states (Congrains et al., 2013), the fine control of *CDKN2B/CDKN2A* transcript abundance seems paramount to normal development.

*Kcna2AS* is an antisense ncRNA to the voltage-gated potassium channel Kcna2. *Kcna2AS* is expressed in dorsal root ganglia (DRG), at higher levels in ganglia exhibiting lower levels of Kcna2 protein expression or after spinal nerve injury (Zhao et al., 2013). Spinal nerve injury causes an increase in myeloid zinc finger protein 1 (MZF1) binding to the proximal promoter of *Kcna2AS*, causing increased expression of *Kcna2AS* with a concomitant decrease in Kcna2 transcript and protein abundance (Zhao et al., 2013). Additional experiments found that expression of *Kcna2AS* causes a decrease in voltage-gated potassium currents and an increase in membrane resting potential, suggesting that pain hypersensitivity or neuropathic pain can be caused by altered *Kcna2AS* levels (Zhao et al., 2013).

Recent studies have also implicated altered lncRNA expression in AD progression. AD is characterized by a progressive neurodegeneration that leads to memory and cognitive impairment. A hallmark component of the pathological condition is the buildup of extracellular β-amyloid plaques. The amyloid precursor protein (APP) is cleaved in the initial and rate-limiting step by β-secretase enzyme (BACE1) to form the amyloid β precursor proteins Aβ 1–40 and Aβ 1–42. In pathological conditions, the Aβ 1–42 proteins oligomerize and contribute to the plaques that participate in AD (Abramov et al., 2004; Ohyagi et al., 2005; Snyder et al., 2005; Esposito et al., 2006; Zhu et al., 2006; Lacor et al., 2007; Matsuyama et al., 2007). As a result, it has been suggested that BACE1 misregulation can contribute to excess Aβ 1–42 protein production and the development of amyloid plaques. Recent work has identified an antisense transcript to BACE1 (*BACE1-AS*) that encodes a conserved ∼2 kb lncRNA with a 104 bp overlap with the human *BACE1* transcript (Faghihi et al., 2008). Both overexpression and knockdown experiments indicated that *BACE1-AS* is a positive regulator of *BACE1* transcript and protein abundance (Faghihi et al., 2008). Mechanistically, *BACE1-AS* stabilizes the *BACE1* transcript, protecting it from RNA degradation through RNA–RNA pairing of the *BACE1-AS* and *BACE1* homologous regions (Faghihi et al., 2008). Importantly, *BACE1-AS* and *BACE1* transcripts were induced by many cell stressors that are implicated in the initiation of AD, suggesting a direct mechanism by which cell stress can lead to increased Aβ precursor protein production (Faghihi et al., 2008). The importance of *BACE1-AS* in AD was further supported through examinations of primary tissues from multiple brain regions, where *BACE1-AS* transcript abundance was elevated twofold in confirmed AD patient brain samples compared to age- and sex-matched controls (Faghihi et al., 2008). Further characterization of *BACE1-AS* in a transgenic mouse model of AD indicated that *BACE1-AS* inhibition reduces the insoluble fraction of Aβ 1–40 and Aβ 1–42 precursor proteins (Modarresi et al., 2011), suggesting that increased *BACE1-AS* expression does directly contribute to AD pathology.

Other aspects of AD are also potentially regulated through lncRNA function. Recent work on neurotrophin levels in diseases of the brain has indicated that reduced neurotrophin levels (BDNF and glial-derived neurotrophic factor, GDNF) correlate with the onset of neurodegenerative disorders such as Parkinson's disease, AD, and HD (reviewed in Allen et al., 2013). This has led to potential therapeutics aimed at increasing neurotrophin levels (Weinreb et al., 2007; Straten et al., 2011; Allen et al., 2013). However, as both BDNF and GDNF display complex splicing regulation (Airavaara et al., 2011; Modarresi et al., 2012), mechanisms of therapeutic intervention other than exogenous neurotrophin replacement may be better suited to treating these diseases. Interestingly, both BDNF and GDNF have corresponding antisense or OS transcripts (*BDNF-AS* and *GDNF-OS*); however, one of the three *GDNF-OS* transcripts is likely protein coding (Airavaara et al., 2011). Knockdown of either *BDNF-AS* or *GDNF-OS* results in an increase in the transcript abundance of the corresponding protein-coding gene, implying that these lncRNAs negatively regulate neurotrophin expression (Modarresi et al., 2012). Further characterization of *BDNF-AS* indicates that it recruits EZH2 and the PRC2 complex to the *BDNF* promoter to repress *BDNF* transcription through H3K27me3 histone modifications (Modarresi et al., 2012). Combined with studies in which treatment with exogenous BDNF rescued HD phenotypes in mice (Xie et al., 2010), these experiments suggest that inhibition of neurotrophin antisense transcripts may provide a novel target for treatment of neurodegenerative disease.

LncRNAs have also been implicated in nervous system disorders through their role in pre-mRNA splicing. The lncRNAs *Gomafu* and *Malat1* are both highly expressed in the nervous system and regulate splicing through interactions with splicing factors (Sone et al., 2007; Tripathi et al., 2010; Tsuiji et al., 2011; Zong et al., 2011). Interestingly, aberrant splicing of the genes *DISC1* and *ERBB4*, among others, is associated with disease pathology in schizophrenia (SZ; Law et al., 2007; Nakata et al., 2009; Morikawa and Manabe, 2010). Additionally, the *Gomafu*-bound splicing factor QKI is downregulated in SZ brains and is proposed to contribute to disease pathology (Aberg et al., 2006a,b; Haroutunian et al., 2006; McCullumsmith et al., 2007). Recently, *Gomafu* has been shown to interact with multiple splicing factors, including a strong interaction with QKI (Barry et al., 2013). *Gomafu* expression is also significantly decreased shortly after neuronal depolarization in mouse cortical neurons and in human induced pluripotent stem cell (iPSC)-derived neurons (Barry et al., 2013). Combined with GWAS linking *Gomafu* with eye movement disorders in SZ (Takahashi et al., 2003), this led to the hypothesis that loss of function of *Gomafu* may directly contribute to SZ disease pathology. Indeed, *Gomafu* is significantly reduced in the superior temporal gyrus of SZ brain samples compared to controls (Barry et al., 2013). Knockdown of *Gomafu* in iPSC neurons also results in an increase in rare splice variants of *DISC1* and *ERBB4* (Barry et al., 2013), matching splicing patterns observed *in vivo* in human SZ brains (Law et al., 2007; Nakata et al., 2009).

With the increased use of whole exome sequencing and copy number variation (CNV) analysis for genetic study of patients with neurological diseases, our understanding of the importance of lncRNAs in neurodevelopment will only increase. For example, one patient who displayed a cognitive developmental delay possessed a chromosomal translocation that affected *linc00299* (Talkowski et al., 2012). Further examination of patient databases identified an additional four patients who displayed developmental delay and disruption of the *linc00299* locus (Talkowski et al., 2012), suggesting that *linc00299* is vital for proper neuronal development. Further characterization of lncRNA function in animal models and *in vitro* will continue to expand our knowledge of the importance of lncRNAs in both human development and disease.

# **CONCLUSION**

Advances in sequencing technologies and the appreciation of functional non-coding elements have resulted in the rapid identification of a plethora of lncRNAs expressed in both vertebrates and invertebrates. Systematic characterization of temporally and spatially restricted expression patterns in the developing nervous system has provided the groundwork for hypotheses regarding lncRNA function. As we understand more about the mechanisms by which lncRNAs regulate transcription, we are beginning to understand the biological significance of what once was labeled as "junk DNA." Many lncRNAs regulate transcription through regulation of epigenetics and interactions with chromatin-modifying complexes, although the mechanism by which lncRNAs are recruited to specific genomic loci is still unclear. Recently developed technologies have the potential to greatly expand our understanding of the mechanisms by which lncRNAs function. The advent of techniques such as ChIRP and CHART allows for systematic characterization of DNA binding sites of lncRNAs throughout the genome (Chu et al., 2011; Simon et al., 2011). Additionally, protein arrays, as used to identify *Six3os* binding partners (Rapicavoli et al., 2011), allow for an unbiased approach to identifying physiologically relevant protein binding partners. These techniques will further our understanding of how lncRNAs function as molecular scaffolds and will enable the functional characterization of lncRNAs working in *trans*. While not the focus of this review, it is also essential to consider the function of lncRNAs that display cytoplasmic expression, which represent a large fraction of lncRNAs and whose function is poorly understood (reviewed in Batista and Chang, 2013). Further characterization of lncRNA–protein interactions through protein arrays will help facilitate these discoveries. As many cytoplasmic lncRNAs associate with ribosomes (van Heesch et al., 2014), it is intriguing to speculate that lncRNAs function as scaffolds to regulate localized protein synthesis and/or degradation, a concept vitally important in the control of synaptic function.

As we continue to understand the molecular basis of lncRNA function, it is imperative that studies move beyond homogeneous cell populations *in vitro* and begin to examine the consequences of lncRNA function within individual cell types. Neuronal diversification has provided a multitude of examples in which transcriptional regulation and cell-fate decisions are highly context and cell-type specific. Therefore, it is plausible that individual lncRNAs may display diverse functions that are dependent on their spatial and temporal expression pattern. Inherent to the examination of specific cell types is that epigenetic marks may display vast temporal and/or cell-type specific signatures. *In vivo* experiments continue to shed light on the importance of lncRNA function throughout neuronal development. As mouse models for genetic loss of lncRNAs such as *Evf2*, *Dlx1AS*, *Malat1*, and *Neat1* produce modest phenotypes or fail to recapitulate phenotypes observed in knockdown experiments (Bond et al., 2009; Zhang et al., 2012; Kraus et al., 2013), it is important to consider that lncRNAs may have evolved to function as a fine-tuning mechanism to ensure proper regulation of neuronal cell type proportions in the highly complex mammalian nervous system. Genetic compensation may mask phenotypes in conventional gene knockout approaches that conditional or acute loss-of-function studies may readily detect. Furthermore, genetic models of lncRNA loss-of-function need to be examined carefully, keeping in mind that many lncRNAs overlap conserved regulatory elements that may function independently of the lncRNA itself, complicating interpretation of any observed phenotypes. Further exploration of lncRNA function will only continue to add to our appreciation of the complexity of transcriptional regulation, especially within the context of the seemingly endlessly complex development of the nervous system.

#### **ACKNOWLEDGMENTS**

The authors would like to thank W. Yap and T. Thein for critical assessments of this manuscript. This work is supported by NIH F32EY024201 (Brian S. Clark) and NIH R01EY017015 (Seth Blackshaw).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 April 2014; accepted: 18 May 2014; published online: 06 June 2014.*

*Citation: Clark BS and Blackshaw S (2014) Long non-coding RNA-dependent transcriptional regulation in neuronal development and disease. Front. Genet. 5:164. doi: 10.3389/fgene.2014.00164*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Clark and Blackshaw. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Asymmetric localization of natural antisense RNA of neuropeptide sensorin in *Aplysia* sensory neurons during aging and activity

# *Beena M. Kadakkuzha†, Xin-An Liu†, Maria Narvaez, Alexandra Kaye, Komolitdin Akhmedov and Sathyanarayanan V. Puthanveettil\**

*Department of Neuroscience, The Scripps Research Institute, Jupiter, FL, USA*

#### *Edited by:*

*Yingqun Huang, Yale University School of Medicine, USA*

#### *Reviewed by:*

*Tohru Yoshihisa, University of Hyogo, Japan; Alessio Paone, Sapienza University of Rome, Italy*

#### *\*Correspondence:*

*Sathyanarayanan V. Puthanveettil, Department of Neuroscience, The Scripps Research Institute, Scripps Florida, 130 Scripps Way, Jupiter, FL 33458, USA; e-mail: sputhanv@scripps.edu*

*†These authors have contributed equally to this work.*

Despite advances in our understanding of the transcriptome, the regulation and function of its non-coding components remain poorly understood. Here we searched for a natural antisense transcript of sensorin (NAT-SRN), a neuropeptide expressed in the presynaptic sensory neurons of the gill-withdrawal reflex of the marine snail *Aplysia californica*. Sensorin (SRN) has a key role in learning and long-term memory storage in *Aplysia*. We have now identified NAT-SRN in the central nervous system (CNS) and have confirmed its expression by northern blotting and fluorescent RNA *in situ* hybridization. Quantitative analysis of NAT-SRN in micro-dissected cell bodies and processes of sensory neurons suggests that NAT-SRN is present in the distal neuronal processes along with sense transcripts. Importantly, aging is associated with reduction in levels of NAT-SRN in sensory neuron processes. Furthermore, we find that forskolin, an activator of CREB signaling, differentially alters the distribution of SRN and NAT-SRN. These studies reveal novel insights into physiological regulation of natural antisense RNAs.

**Keywords: memory, noncoding RNA, antisense RNA, aging,** *Aplysia***, neural circuitry**

# **INTRODUCTION**

Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes and transcripts encoded on both strands of the genomic sequence. This overlapping gene organization, which produces sense-antisense transcript pairs, is capable of affecting regulatory cascades through established mechanisms. Natural antisense transcripts (NATs) are transcribed from the strand opposite a protein-coding (sense) strand. Recent studies have provided ample evidence that more than 70% of the mammalian genome has antisense transcription potential.

Antisense transcription has been recognized for roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level (Faghihi and Wahlestedt, 2009; Pelechano and Steinmetz, 2013). Gene expression profiling studies show frequent concordant regulation of sense-antisense transcript pairs, although there is also clear evidence of discordant regulation, which can lead to significant physiological outcomes such as neurodegenerative diseases (Kadakkuzha et al., 2013). Experimental modulation of an antisense transcript can change the expression of the sense transcript, supporting a role for antisense transcription in the control of transcriptional outputs in higher animals (Katayama et al., 2005; Modarresi et al., 2011).

Despite efforts in recent years to unravel the specific role(s) of widespread antisense transcription, the functional significance of NATs and their physiological regulation remain poorly understood. However, the significant presence of NATs in the central nervous system suggests a potential role in brain function. A role for NATs in memory formation has been suggested in *Lymnaea*, where axonal transport of a NAT, described as antiNOS-2 RNA, is regulated by classical conditioning. AntiNOS-2 RNA negatively regulates the neurotransmitter nitric oxide (NO), a key molecule that plays an important role in the early stages of learning and memory formation (Korneev et al., 2013). It is not clear whether classical conditioning leads to a net enhancement of antiNOS-2 expression in the cell body and processes of the CGC neuron or to a selective increase in expression in the periphery.

To investigate physiological mechanisms that regulate the expression and subcellular distribution of NATs, we explored the expression and physiological regulation of a NAT transcribed against the mRNA encoding the peptide neurotransmitter sensorin in the sensory neurons (SNs) of the marine mollusk *Aplysia californica*. For more than 50 years, *Aplysia* has provided fundamental insights into the basic organization of neuronal functions. The *Aplysia* nervous system has large neurons, many of which can be uniquely identified and are associated with specific behaviors. These neurons can be isolated and cultured *in vitro*, where they form circuits that can be investigated in molecular and cellular detail.

The cell-specific neuropeptide sensorin (SRN) is expressed exclusively in SNs and transported to distal neurites (Brunet et al., 1991). However, the distribution of sensorin transcripts in the SN cell body changes when the SN is co-cultured with a motor neuron (Hu et al., 2002, 2003). The formation and stabilization of sensory neuron (SN)-motor neuron (MN) synapses are regulated by the release of sensorin peptide from SNs (Hu et al., 2004). *In vitro*, SNs make synapses with their *in vivo* target motor neurons but not with non-target motor neurons, providing an excellent model system to study the effects of specific mRNAs on synapse formation and stabilization (Kandel, 2001).

**Figure 1D** shows a schematic diagram of our strategy to study NAT-SRN in sensory neurons. Our analyses of gene expression using northern blotting, quantitative PCR (qPCR) of the CNS and of single neurons, and fluorescent *in situ* hybridization (FISH) have confirmed the expression of sense (SRN) and antisense (NAT-SRN) RNAs of the neuropeptide sensorin in SNs of the *Aplysia* gill-withdrawal reflex. We then examined whether expression of NAT-SRN transcripts is regulated in SNs during aging and in response to forskolin, an activator of CREB (Seternes et al., 1999). We find that the expression levels and subcellular distribution of NAT-SRN are differentially altered during aging and neuronal activity.

# **MATERIALS AND METHODS**

#### **ETHICS STATEMENT**

The Institutional Biosafety Committee of The Scripps Research Institute (TSRI) has approved all of the experimental protocols (IBC Protocol 2010-019R1) described in this manuscript. Ethical approvals are not required for the research using invertebrate animals, such as the marine snail *Aplysia*.

#### **ANIMALS, ISOLATION OF** *APLYSIA* **NEURONS, AND CULTURE**

*Aplysia californica* maintained under standard conditions (temperature, salinity, pH, food) at the National *Aplysia* Resource Facility (University of Miami Rosenstiel School of Medicine, Florida, USA) were used in the experiments. We used animals from two age groups (2 and 9 months old). Upon arrival in the laboratory, animals were kept in an aquarium at 16°C under 12:12 light-dark conditions and were used for experiments within 2–3 days of arrival. Isolation and culture of sensory neurons were done as described earlier (Montarolo et al., 1986). Micro-dissection of cell bodies and processes was carried out as described earlier (Moccia et al., 2003; Moroz et al., 2006).

#### **NORTHERN BLOT**

A 415-bp DNA fragment corresponding to exons 1 and 2 of sensorin mRNA was cloned into a TOPO vector with dual T7 and SP6 promoters (Invitrogen, Cat. Number K4600-01) and then transcribed *in vitro* to prepare DIG-labeled sense and antisense riboprobes using the DIG RNA Labeling Kit (SP6/T7) (Roche Diagnostics). The molecular sizes of the riboprobes were confirmed by gel electrophoresis. Electrophoresis and transfer of the RNA were performed using the DIG Northern Starter Kit (Roche Applied Science) following the manufacturer's protocols. Briefly, 5 and 10 μg of total RNA from *Aplysia* were run on a 1% agarose gel and transferred to a positively charged nitrocellulose membrane, followed by hybridization with strand-specific riboprobes at 68°C overnight in ExpressHyb.

#### **RNA EXTRACTION AND REVERSE TRANSCRIPTION**

Total RNA was extracted from the *Aplysia* CNS using the standard Trizol RNA extraction method and dissolved in nuclease-free water. For the preparation of RNA from SNs, cell bodies and neurites were separated by manual micro-dissection, followed by RNA extraction using Trizol. RNAs were amplified once using the MessageAmp™ II aRNA Amplification Kit following the manufacturer's instructions. For reverse transcription with qScript cDNA synthesis mix, 1 μg of total RNA was used for the CNS samples and 500 ng of RNA for the cell body and neurite samples.

#### **QUANTITATIVE REAL-TIME PCR**

Primer pairs for SRN, NAT-SRN, ApKHC1, and ApKLC2 were designed using the Primer3 program (http://bioinfo.ut.ee/primer3-0.4.0/) based on cDNA and genome sequences listed in the UCSC genome browser database (http://genome.ucsc.edu/) (**Figure 1A**). While designing primers, steps were taken to avoid the amplification of multiple targets. Antisense primers were designed based on the exon boundaries of the sense transcript to avoid potential amplification of sense transcripts in the antisense detection samples. qPCR followed the protocol described earlier (Kadakkuzha et al., 2013; Akhmedov et al., 2014). Briefly, each 10 μl reaction contained 2 μl of cDNA and 8 μl of a qPCR master mix consisting of 2 μl of H2O, 5 μl of 2X SYBR Green master mix, and 1.0 μl of 10 μM (each) forward and reverse primers. The reaction was carried out in a 7900HT Fast Real-Time PCR System (Applied Biosystems, Carlsbad, CA) under the following conditions: 95°C for 10 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. Five biological replicates, each with four technical replicates, were used in the qPCR. Quantification of the target transcripts was normalized to the *Aplysia* 18S reference gene using the Pfaffl method (Pfaffl, 2001). Data are shown as mean ± s.e.m. Statistical analysis was performed using Prism (GraphPad Software). Student's *t*-test or ANOVA followed by Bonferroni's test was used as appropriate, where ∗*P* < 0.05, ∗∗*P* < 0.01, and ∗∗∗*P* < 0.001.
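As an illustration of the normalization step, the Pfaffl (2001) relative expression ratio can be sketched in a few lines of Python. This is a minimal sketch and not the authors' analysis pipeline: the function name, the amplification efficiencies, and all Ct values below are hypothetical and chosen only to show the arithmetic, in which the target ratio E_target^ΔCt(target) is divided by the reference ratio E_ref^ΔCt(ref), with ΔCt taken as control minus sample.

```python
from statistics import mean


def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Relative expression ratio of a target gene versus a reference gene.

    e_target and e_ref are amplification efficiencies (2.0 means perfect
    doubling per cycle). Each delta Ct is control minus sample.
    """
    delta_ct_target = ct_target_control - ct_target_sample
    delta_ct_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_ct_target) / (e_ref ** delta_ct_ref)


# Hypothetical Ct values: four technical replicates per condition,
# averaged before the ratio is computed.
target_control = mean([25.0, 25.2, 24.8, 25.0])  # target transcript, control
target_sample = mean([24.0, 24.1, 23.9, 24.0])   # target transcript, sample
ref_control = mean([12.0, 12.1, 11.9, 12.0])     # 18S reference, control
ref_sample = mean([12.0, 12.1, 11.9, 12.0])      # 18S reference, sample

# Efficiencies of 2.0 are an assumption; in practice they are estimated
# from standard curves for each primer pair.
ratio = pfaffl_ratio(2.0, target_control, target_sample,
                     2.0, ref_control, ref_sample)
print(f"relative expression ratio: {ratio:.2f}")
```

With equal reference Ct values in both conditions, a one-cycle decrease in the target Ct at an efficiency of 2.0 corresponds to a roughly two-fold higher relative expression.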

#### **PREPARATION OF DIG LABELED RIBO PROBES**

DIG-labeled sense and antisense riboprobes for *in situ* hybridization were prepared by *in vitro* transcription of cDNA templates using SP6 or T7 RNA polymerase. A 415-nt coding region of sensorin was amplified by PCR using *Aplysia* abdominal ganglion cDNA as a template and sensorin-specific primers, and ligated into the pCRII-TOPO vector with dual T7 and SP6 promoters (Invitrogen, Cat. Number K4600-01). The vector containing the sensorin DNA was linearized with EcoRV (New England Biolabs) and transcribed with SP6 RNA polymerase to generate antisense probes; the corresponding sense probes were produced by linearizing with BamHI (New England Biolabs) and transcribing with T7 RNA polymerase. A small aliquot (2 μl) was run on a 1.5% agarose gel to confirm the integrity of the RNA probes.

### **FLUORESCENT** *IN SITU* **HYBRIDIZATION (FISH) AND IMAGING ANALYSIS**

DIG-labeled RNA probes were prepared and *in situ* hybridization analysis of sensory neurons was carried out as described previously (Puthanveettil et al., 2013). Images were acquired using a Zeiss LSM 780 confocal microscope system with 10X/63X objectives; only projection images are shown. Mean fluorescence intensities were measured using NIH ImageJ, and normalized intensities were calculated using the following equation:

Normalized intensity = Mean FISH intensity (SRN or NAT-SRN) − Mean background signal.
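The background subtraction above, together with the mean ± SEM summary used in the bar graphs, can be sketched as follows. This is an illustrative sketch only: the function names and the intensity values (arbitrary units) are hypothetical and do not come from the study.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_intensity(mean_fish, mean_background):
    """Background-subtracted mean FISH intensity for one neuron."""
    return mean_fish - mean_background


def sem(values):
    """Standard error of the mean of a list of per-neuron values."""
    return stdev(values) / sqrt(len(values))


# Hypothetical ImageJ measurements for four neurons:
# (mean FISH intensity, mean background signal)
measurements = [(152.0, 30.0), (148.0, 28.0), (160.0, 32.0), (150.0, 30.0)]

normalized = [normalized_intensity(fish, bg) for fish, bg in measurements]
group_mean = mean(normalized)
group_sem = sem(normalized)
print(f"normalized intensity: {group_mean:.1f} ± {group_sem:.1f} (N = {len(normalized)})")
```

Subtracting a per-image background before averaging keeps differences in imaging conditions between coverslips from being read as differences in transcript level.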

Distributions of β-tubulin protein in both cell bodies and neurites were measured to identify any non-specific changes in protein expression associated with aging or forskolin treatment.

#### **RESULTS**

### **DETECTION OF SENSORIN SENSE AND ANTISENSE TRANSCRIPTS IN** *APLYSIA* **CNS**

The existence of a sensorin antisense transcript (NAT-SRN) was previously suspected during *in situ* hybridization analysis of sensorin expression in sensory neurons using fluorescently labeled sense (S) and antisense (AS) riboprobes (unpublished data). Sensorin mRNA (SRN) is transcribed from the psc1 gene on the reverse strand of scaffold 926 of the *Aplysia* genome (available through the UCSC genome browser); the gene spans a 40-kb region and contains four exons (**Figure 1A**). To further characterize the sense (S) transcript and the NAT at the psc1 locus, we searched for NATs by northern blot, qPCR and *in situ* hybridization. Using DIG-labeled riboprobes to detect SRN and NAT-SRN, we first analyzed total RNA from the central nervous system (CNS) by northern hybridization. **Figure 1B** shows hybridization signals for SRN and NAT-SRN corresponding to ∼700 nucleotides, suggesting that NAT-SRN transcripts are expressed in the CNS.

Using specific primers (Supplementary Table 1) that detect SRN and NAT-SRN cDNA, we then confirmed the presence of SRN and NAT-SRN by qPCR. Primers designed to amplify intronic regions of the SRN gene were used as negative controls to ensure that the products were not generated from genomic DNA. Of the multiple primers tested, we selected AS2 (**Figure 1A**) for further qPCR detection of NAT-SRN. Primers to detect putative NATs of *Aplysia* kinesin heavy chain (KHC1) and KLC2 (Supplementary Table 1) were used as additional controls in qPCR (**Figure 1C**). Data were normalized to 18S rRNA levels. qPCR results showed that SRN was expressed at ∼30% higher levels than NAT-SRN in the CNS (*p* = 0.0235, ANOVA, **Figure 1C**).

#### **NAT-SRN IS EXPRESSED IN PROCESSES OF SENSORY NEURONS**

Previous studies have shown that release of sensorin from the SN is required for both synapse formation and long-term facilitation (LTF) of SN-to-MN connections. Localization of sensorin is modulated by the formation of synapses between the SN and its target motor neurons (Lyles et al., 2006). We examined the distribution of both SRN and NAT-SRN in isolated sensory neurons by qPCR. Analysis of RNAs isolated from micro-dissected cell bodies and processes (**Figure 1C**) showed that both SRN and NAT-SRN transcripts are present in the neurites; however, their levels in the cell bodies are much higher than in the neurites [∼2-fold lower expression of SRN and NAT-SRN transcripts in the neurites compared to the cell body, **Figure 1E**, Student's *t*-test; *p* = 0.043 (SRN), *p* = 0.027 (NAT-SRN)]. As a control, we examined expression of *Aplysia* KHC1 transcripts in SN neurites. Specific primers for KHC1 were unable to detect expression of antisense or sense transcripts in SN neurites.

#### **EXPRESSION OF NAT-SRN CHANGES DURING AGING**

We next examined whether expression of NAT-SRN is physiologically regulated. We first studied whether aging is associated with a change in the distribution of NAT-SRN by comparing the distribution of SRN and NAT-SRN in sensory neurons cultured from 3- and 9-month-old *Aplysia*. FISH analysis shows that both SRN and NAT-SRN are present in the cell bodies of SNs from young and old animals (**Figure 2A**). The level of SRN transcript did not differ notably between the cell bodies of neurons from young and old animals (**Figure 2B**, *N* = 4, Student's *t*-test, *p* = 0.8854), indicating that the SRN transcript level in the cell body is not affected by aging. However, we observed a significant increase in NAT-SRN in the cell bodies of SNs from old animals (**Figure 2C**, % increase: 25 ± 8, *N* = 4, Student's *t*-test, *p* = 0.0362). As an endogenous control, we measured β-tubulin and found that β-tubulin protein levels in the cell bodies of young and old neurons did not differ significantly (*N* = 4, Student's *t*-test, *p* = 0.7504) (**Figure 2D**).
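For readers who want to see what a two-group comparison of this kind involves, the sketch below computes a two-sample Student's *t* statistic with pooled variance using only the Python standard library. It is a hedged illustration rather than the authors' procedure (the study used Prism): the function name and the intensity values are hypothetical, and the sketch stops at the *t* statistic and degrees of freedom because the *p*-value requires the *t* distribution, which a statistics package such as SciPy (`scipy.stats.ttest_ind`) provides.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic assuming equal variances.

    Returns (t, degrees_of_freedom). A positive t means that the mean of
    group_b is larger than the mean of group_a.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_b) - mean(group_a)) / standard_error, df


# Hypothetical normalized FISH intensities (arbitrary units), N = 4 per group.
young = [100.0, 96.0, 104.0, 100.0]
old = [125.0, 120.0, 130.0, 125.0]

t_stat, df = student_t(young, old)
percent_increase = 100.0 * (mean(old) - mean(young)) / mean(young)
print(f"t = {t_stat:.2f}, df = {df}, increase = {percent_increase:.0f}%")
```

With only four samples per group, the pooled-variance form is sensitive to unequal variances, so checking the spread in each group before interpreting the statistic is a reasonable precaution.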

We next analyzed the distribution of NAT-SRN in the processes of SNs cultured from young and old animals by FISH and found moderate levels of SRN and NAT-SRN in the neurites of both groups (**Figure 3A**). Comparison of the mean intensities of the SRN and NAT-SRN FISH signals after background subtraction suggested no change in the distribution of SRN transcript in the neurites of SNs from young and old animals, similar to what we found in the cell bodies (**Figure 3B**, *N* = 4, Student's *t*-test, *p* = 0.0864). We also did not observe any significant change in the level of NAT-SRN in the neurites of SNs from old animals compared to young animals (**Figure 3C**, *N* = 4, Student's *t*-test, *p* = 0.9403). The endogenous control β-tubulin protein levels in the processes of young and old neurons did not differ significantly (*N* = 4, Student's *t*-test, *p* = 0.7701) (**Figure 3D**). These results suggest that the expression and localization of NAT-SRN transcripts are regulated during aging.

*Figure caption (partially recovered):* RNA *in situ* hybridization analysis of DIV4-cultured sensory neurons from young (3 months old) and old (9 months old) groups of *Aplysia* using fluorescently labeled sense and antisense riboprobes. **(D)** Normalized intensity of β-tubulin protein distribution in young and old neurites. Normalized mean fluorescence intensities measured using NIH ImageJ are shown in bar graphs. Error bars are SEM. Student's *t*-test was used to determine statistical significance. "∗" is *p* < 0.05.

#### **FORSKOLIN EXPOSURE ENHANCES TRANSCRIPTION OF NAT-SRN IN BOTH CELL BODIES AND NEURITES**

We next studied whether NAT-SRN levels could be regulated by forskolin, an activator of cAMP-CREB signaling. SN cultures (4 DIV) were treated with 50 μM forskolin for 30 min (Puthanveettil et al., 2008) and fixed for FISH analysis (**Figure 4A**). Vehicle-treated SNs were used as controls. Analysis of mean fluorescence intensities of SRN and NAT-SRN suggests that forskolin treatment induced a 25% increase in the expression level of SRN in the cell body (**Figure 4B**) but no change in the NAT-SRN level (Student's *t*-test; *p* = 0.0007). Interestingly, there was no change in the level of SRN in the neurites of SNs within 30 min of forskolin treatment, but the NAT-SRN level in the neurites was increased by 50% (**Figures 4D,E**; Student's *t*-test; *p* = 0.001). β-tubulin protein levels in the cell body (**Figure 4C**) and processes (**Figure 4F**) of control and forskolin-treated neurons did not differ significantly (*N* = 4, Student's *t*-test, *p* = 0.7225 and 0.6712, respectively).

# **DISCUSSION**

Advances in sequencing methodologies have led to the sequencing of several genomes and transcriptomes, shedding light on the non-coding component of the genome. These large-scale sequencing studies have resulted in the cataloging of thousands of non-coding RNAs in a variety of organisms (Katayama et al., 2005; Li et al., 2013). We have now begun to understand the functional relevance of non-coding RNA transcription and the physiological mechanisms that regulate it. Non-coding RNAs include miRNAs, piRNAs, tRNAs, rRNAs, snRNAs, large non-coding RNAs, and natural antisense RNAs. NATs are intriguing because they are complementary to other RNA transcripts. Several regulatory roles for NATs have been suggested, including RNA interference, genomic imprinting, and alternative splicing (Zhang et al., 2007; Faghihi and Wahlestedt, 2009; Werner, 2013; Wight and Werner, 2013).

Recent studies have demonstrated that NATs can downregulate expression of their complementary transcripts. For example, in the rice plant, NATs regulate expression of a protein important for phosphate homeostasis (Jabnoune et al., 2013). Similarly, NATs of interleukin-1 beta (IL-1β) (Lu et al., 2013), inducible nitric oxide synthase (iNOS) (Yoshigai et al., 2013), ubiquitin C-terminal hydrolase (uch) (Carrieri et al., 2012), and the Huntington's disease gene (Chung et al., 2011) suppress expression of their complementary transcripts. Importantly, inhibition of NAT expression has resulted in upregulation of gene-specific transcription (Modarresi et al., 2012). Recently, Velmeshev et al. (2013) showed that 40% of loci previously implicated in autism spectrum disorders express NATs. These NATs are expressed in specific brain regions.

We are only beginning to understand the physiological processes and mechanisms that regulate expression of NATs. In plants, drought alters expression of NATs (Lembke et al., 2012). Similarly, during beta-amyloid-induced apoptosis, the NAT of the Rad18 gene becomes upregulated (Parenti et al., 2007). Also, during corticogenesis in the mouse brain, the expression of NATs of Nrgn and CamK2n1 is regulated (Ling et al., 2011). Additionally, sense-antisense transcript pairs have been shown to be present in synaptoneurosomes (Smalheiser et al., 2008).

Despite these elegant studies, we still do not know whether and how NATs are regulated in specific neural circuitries. To address this, we used a well-characterized neural circuitry in *Aplysia*, the sensory-motor neurons of the gill-withdrawal reflex (Kandel, 2001), and studied a potential NAT of the neuropeptide sensorin (NAT-SRN) in sensory neurons. Sensorin is expressed in presynaptic sensory neurons and is important for LTF of sensory-to-motor neuron synapses (Brunet et al., 1991; Schacher et al., 1999). Sensorin RNA is transported to synapses (Schacher et al., 1999; Moccia et al., 2003) and translated in response to repeated 5-HT stimulation (Wang et al., 2009). To search for a putative NAT-SRN, we first analyzed the *Aplysia* genome and designed primers to prepare sense and antisense probes for northern analysis. We find that NAT-SRN is expressed in the *Aplysia* CNS and that it is probably similar in size to the complementary sense transcript. We next confirmed the expression of NAT-SRN by qPCR analysis of the CNS and of micro-dissected cell bodies and processes of individual SNs. Having confirmed the expression of NAT-SRN in SNs, we then asked two questions: (a) whether NAT-SRN is physiologically regulated and (b) whether NAT-SRN expression could be regulated by cellular activities that elicit specific signaling pathways leading to memory storage.

To understand the physiological regulation of NAT-SRN, we first determined whether its expression changes during aging and whether aging causes changes in the subcellular distribution of NAT-SRN. Importantly, aging-associated changes in specific NATs are poorly understood. We studied two age groups, young and old *Aplysia*. Our FISH analyses suggest that the amount of NAT-SRN increases in the cell body of old neurons when compared to the corresponding sense transcripts. However, the NAT-SRN level is decreased in neurites of sensory neurons cultured from old animals.

To determine whether specific cellular activities might regulate expression of NAT-SRN, we measured changes in expression of NAT-SRN in response to forskolin, an activator of the cAMP-PKA-CREB pathway important for long-term memory storage (Kandel, 2001). Recently, Korneev et al. (2013) showed that classical conditioning of *Lymnaea* changes the expression of the NAT of nitric oxide synthase. Consistent with the idea that NATs could be physiologically regulated, we find that immediately after forskolin treatment there is an increase in SRN transcripts in the cell body. However, forskolin did not cause an increase in the expression of NAT-SRN in the cell body. Interestingly, we find that forskolin exposure resulted in a significant increase in NAT-SRN transcripts in the neurites.

The differential subcellular localization of SRN and NAT-SRN during aging and in response to forskolin treatment suggests that there might be specific mechanisms that mediate differential expression and transport of SRN and NAT-SRN transcripts. We consider three possible mechanisms: (a) regulation of transcription of NAT-SRN, (b) degradation of NAT-SRN in specific compartments, or (c) changes in axonal transport of NAT-SRN to neurites. It has been shown that expression of NATs could be regulated by epigenetic mechanisms (Conley and Jordan, 2012). We find that exposure to forskolin causes a rapid increase in sense transcripts in the cell body and an increase in NAT-SRN in neurites. Our observation that forskolin did not cause upregulation of NAT-SRN in the cell body suggests the possibility that transcription of NAT-SRN is not regulated by CREB. We have previously shown (Puthanveettil et al., 2008) that forskolin treatment causes a rapid increase in the mRNA levels of kinesin, the molecular motor that mediates axonal transport, in sensory neurons. However, it is yet to be determined whether NAT-SRN is transported by kinesin and whether the increase in NAT-SRN in SN neurites correlates with enhanced expression of kinesin mRNAs.

In summary, we have identified expression of a natural antisense RNA of sensorin, an important neuropeptide involved in memory storage. We find that its subcellular localization in sensory neurons is differentially regulated during aging and in response to activation of the cAMP-PKA-CREB pathway. We provide evidence that NATs can be differentially regulated in different subcellular compartments. Further studies are required to delineate the mechanisms and understand the physiological implications of such regulation.

# **ACKNOWLEDGMENTS**

We sincerely thank the Whitehall Foundation, NIMH grant 1 R21MH096258-01A1 and The Scripps Research Institute for their funding support which helped us carry out this research.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fgene.2014.00084/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 November 2013; accepted: 29 March 2014; published online: 22 April 2014.*

*Citation: Kadakkuzha BM, Liu X-A, Narvaez M, Kaye A, Akhmedov K and Puthanveettil SV (2014) Asymmetric localization of natural antisense RNA of neuropeptide sensorin in Aplysia sensory neurons during aging and activity. Front. Genet. 5:84. doi: 10.3389/fgene.2014.00084*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Kadakkuzha, Liu, Narvaez, Kaye, Akhmedov and Puthanveettil. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 25 March 2014 doi: 10.3389/fgene.2014.00057

# *Jan-Wilhelm Kornfeld<sup>1,2</sup> and Jens C. Brüning<sup>1,2,3,4</sup>\**

<sup>1</sup> Cologne Excellence Cluster on Cellular Stress Responses in Aging Associated Diseases, Köln, Germany

<sup>2</sup> Max-Planck-Institute for Neurological Research, Köln, Germany

<sup>3</sup> Department of Mouse Genetics and Metabolism and Center for Molecular Medicine Cologne, Institute for Genetics at the University Hospital of Cologne, University of Cologne, Cologne, Germany

<sup>4</sup> Center for Endocrinology, Diabetes and Preventive Medicine, University Hospital Cologne, University of Cologne, Cologne, Germany

#### *Edited by:*

Romano Regazzi, University of Lausanne, Switzerland

#### *Reviewed by:*

Bernard Mari, Centre National de la Recherche Scientifique, France Zhengyu Jiang, Fox Chase Cancer Center, USA

#### *\*Correspondence:*

Jens C. Brüning, Max-Planck-Institute for Neurological Research, Gleueler Strasse 50, 50931 Köln, Germany e-mail: ai023@uni-koeln.de

Our understanding of genomic regulation was revolutionized by the discovery that the genome is pervasively transcribed, giving rise to thousands of mostly uncharacterized non-coding ribonucleic acids (ncRNAs). Long ncRNAs (lncRNAs) have thus emerged as a novel class of functional RNAs that impinge on gene regulation by a broad spectrum of mechanisms such as the recruitment of epigenetic modifier proteins, control of mRNA decay and DNA sequestration of transcription factors. We review those lncRNAs that are implicated in differentiation and homeostasis of metabolic tissues and present novel concepts on how lncRNAs might act on energy and glucose homeostasis. Finally, the control of circadian rhythm by lncRNAs is an emerging principle of lncRNA-mediated gene regulation.

**Keywords: lncRNAs, glucose homeostasis, metabolism and obesity, non-coding RNA (ncRNA), cell differentiation**

#### **INTRODUCTION**

#### **THE NON-CODING GENOME**

The canonical view of mammalian genomes revolves around the notion that the roughly 20,000 protein-coding genes within mammalian genomes are interspersed by somewhat conserved, yet functionally redundant non-coding regions with only limited regulatory potential. Regulatory properties of these "non-coding" regions were only attributed to *cis*-regulatory elements such as promoters or *cis/trans*-enhancer regions. This paradigm was fundamentally called into question by results obtained from whole-transcriptome sequencing efforts [e.g., by the ENCODE consortium (Birney et al., 2007; Thomas et al., 2007)] over the last decade that have revealed the pervasive transcription of mammalian genomes (Carninci et al., 2005; Birney et al., 2007; Derrien et al., 2012). Although the magnitude of pervasiveness remains under debate (van Bakel et al., 2010; Clark et al., 2011), recent meta-analyses of human ribonucleic acid-sequencing (RNA-Seq) datasets have confirmed that >80% of genomic sequences are rediscovered within RNA transcripts, often in a temporally and spatially specific manner (Hangauer et al., 2013). One logical consequence of pervasive transcription is the abundance of non-coding RNAs (ncRNAs) within mammalian genomes, a phenomenon which holds true for most eukaryotic species ranging from yeast (David et al., 2006), to *Drosophila* (Stolc et al., 2004), plants (Li et al., 2006) and humans (Hangauer et al., 2013). Given the predicted high number of ncRNAs within mammalian genomes, which probably surpasses that of coding genes, it is not surprising that a large conceptual void remains about the multifaceted role of ncRNAs in regulation of gene expression. Researchers have historically divided ncRNAs into small ncRNAs (sRNAs; <200 nt length) such as microRNAs (miRNAs) and small nucleolar RNAs (snoRNAs) in contrast to so-termed long ncRNAs (lncRNAs; >200 nt length).
To date, the identification of biological processes that are regulated by miRNAs, as well as the elucidation of mRNA targets that are posttranscriptionally regulated by disease-associated miRNAs, remains an important focus of research (Bartel, 2009; Carthew and Sontheimer, 2009). It was demonstrated that the spectrum of biological processes regulated by miRNAs ranges from the development of organs and the homeostatic regulation of cellular metabolism to aging and neurodegenerative disorders. Although miRNAs are central to the understanding of the non-coding genome, the regulation of energy homeostasis and metabolism by miRNAs has been meticulously reviewed elsewhere (Lynn, 2009; Rottiers and Naar, 2012; Kim and Kyung Lee, 2013) and goes beyond the scope of this review. In contrast to miRNAs, the role of lncRNAs in control of metabolism and energy homeostasis remains rather elusive. Thus, we here review the known roles for lncRNAs, which probably constitute the numerical majority of ncRNAs encoded within mammalian genomes, during differentiation, homeostasis and metabolic regulation of tissues (**Figure 1**).

#### **LONG NON-CODING RNAs**

Those lncRNAs that were initially discovered in the late eighties had distinct, at that time considered exotic functions such as X chromosome inactivation in females by the lncRNA *XIST* (Penny et al., 1996). Another historical example was the imprinted lncRNA *H19*, which is involved in repression of *Igf2* (Pachnis et al., 1988). After the discovery of pervasive genomic transcription it became clear that lncRNAs do not represent an exotic observation, but rather a prominent feature of the genome (Birney et al., 2007). Although the number of lncRNAs is still debated, recent meta-analyses posit that the human genome gives rise to >60,000 lncRNAs, although the majority is probably expressed at low levels (Derrien et al., 2012; Hangauer et al., 2013; for current lncRNA numbers consult NONCODE Version 4, www.noncode.org). Interestingly, lncRNAs on the one hand exhibit many similarities with protein-coding transcripts: as is true for mRNAs, lncRNAs are transcribed by RNA-polymerase (Pol) II (Guttman et al., 2009), spliced at canonical splicing sites (Chew et al., 2013), are partly polyadenylated (Cabili et al., 2011) and even associate with polysomes (Guttman et al., 2013). Further, lncRNAs harbor the same chromatin marks of H3K4 and H3K36 trimethylation as found in active promoters and transcribed regions of protein-coding transcripts, respectively, a phenomenon which aided in the identification of novel lncRNAs (Mikkelsen et al., 2007). It is noteworthy that the notion of distinct mRNA-like trimethylation marks on actively transcribed lncRNAs is incompatible with the criticism that lncRNAs are merely generated by unspecific Pol II activity which leads to low-level transcription of non-coding sequences ("transcriptional noise"; Derrien et al., 2012). On the other hand, certain features set lncRNAs apart from protein-coding genes: generally, lncRNAs are expressed at lower levels, are less evolutionarily conserved and less frequently associate with ribosomes than protein-coding transcripts (Hangauer et al., 2013). Further, lncRNAs are shorter than coding genes and usually consist of only 1–2 exons. Of note, some lncRNAs do give rise to small peptides and may act as both coding and non-coding transcripts (reviewed in Dinger et al., 2008).

### **PRINCIPLES OF lncRNA-MEDIATED GENE REGULATION**

Currently, intensive research efforts are underway to better understand the molecular basis of gene regulation by lncRNAs. To date, four major paradigms have emerged on how lncRNAs impinge on gene regulation:

#### **(EPIGENETIC) REGULATION OF GENE TRANSCRIPTION**

LncRNAs are able to bring gene-regulatory DNA-binding proteins and DNA sequences into close proximity and thus constitute an ideal docking platform for recruiting epigenetic modifiers to distinct genomic loci in *cis* or in *trans*. Indeed, early insights into lncRNA-based gene regulation have revealed the recruitment of the inhibitory polycomb repressive complex (PRC) 2 and the activating Trithorax/MLL chromatin modifiers to specific genomic loci by the lncRNAs *HOTAIR* (Rinn et al., 2007) and *HOTTIP* (Wang et al., 2011), respectively. PRC2 and MLL then mark distinct lysine residues within histones via trimethylation, leading to inhibition or activation of gene transcription. In a similar fashion, the lncRNA *ANRIL* silences the *INK4a* tumor suppressor allele by H3K27 trimethylation via recruiting the Polycomb chromatin modifier CBX7 (Yap et al., 2010). The percentage of lncRNAs implicated in (epigenetic) gene regulation was systematically quantified by interrogating the PRC2 interactome using chromatin-state maps. This revealed the abundant interaction of Polycomb repressor proteins with up to 20% of expressed lncRNAs (Khalil et al., 2009). Thus, one prominent role of lncRNAs relates to writing and erasing chromatin marks, thereby controlling the epigenetic state of lncRNA-bound genomic loci (Spitale et al., 2011). In a systematic attempt to interrogate the function of 3019 human lncRNAs, Orom et al. (2010) revealed that a significant portion of the lncRNome possesses *cis*-regulatory enhancer properties (hereafter termed enhancer-like RNAs, eRNAs), which control the expression of neighboring protein-coding genes. Elegant follow-up studies using chromosome conformation capture (3C) technology revealed that the co-activator complex Mediator is involved in tethering eRNAs to their gene targets. 
Hence, lncRNAs regulate the three-dimensional (3D) structure of chromosomes via Mediator-dependent chromosome looping (Lai et al., 2013), thereby bridging large intra- and interchromosomal distances in order to activate distal promoters (reviewed in Orom and Shiekhattar, 2013). This study nicely complemented reports about the lncRNA-mediated regulation of *HOXA* genes, in which chromosomal looping brings the eRNA *HOTTIP* in proximity to its target genes, marks the chromatin by H3K4 trimethylation and thus activates gene transcription (Wang et al., 2011). Taken together, the translation of the information content that lies within higher-order (3D) structures of chromosomes into (epigenetic) modifications of chromatin and regulation of gene transcription seems to be an emerging principle of lncRNA function.

#### **PROCESSING/DEGRADATION OF mRNA**

Every step of RNA metabolism is subjected to fine-tuned and complex regulation (reviewed in Moore, 2005). LncRNAs have recently been implicated in the control of RNA stability, the processing of (pre)-mRNAs and the regulation of mRNA decay. Natural antisense transcripts (NATs) are lncRNAs which are characterized by their location antisense to other coding or non-coding transcripts (Faghihi and Wahlestedt, 2009). The upregulation of NATs often causes downregulation of protein-coding transcripts on the opposite strand by the formation of RNA duplexes and triggering of cellular RNAi, although recently the NAT-mediated upregulation of protein-coding transcripts on the opposite strand was also reported (Carrieri et al., 2012). The repressive effect of NATs on the opposite strand not only holds true for duplexes consisting of (i) a protein-coding mRNA and a non-coding lncRNA NAT, but also for (ii) duplexes between a lncRNA NAT and another lncRNA, as demonstrated for lncRNAs which base-pair with and target the *PTEN* pseudogene *PTENpg1* (Johnsson et al., 2013). The lncRNA-mediated regulation of (pre)-mRNA processing was demonstrated for the nuclear retained lncRNA *MALAT1*, which modulates alternative splicing via assembly of serine/arginine splicing factors within subnuclear compartments called nuclear speckles (Tripathi et al., 2010). Finally, the timely degradation of mRNAs by a process called Staufen-mediated decay (SMD) involves lncRNA-recipient sequences in the 3′ UTR of SMD target genes. Here, intermolecular base-pairing between lncRNAs and sequences within the 3′ UTR of Staufen (Stau) target genes triggers a cellular SMD response and the ensuing degradation of the transcript (Gong and Maquat, 2011; Wang et al., 2013). The *in vivo* significance of lncRNA-mediated SMD was underscored by the observation that epidermal differentiation critically depends on lncRNA-elicited mRNA decay. Here, a lncRNA called *TINCR* is recruited to specific sequences called "TINCR box" within *TINCR* target genes and elicits the decay of TINCR-bound transcripts by SMD (Kretz et al., 2013; Kretz, 2013).

#### **POSTTRANSCRIPTIONAL GENE REGULATION**

MiRNAs recognize their gene targets through 6–8 nt "seed" sequences that pair with complementary sites located in the 3′ UTR of protein-coding RNAs (Bartel, 2009). It was demonstrated that recognition, binding and degradation/translational inactivation of miRNA targets is not necessarily confined to protein-coding transcripts. A so-called "competing endogenous RNA (ceRNA) hypothesis" was brought forward according to which protein-coding RNAs, miRNAs, and lncRNA transcripts form large-scale regulatory networks which impinge on the expression of other transcripts independently of protein translation by competing for a limited pool of miRNAs (Salmena et al., 2011). Here, transcripts termed ceRNAs regulate the expression of other transcripts based on the similarity of their 3′ UTR miRNA response element (MRE) profiles. According to this notion, two transcripts with a high degree of common MREs can crosstalk with each other by competing for a given pool of miRNAs. Upregulation of one ceRNA increasingly "sponges" a limited pool of miRNAs and relieves the miRNA-mediated repressive tone on ceRNA-linked transcripts. A ceRNA-like interdependency of protein-coding transcripts was first demonstrated experimentally for the tumor suppressor gene *PTEN* (Karreth et al., 2011; Tay et al., 2011), which "crosstalks" to hitherto unknown tumor suppressors. PTEN loss-of-function during carcinogenesis is also controlled by the genomic loss of its (non-coding) pseudogene *PTENP1*, which acts as a ceRNA (Poliseno et al., 2010). An additional layer of posttranscriptional regulation by lncRNAs accordingly lies within the specific pattern of MREs within lncRNAs, which allows them to influence the expression of coding and non-coding transcripts in a ceRNA-like fashion. 
This is exemplified by the upregulation of a non-coding antisense homologue of the beta-secretase BACE1 (*BACE1-AS*) that acts as BACE1 ceRNA and concomitantly increases Bace1 mRNA stability and leads to augmented deposition of Aβ-plaques in Alzheimer's disease (AD; Faghihi et al., 2008, 2010). Finally, a novel, intriguing class of functional lncRNAs, which is encoded in eukaryotic genomes, is constituted by circular RNAs (circRNAs). CircRNAs are expressed at high levels, can act as ceRNAs and effectively sponge miRNAs as shown for the neuroendocrine miRNA miR-7 (Hansen et al., 2013; Memczak et al., 2013).
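The ceRNA logic described above can be made concrete with a small sketch. The Python example below derives the target site complementary to a miRNA seed, counts such sites (MREs) in a transcript, and scores how many MREs two transcripts share. It is a toy illustration only: all sequences and names (`toy_miR_1`, `TOY_UTR_X`, `TOY_LNC_Y`) are invented for this example, the seed is assumed to span miRNA positions 2–8, and the sum-of-minima score is a simple stand-in for "similarity of MRE profiles", not a method taken from the cited studies.

```python
"""Toy illustration of seed matching and shared miRNA response elements (MREs)."""

# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the target-site sequence (5'->3') complementary to the miRNA seed.

    The seed is taken as miRNA positions start..end (1-based, inclusive);
    using positions 2-8 is an assumption of this sketch.
    """
    seed = mirna[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_sites(transcript: str, site: str) -> int:
    """Count (possibly overlapping) occurrences of a site in a transcript."""
    k = len(site)
    return sum(transcript[i:i + k] == site for i in range(len(transcript) - k + 1))


def mre_profile(transcript: str, mirnas: dict) -> dict:
    """Map each miRNA name to the number of seed-match sites in the transcript."""
    return {name: count_sites(transcript, seed_match_site(seq))
            for name, seq in mirnas.items()}


def shared_mres(profile_a: dict, profile_b: dict) -> int:
    """Score MRE sharing as the sum, over miRNAs, of the smaller site count."""
    return sum(min(count, profile_b.get(name, 0)) for name, count in profile_a.items())


# Hypothetical miRNAs and transcripts (made up for illustration, not real sequences).
TOY_MIRNAS = {
    "toy_miR_1": "UACGGAUCCGAUUGCAUGCAAU",
    "toy_miR_2": "AGCUUGGACAUCGGAUUACGCA",
}
TOY_UTR_X = "AAGAUCCGUAAUCCAAGCAAGAUCCGUAA"   # stands in for an mRNA 3' UTR
TOY_LNC_Y = "CCGAUCCGUCCUCCAAGCCCUCCAAGCCC"   # stands in for a lncRNA
TOY_UNRELATED = "ACACACACACACACAC"            # transcript without any MRE

PROFILE_X = mre_profile(TOY_UTR_X, TOY_MIRNAS)
PROFILE_Y = mre_profile(TOY_LNC_Y, TOY_MIRNAS)
PROFILE_U = mre_profile(TOY_UNRELATED, TOY_MIRNAS)

if __name__ == "__main__":
    print("X vs Y shared MREs:", shared_mres(PROFILE_X, PROFILE_Y))
    print("X vs unrelated shared MREs:", shared_mres(PROFILE_X, PROFILE_U))
```

In this toy setting the mRNA-like and lncRNA-like transcripts share MREs for both miRNAs and would therefore be candidates for ceRNA crosstalk, whereas the unrelated transcript shares none. Real analyses would additionally need to weigh transcript and miRNA abundance, which this sketch deliberately ignores.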

#### **REGULATION OF PROTEIN ACTIVITY**

Ribonucleic acid possesses the unique biochemical property to recognize and bind most biomolecules including proteins with unprecedented affinity (Stoltenburg et al., 2007). Thus, lncRNAs can specifically bind proteins and elegant studies have attributed novel roles for lncRNAs in the control of tissue homeostasis via direct binding and modification of protein activity. For example, a lncRNA termed *Evf2* was shown to form stable complexes with members of the Dlx/Dll family of transcription factors, which are crucial regulators of developmental timing in vertebrates, and thereby regulate their transcriptional output (Feng et al., 2006). Further, two lncRNAs termed *PRNCR1* and *PCGEM1*, which are upregulated in aggressive prostate cancer, synergistically and coordinately bind the carboxy-terminal part of the androgen receptor (AR) and are required for AR-dependent gene transcription. In androgen-refractory prostate cancer, *PRNCR1* and *PCGEM1* are robustly expressed and are implicated in the ligand-independent activation of AR signaling [AR "resistant" prostate cancer (Yang et al., 2013)]. Another emerging paradigm of lncRNA-mediated regulation of protein activity is the sequestration of transcription factors as exemplified by the lncRNA *Gas5*, which is induced under conditions of nutrient deprivation and cellular stress. *Gas5* acts as glucocorticoid receptor (GR) decoy by competing with GR-responsive elements (GREs) in gene promoters for binding to the DNA-binding domain of the GR (Kino et al., 2010). Increased levels of *Gas5* thus interfere with GR binding to the DNA and effectively inhibit transactivation of GR-dependent gene promoters. Another example is nuclear factor kappa B (NFkB) signaling, which translates extracellular, proinflammatory cues [e.g., by tumor necrosis factor alpha (TNF-α) receptor activation] into changes in gene expression. 
NFkB activation induces the transcription of a specific subset of lncRNAs, apart from the induction of classical inflammatory protein-coding genes. Among this subset of TNF-regulated lncRNAs, a lncRNA termed *Lethe* is recruited to the NFkB effector subunit RelA in an inducible fashion and inhibits RelA from DNA binding and target gene activation (Rapicavoli et al., 2013). Finally, the hypoxia-regulated lncRNA *linc-p21* was shown to physically interact with hypoxia-inducible factor (HIF) 1alpha transcription factors. This HIF1a-*linc-p21* circuit controls the hypoxia-evoked increases of the glycolytic "Warburg effect" in tumor cells (Yang et al., 2014).

### **LncRNAs IN CONTROL OF METABOLISM**

The regulation of metabolism and glucose homeostasis is orchestrated and fine-tuned by a complex interplay of tissues/organs. Currently, we are faced with an unprecedented rise of obesity in the industrialized world and the concurrent increase in obesity-associated diseases such as insulin resistance and type 2 diabetes mellitus (T2D). Key to the understanding of whole-body metabolism are the pleiotropic effects of the anabolic master regulator insulin, which simultaneously controls peripheral as well as central nervous system-related aspects of metabolism (Kahn et al., 2006). Resistance toward the effects of insulin constitutes a key step in the development of metabolic disease. The exciting observation that insulin and insulin-like growth factor (IGF) 1 signaling also triggers distinct changes in lncRNA expression [e.g., of the lncRNA *CRNDE* (Ellis et al., 2013)] suggests that lncRNAs may also be implicated in the metabolic effects of insulin and the development of insulin resistance. Thus, there is strong interest in identifying lncRNA-mediated mechanisms governing energy and glucose homeostasis at the cell-intrinsic, organ and whole-body level.

### **TISSUE-SPECIFIC REGULATION OF METABOLISM BY lncRNAs**

#### **MAINTENANCE OF PANCREATIC BETA CELL IDENTITY**

The main function of pancreatic islets lies within the synthesis, storage and secretion of insulin and glucagon, two hormonal regulators of glucose homeostasis. The possible control of islet development and function by lncRNAs was first demonstrated in studies which reported that the lncRNA *H19* is involved in the intergenerational transmission of diabetes mellitus [gestational diabetes mellitus (GDM)] and the GDM-associated impairments of islet infrastructure and function (Ding et al., 2012). Global lncRNA screening approaches conducted by Moran et al. (2012) systematically interrogated the lncRNA transcriptome in human pancreatic beta cells. Here, the dynamic, strand- and tissue-specific regulation of >1,000 lncRNAs was reported using integrated transcriptional and chromosomal maps. Utilizing RNA-Seq data of 16 non-pancreatic tissues, the aforementioned gene set of pancreatic lncRNAs was shown to be significantly more specific for islet cells (40–55% for intergenic and antisense lncRNAs, respectively) than protein-coding genes (9.4%). Furthermore, the upregulation of islet-specific lncRNAs during progenitor commitment, glucose-stimulated upregulation and the striking dysregulation of islet-specific lncRNAs in patients with T2D pointed to a pathophysiological role of lncRNAs in the homeostasis of pancreatic tissues. The fact that a significant percentage of mouse and human lncRNA orthologs display similar cell- and stage-specific expression patterns suggests that evolutionarily conserved properties of lncRNAs extend beyond their primary sequence. This study was corroborated by a publication from the McManus laboratory, which presented a new catalog of the human beta cell (non-coding) lncRNA transcriptome in which >1,000 lncRNAs were expressed in an islet-specific fashion involving islet-specific splicing events and promoter utilization (Ku et al., 2012). 
However, the molecular mechanisms underlying lncRNA-mediated regulation of beta cell differentiation and function still await elucidation.

#### **REGULATION OF ADIPOGENESIS AND ADIPOSE TISSUE PLASTICITY**

The body harbors two principal types of adipose tissues which possess key functions in regulating the equilibrium between nutrient deposition and energy expenditure: whereas white adipose tissue (WAT) serves as storage organ for excess nutrients, brown adipose tissue (BAT) dissipates the proton gradient across mitochondrial membranes to generate heat via the BAT-intrinsic uncoupling protein 1 (UCP1; Bartelt and Heeren, 2012). The accumulation of excess lipids that leads to low-grade inflammation in WAT has been linked to the development of insulin resistance in obese patients (Saltiel and Kahn, 2001; Gregor and Hotamisligil, 2011; Glass and Olefsky, 2012). Also, impaired BAT thermogenesis can contribute to the development of insulin resistance and obesity (Connolly et al., 1982; Feldmann et al., 2009). The fact that lncRNAs are implicated in the differentiation of adipose tissues (adipogenesis) is exemplified by the lncRNA *SRA*, which is required for full transactivation of the proadipogenic transcription factor peroxisome proliferator-activated receptor gamma (Pparg). Accordingly, RNAi-mediated *SRA* loss-of-function interfered with *in vitro* differentiation of 3T3-L1 preadipocytes (Xu et al., 2010). In a seminal study by Sun et al. (2013), the systematic implication of lncRNAs during adipogenesis was addressed. Using global transcriptome profiling of undifferentiated and mature adipocytes from the WAT and BAT lineages, the significant and specific regulation of 175 lncRNAs during adipogenesis was reported (Sun et al., 2013), of which a significant portion were enriched within adipose tissues. Finally, subsets of newly identified lncRNAs termed lncRAPs (lncRNAs Regulated in AdiPogenesis) were depleted *in vitro* using siRNAs. Distinct lncRAPs, which were specifically upregulated during adipogenesis and were induced by the proadipogenic transcription factors Cebpa and Pparg, were required for timely and complete maturation of adipocyte progenitor cells. 
These studies provide first evidence for a crucial role of lncRNAs in the control of adipogenesis and fat cell metabolism.

#### **DIFFERENTIATION OF SKELETAL MUSCLE AND CARDIOMYOCYTES**

The differentiation of skeletal muscle cells (myogenesis) is regulated by a complex, yet well understood, evolutionarily conserved circuitry of protein-coding genes which control the timely growth, morphogenesis, and terminal maturation of muscle progenitors (myoblasts; Buckingham and Vincent, 2009). Here, the implication of noncoding RNAs was first shown via the contribution of myogenic miRNAs (myomiRs like miR-1 and miR-133) during myoblast commitment (Chen et al., 2006; Liu et al., 2008). Gong and Maquat (2011) reported that in human cells the degradation of distinct, nascent coding transcripts by Staufen-mediated decay (SMD) was regulated by lncRNAs. Here, the intermolecular base-pairing between *Alu* elements located within the 3′ UTR of an SMD target and an *Alu* site localized within a class of lncRNAs called *1/2sbsRNAs* (1/2-STAU1-binding site RNAs) triggered SMD (Gong and Maquat, 2011). Interestingly, this process was conserved in rodents, which lack canonical *Alu* repeats. Here, the mouse homologue of *1/2sbsRNA* was shown to be implicated in terminal differentiation of myoblast cells (Wang et al., 2013), indicating a function of lncRNAs in myogenesis. Another lncRNA termed *linc-MD1* is also critical for myogenesis. Here, increased levels of *linc-MD1* trigger the muscle differentiation program by acting as a natural decoy for myomiRs miR-133 and miR-135 (Cesana et al., 2011). MiR-133 and -135 in turn repress the expression of two pro-myogenic transcription factors, MAML1 and MEF2C. Recent reports also revealed that *linc-MD1* takes part in a molecular feedforward circuit involving the promyogenic protein HuR (Legnini et al., 2014). Collectively, *linc-MD1* promotes terminal differentiation of myoblasts by acting as ceRNA for myogenic transcriptional regulators and sequestering anti-myogenic miRNAs. Interestingly, this complex ceRNA-based interplay of classical mRNAs, lncRNAs, and miRNAs was dysregulated in patients suffering from Duchenne muscular dystrophy (DMD), a condition of reduced terminal differentiation of myoblasts. Restoring *linc-MD1* expression, which is downregulated in DMD, via lentiviral delivery led to improved maturation of DMD myoblasts. In a study published by Klattenhoff et al. (2013), the heart-intrinsic lncRNA *Braveheart* (*Bvht*) was demonstrated to be required for differentiation of mesodermal progenitors toward mature cardiomyocytes via interaction with PRC2 epigenetic modifiers. This report for the first time implicated a tissue-specific lncRNA in maintaining cell fate during mammalian organogenesis.

#### **REGULATION OF NEUROGENESIS BY lncRNAs**

The discovery that peripherally secreted hormones such as insulin and leptin control energy homeostasis and glucose metabolism via CNS-acting neurocircuits expanded our understanding of how the body ingests, stores and dissipates energy (Belgardt and Bruning, 2010). In an approach to identify lncRNAs that are implicated in brain development and neurogenesis, Aprea et al. (2013) utilized transgenic *in vivo* approaches to isolate neural stem cells, partially committed neuronal precursor cells as well as terminally differentiated neurons and quantified the expression of lncRNAs. Several lncRNAs were identified that were involved in neurogenesis, neuroblast commitment and neuron survival as shown for the neuroregulatory lncRNA *Miat*. Thus, maintenance of the neuron stem cell pool and terminal differentiation of neuron progenitors are also under lncRNA-mediated control. This will hopefully prompt future studies that specifically address how lncRNAs regulate the defined neuronal circuits controlling peripheral metabolism.

#### **REGULATION OF CIRCADIAN RHYTHM BY lncRNAs**

The mammalian clock plays a fundamental role in the regulation of energy and glucose homeostasis. Dysregulation of the circadian rhythm underlies several metabolic pathologies like the development of insulin resistance and the metabolic syndrome (Marcheva et al., 2010; Hatori et al., 2013). In addition to the central clock located in the suprachiasmatic nucleus (SCN) of the hypothalamus in the CNS, subordinate, tissue-specific clocks exist which are also key for the regulation of diurnal aspects of lipid metabolism, oscillations in core body temperature and timely insulin secretion from pancreatic beta cells (Cretenet et al., 2010; Marcheva et al., 2010; Gerhart-Hines et al., 2013). Interestingly, as found for plants (Hazen et al., 2009), lncRNAs are involved in the regulation of vertebrate circadian systems. A study published by Coon et al. showed that 112 lncRNAs are differentially expressed between day and night within the pineal gland of rats (Coon et al., 2012). An in-depth investigation of eight highly rhythmic lncRNAs revealed the pivotal role of neuronal projections from the SCN as well as external zeitgebers like light exposure for the periodicity and amplitude of circadian lncRNAs. In addition, a report from Vollmers et al. (2012) observed that rhythmic expression of ncRNAs like NATs, lncRNAs and miRNAs leads to rhythmic chromatin modifications in the liver. Notably, the circadian oscillator component Per2 itself is controlled by an antisense lncRNA termed *asPer2*. Reports about the brain-derived regulation of circadian metabolism remain scarce, yet the pathogenesis of Prader–Willi syndrome (PWS), a CNS-controlled genetic disorder of circadian rhythm with an associated dysregulation of metabolism and the development of obesity, was shown to be influenced by a PWS-associated lncRNA called *116G*. After splicing, a lncRNA consisting of the remnants of *116G* (termed *116HG*) binds to the transcriptional activator RBBP5 and ensures a physiological circadian rhythm in the brain. Mice deficient for *116HG* exhibit metabolic disorders due to the dysregulation of diurnally expressed circadian genes like *Clock*, *Cry1*, and *Per2* in the CNS (Powell et al., 2013).

#### **EMERGING CONCEPT: INTERCELLULAR COMMUNICATION BY EXOSOMAL lncRNAs?**

Exosomes are small vesicles generated by budding of the plasma membrane and constitute a specific vehicle for intercellular communication. Upon release from donor cells, exosomal surface motifs serve as "address codes" for binding and endocytosis on acceptor cells. Specific exosomal shuttling RNAs (esRNAs) such as miRNAs can be packaged into exosomes and released after binding to recipient cells, thus constituting a novel and intriguing way for ncRNAs to regulate systemic aspects of metabolism (Ramachandran and Palanisamy, 2012). Similarly, the intercellular transport of high-density lipoprotein (HDL)-bound miRNAs that are released by distinct donor cells influences the miRNA profile of acceptor cells and concomitantly alters gene expression in HDL-recipient target tissues (Vickers et al., 2011). Of note, deep sequencing of human exosomes revealed that lncRNAs are localized within micro-vesicles and may emerge as novel means of cellular communication (Huang et al., 2013). Although experimental proof of concept is still lacking, the endocrine transfer of exosomal lncRNAs might represent a novel facet relevant for lncRNA-mediated control of metabolism.

### **THERAPEUTIC OPPORTUNITIES OF lncRNA INHIBITION**

A high economic interest lies in the development of sequence-specific compounds for the inhibition of disease-associated ncRNAs. Short, chemically modified ribonucleic acid compounds like locked nucleic acids (LNAs) efficiently silence the expression of ncRNAs such as miRNAs and are generally well tolerated *in vivo* (Krutzfeldt et al., 2005; Esau et al., 2006; Elmen et al., 2008). These anti-RNA compounds were initially tested in mice (Krutzfeldt et al., 2005) and adapted to non-human primates (Elmen et al., 2008) with unprecedented speed. This approach will hopefully be extended to disease-associated lncRNAs in the near future. Most *in vivo* studies to date concentrated on disease-associated miRNAs that were critically involved in the development of insulin resistance and the deterioration of metabolic health (Jordan et al., 2011; Trajkovski et al., 2011; Zhou et al., 2012; Kornfeld et al., 2013). In contrast, most insights concerning the metabolic functions of lncRNAs were inferred from *in vitro* studies. The rising numbers of lncRNA knockout models [exemplified by a recent report on 18 lncRNA loss-of-function mouse models (Sauvageau et al., 2013)] showcase that, in order to convincingly assess whether lncRNAs are implicated in the *in vivo* control of metabolism, further animal models for lncRNA loss- and gain-of-function are needed. This is of timely importance as systemic antisense oligonucleotide (ASO)-mediated inhibition of disease-associated lncRNAs (even in difficult to target organs like skeletal muscle) effectively improves degenerative diseases like myotonic dystrophy type 1 (DM1) in mice (Wheeler et al., 2012).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 January 2014; accepted: 05 March 2014; published online: 25 March 2014. Citation: Kornfeld J-W and Brüning JC (2014) Regulation of metabolism by long, non-coding RNAs. Front. Genet. 5:57. doi: 10.3389/fgene.2014.00057*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Kornfeld and Brüning. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Long non-coding RNAs learn the importance of being *in vivo*

# *Jhumku D. Kohtz\**

*Developmental Biology and Department of Pediatrics, Lurie Children's Research Center, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA \*Correspondence: j-kohtz@northwestern.edu*

#### *Edited by:*

*Yingqun Huang, Yale University School of Medicine, USA*

#### *Reviewed by:*

*Marie-Louise Hammarskjold, University of Virginia, USA*

**Keywords: Evf2, lncRNA knockout mice, neural development, transcription, lncRNA antisense regulation, pain modulation, microbial susceptibility, lung cancer**

In the past few years, the long non-coding RNA (lncRNA) field has been dealt some major surprises. While some phenotypes of mice lacking lncRNAs reveal potential targets for treating diverse human diseases, others do not match the expectations from experimental manipulations in cell lines reported over the last 10 years. In effect, it has become clear that principles learned about lncRNA functions in cell lines can be very different when tested in animal models (*in vivo*).

The imprinting/dosage compensation and developmental biology fields, older and wiser crowds, are rolling their eyes.

Historically, there were a small number of well-characterized lncRNAs. Among these were the classic lncRNAs *Xist*, *roX, H19, Air, KCNQ1OT1,* and *UBE3a* (Lee and Bartolomei, 2013). These lncRNAs regulate imprinting and/or dosage compensation, and were studied almost exclusively in animal models (mice or Drosophila). In 2006, elegant studies in Drosophila showed that *trans*-acting lncRNAs (TREs) regulate transcription of the Ubx1 homeodomain transcription factor (Sanchez-Elsner et al., 2006). Along with the explosion of lncRNAs identified in the genomic era, *trans*-acting transcriptional activities of vertebrate lncRNAs, *SRA* (Lanz et al., 1999), *Evf2* (Feng et al., 2006), and *HOTAIR* (Rinn et al., 2007) were reported. However, unlike the majority of previous lncRNA experiments, initial *SRA*, *Evf2* and *HOTAIR* studies relied on cell lines to assay for lncRNA activity. "*Trans*" activities gave these lncRNAs the potential for global effects, distinguishing them from their *cis*-acting imprinting/dosage compensating counterparts.

In 2004, the *Nature* editor refused to send our paper on *Evf2* lncRNA *trans*-acting transcriptional activity out for peer review, stating that a knockout mouse model was necessary. This was not unexpected, as "knockout first, ask questions later," had been the *modus operandi* at the NYU Skirball Institute, where scientists (including myself) were indoctrinated regarding the importance of *in vivo* studies. Thankfully, Terry Grodzicker, the editor at *Genes and Development*, did not share the *Nature* editor's views, and agreed to send our paper out for peer review. This led to publication of our work on *Evf2 trans*-acting activity in 2006 (Feng et al., 2006).

In retrospect, views at Skirball and *Nature* may have been correct: *Evf2* cell line assays predicted lncRNA enhancer activation in *trans* (Feng et al., 2006), while *Evf2*<sup>TS/TS</sup> mice (lacking *Evf2*) a few years later indicated lncRNA repression in *cis* (Bond et al., 2009). In mice, *Evf2* recruits both transcriptional activator (DLX's) and repressor (MECP2), and through antisense regulation represses adjacent gene expression (Bond et al., 2009). Recent experiments show that *Evf2* prevents enhancer CpG site-specific methylation, in *trans*, but that methylation effects may not be sufficient to regulate gene expression (Berghoff et al., 2013). Both loss-of-function and gain-of-function *Evf2* mouse models, as well as additional mouse models lacking Dlx1/2 and Mecp2, support the proposed mechanism (Berghoff et al., 2013). Relevant to ongoing studies, mice lacking *Evf2* have reduced inhibition in the adult brain, resulting from developmentally generated interneuron defects (Bond et al., 2009). Taken together, *Evf2* work suggests that lncRNA-dependent positive and negative transcription factor recruitment and enhancer DNA methylation inhibition contribute to gene dosage regulation, rather than essential gene regulation (Mattick, 2013). Mice lacking *Evf2* exhibit a different adult phenotype than would have been predicted from studies in cell lines. While demonstrating *Evf2* activity in cell lines was critical in prompting and designing subsequent work, present models for the role of *Evf2* in transcription and neuronal development rely on results obtained in mice.

Mice lacking the well-characterized lncRNAs *NEAT1*, required for paraspeckles (Nakagawa et al., 2011), *MALAT1*, localized in nuclear speckles (Nakagawa et al., 2012), or *HOTAIR*, which recruits histone modification complexes that regulate Hox genes (Schorderet and Duboule, 2011; Li et al., 2013), also challenge previous data obtained in cell lines.

Loss of *NEAT1* in mice shows that paraspeckles, previously thought to be a critical subnuclear compartment, are not necessary for mouse development (Nakagawa et al., 2011). Mice lacking *MALAT1* (*NEAT2*), previously thought to be critical for nuclear speckles and splicing, show no morphological alterations (Nakagawa et al., 2012). However, a dramatic phenotype is reported in *MALAT1* conditional knockout (cKO) mice and in mice treated with antisense oligos to *MALAT1* (Eissmann et al., 2012). Both methods of reducing *MALAT1* substantially reduce lung tumor metastasis (Eissmann et al., 2012). In the MALAT1cKO model, gene expression adjacent to *MALAT1* is affected, but not global splicing (Eissmann et al., 2012). Since the MALAT1cKO model removes a piece of DNA in addition to removing the MALAT1 transcript, *cis* gene effects resulting from DNA loss cannot be distinguished from those resulting from RNA loss. The latter effect will need to be tested in the Nakagawa MALAT1 mice, where a triple polyA (Transcription Stop, TS, Soriano, 1999) insertion prevents lncRNA expression.

There is a similar problem with *HOTAIR* loss-of-function mouse models (Schorderet and Duboule, 2011; Li et al., 2013), as well as a recent screen for novel lncRNAs (Sauvageau et al., 2013), where DNA deletion rather than TS insertion is utilized. In HOTAIRcKO mice, removal of both HoxC and *HOTAIR* does not change HoxD H3K27me3 profile or gene expression in E13.5 embryos (Schorderet and Duboule, 2011). HOTAIRcKO skeletal phenotypes and gene regulatory phenotypes are mild, with 2-fold or less changes in HoxD10 and HoxD11 expression (Li et al., 2013). However, when *HOTAIR*<sup>−/−</sup> cells are placed in culture, significant differences in HoxD gene expression and H3K27me3 profile are detected, suggesting different roles of *HOTAIR* in cell lines and *in vivo* (Li et al., 2013).

Given that so many of the recent lncRNA models use cKO's to remove lncRNA from mice, an important point to address here is how lncRNA biologists choose to remove lncRNA expression from mice. cKO mice using cre-directed removal have the advantage of tissue- and developmental stage-specific loss, avoiding prenatal and heterozygote lethality. However, in the absence of rescue, determining whether phenotypic effects result from RNA or DNA loss is not possible. If an lncRNA works in *cis*, rescue is unlikely to change gene expression. One example is our transgenic rescue experiments, where *Evf2* expressed from a transgene in mice lacking endogenous *Evf2* (*Evf2*<sup>TS/TS</sup>) rescues enhancer methylation, but not *cis* gene expression effects (Berghoff et al., 2013).

In addition to avoiding DNA removal, TS insertion is an efficient means of terminating lncRNA transcription, as first reported for *Tsix* (96% *Tsix* RNA reduction) (Luikenhuis et al., 2001); TS insertion was also used to terminate *AIR* expression in mice and determine the role of *AIR* in imprinting in mice (Sleutels et al., 2002). A number of lncRNA models, including *Evf2*<sup>TS/TS</sup> (Bond et al., 2009), have successfully used TS to terminate lncRNA expression. Therefore, unless embryonic lethality of heterozygotes is predicted, TS insertion is the method of choice for preventing lncRNA transcription in mice.

Two very different and exciting reports of lncRNA *in vivo* significance were recently published (*NeST*, Gomez et al., 2013 and *Kcna2AS*, Zhao et al., 2013). In the first report, *NeST*, an lncRNA encoded by the murine viral susceptibility locus, *Tmevp3*, controls Salmonella susceptibility and alters interferon-γ H3K4me3 (Gomez et al., 2013). *NeST* was identified based on differences in microbial susceptibility between two congenic strains of mice (B10.S and SJL/J), and demonstrates the power of genetics and lncRNA biology when combined (Gomez et al., 2013). *Kcna2AS* is an antisense lncRNA that negatively regulates Kcna2, a voltage-dependent potassium channel expressed in afferent neurons (Zhao et al., 2013). Knockdown of *Kcna2AS* reduces neuropathic pain in a rat model, identifying a novel target for pain modulation (Zhao et al., 2013). Results from *NeST*, *Kcna2AS*, and *MALAT1* lncRNAs have major implications for developing treatments for infectious and neurological diseases, as well as lung cancer.

While the arguments for utilizing mouse models to study lncRNA mechanism and significance are clear, there are several arguments, in addition to the discovery argument, to continue studies in cell lines. For instance, in the field of regenerative medicine, lncRNAs have the potential to guide human or mouse embryonic stem cells toward specific lineages, or reprogram induced pluripotent stem cells. Work on lncRNAs controlling retinal fate specification in mice, *RNCR2* and *Six3OS* (Rapicavoli et al., 2010, 2011), predicted that lncRNAs may be used to guide embryonic stem cell differentiation *in vitro*. Although its role *in vivo* has yet to be determined, the Braveheart (*Bvht*) lncRNA directs cardiovascular lineage commitment in embryonic stem cells (Klattenhoff et al., 2013), a holy grail in the cardiac field. While studies of human- or primate-specific lncRNAs may not yield useful information in rodent models, manipulation in human embryonic stem cells may reveal their functions. The cross-information obtained from *in vitro* and *in vivo* studies is likely to be most powerful when generated in the right system for the right purpose.

# **CONCLUSIONS**

Although one may dismiss the differences between lncRNA activities in cell lines and *in vivo* described above as a biological anomaly, such differences are not specific to lncRNA studies. The REST conditional mouse knockout serves as a salient example of how a whole field can be surprised and challenged when a key *in vivo* experiment refutes previous dogma (Aoki et al., 2012). Going against a long-standing belief that REST plays a critical role in neurogenesis, Aoki et al. (2012) show that REST is only required to repress neuronal genes in non-neuronal cells, but not in neuronal progenitors, *in vivo*.

Determining biological significance using *in vivo* models is not only important to grant reviewers, NIH program officers, the editors of some journals, and human disease, but it is important for answering questions that eventually establish the basic principles in the field. In the case of modern lncRNAs, mechanistic studies in cell lines have so far outweighed studies in mice. However, multiple *in vivo* models are shaking up some of the previous lncRNA dogma, revealing lncRNA biological significance and functional diversity, as well as guiding the future of the lncRNA field.

#### **ACKNOWLEDGMENTS**

Jhumku D. Kohtz is funded by NIMH R01MH094653.

#### **REFERENCES**


brain ncRNA is critical for adult hippocampal GABA circuitry. *Nat. Neurosci.* 12, 1020–1027. doi: 10.1038/nn.2371


and gene derepression. *Cell Rep.* 5, 3–12. doi: 10.1016/j.celrep.2013.09.003


*Received: 31 January 2014; accepted: 11 February 2014; published online: 04 March 2014.*

*Citation: Kohtz JD (2014) Long non-coding RNAs learn the importance of being in vivo. Front. Genet. 5:45. doi: 10.3389/fgene.2014.00045*

*This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Kohtz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Molecular mechanisms of long ncRNAs in neurological disorders

# *Dubravka Vučićević<sup>1</sup>, Heinrich Schrewe<sup>2</sup> and Ulf A. Ørom<sup>1</sup>\**

<sup>1</sup> Otto Warburg Laboratory, Max Planck Institute for Molecular Genetics, Berlin, Germany

<sup>2</sup> Department of Developmental Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany

#### *Edited by:*

Yingqun Huang, Yale University School of Medicine, USA

#### *Reviewed by:*

King-Hwa Ling, Universiti Putra Malaysia, Malaysia Zhengyu Jiang, Fox Chase Cancer Center, USA

#### *\*Correspondence:*

Ulf A. Ørom, Otto Warburg Laboratory, Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany e-mail: oerom@molgen.mpg.de

Long non-coding RNAs (ncRNAs) have added an unexpected layer of complexity in the regulation of gene expression. Mounting evidence now links long ncRNAs to fundamental biological processes such as development and differentiation, and recent research shows important involvement of long ncRNAs in a variety of diseases including neurodegenerative disorders, such as Parkinson's, Alzheimer's, spinocerebellar ataxia, and Huntington's diseases. Furthermore, long ncRNAs are speculated to be implicated in development of psychiatric disorders such as schizophrenia and bipolar disorders. Long ncRNAs contribute to these disorders in diverse ways, from regulation of transcription to modulation of RNA processing and translation. In this review, we describe the diverse mechanisms reported for long ncRNAs, and discuss how they could mechanistically be involved in the development of neurological disorders.

**Keywords: neurological disorders, long non-coding RNA, protein-RNA interaction, ncRNA, brain development**

# **INTRODUCTION**

Recent technological advances such as next generation sequencing have revealed pervasive transcription of mammalian genomes (Djebali et al., 2012). It has been reported that, whereas only a small fraction of the human genome codes for proteins, 60% is being transcribed into transcripts without protein coding capacity (Derrien et al., 2012; Djebali et al., 2012). The majority of these transcripts are referred to as long non-coding RNAs (ncRNAs). The transcripts are often annotated as such judged by the lack of an appreciable open reading frame (Derrien et al., 2012).

Although only a very small fraction of annotated long ncRNAs has been well characterized, these examples show an involvement at every level of the gene expression program (Ulitsky and Bartel, 2013). Long ncRNAs have been reported to occur both as spliced, polyadenylated, and capped transcripts often transcribed by RNA polymerase II, resembling mRNAs in their physical structure (Derrien et al., 2012), and as non-polyadenylated single-exon transcripts often involved in enhancer function (Orom and Shiekhattar, 2013). In the current review, we focus on the former group of long ncRNAs, and provide an overview of their involvement in neurological disorders.

A dominating view is that long ncRNAs often work in complex with proteins to bring about regulatory functions (Rinn et al., 2007; Tripathi et al., 2010; Bertani et al., 2011; Gong and Maquat, 2011; Wang et al., 2011; Lai et al., 2013), emphasizing one of the areas of intensive research. Many long ncRNAs have been shown to bind and guide chromatin remodeling factors to specific *loci* in the genome (Rinn et al., 2007; Bertani et al., 2011; Wang et al., 2011). Guttman et al. (2011) speculated that long ncRNAs can provide targeted specificity of individual chromatin remodelers in different cellular settings. Long ncRNAs have also been shown to bind several chromatin remodelers at the same time to coordinate their activities (Tsai et al., 2010). In addition, there are several examples of long ncRNAs regulating expression of genes post-transcriptionally (Geisler and Coller, 2013; Ulitsky and Bartel, 2013).

A large fraction of tissue-specific long ncRNAs are expressed in the brain (Derrien et al., 2012). Furthermore, the majority of brain-specific long ncRNAs are specifically expressed in particular regions, cell types or even subcellular compartments (Mercer et al., 2008, 2010; Derrien et al., 2012), suggesting specific regulatory roles in subsets of specialized cells. Many of these long ncRNAs have been shown to be functionally implicated in brain development. For example, the long ncRNA metastasis associated lung adenocarcinoma transcript 1 (MALAT1) works by regulating the activity of splicing factors and controlling the expression of genes involved in synapse formation, density, and maturation (Bernard et al., 2010). Additionally, a growing number of long ncRNAs have been shown to regulate expression of genes/proteins with crucial roles in neurological disorders (see **Table 1** for an overview of long ncRNAs involved in neurological disorders reviewed here in detail).

# **LONG ncRNAs REGULATE TRANSCRIPTION OF GENES ASSOCIATED WITH NEUROLOGICAL DISORDER**

Long ncRNA antisense non-coding RNA in the INK4 locus (ANRIL) has been associated with hereditary cutaneous malignant melanoma, prostate cancer and tumors of the neural system (Pasmant et al., 2011). Furthermore, genome-wide association studies have identified the ANRIL gene as a risk locus for coronary disease, intracranial aneurysm, type 2 diabetes and several cancers including glioma (Pasmant et al., 2011). ANRIL is an antisense RNA transcript overlapping the INK4b/ARF/INK4a locus and participates directly in its epigenetic repression. The INK4b/ARF/INK4a locus encodes p15, p16, and the p14ARF protein, three major players in cell fate determination (Pasmant et al., 2011). p15 and p16 are major players in the retinoblastoma (Rb) signaling pathway. Their inactivation in cells leads to inactivation of Rb, a well-studied tumor suppressor protein, and progression through the cell cycle. p14ARF activates Rb as well as the tumor suppressor p53 by promoting the degradation of MDM2. Its inactivation can also lead to cell cycle arrest (Pasmant et al., 2011). The components of the INK4b/ARF/INK4a locus are repressed by both polycomb repressive complex 1 (PRC1) and PRC2 (Popov and Gil, 2010). Yap et al. (2010) showed that by binding to the CBX7 subunit of the PRC1 complex, ANRIL compromises its capacity to repress the INK4b/ARF/INK4a locus and control senescence in mouse embryonic fibroblasts (Pasmant et al., 2011). These data indicate that ANRIL regulates a gene locus that codes for major players involved in control of cell cycle progression and disease.

The antisense long ncRNA BDNF-AS regulates the expression of the sense strand encoded brain derived neurotrophic factor (BDNF). This protein belongs to a class of secreted growth factors that are essential for neuronal growth, maturation, differentiation, and maintenance. Its expression is impaired in neurodegenerative as well as psychiatric disorders. For example, Huntington's disease (HD) patients have reduced levels of BDNF. Recently it was shown that knock-down of BDNF-AS resulted in upregulation of BDNF (Modarresi et al., 2012). BDNF-AS mediates its effect via PRC2. PRC2 represses gene expression through methylation of Lysine 27 of histone H3 (H3K27me2/3) by its catalytic subunit enhancer of zeste homolog 2 (EZH2) (Czermin et al., 2002). It was shown that upon knock-down of BDNF-AS the occupancy of EZH2 as well as H3K27me3 was reduced at the BDNF promoter (Vashishtha et al., 2013). Thus, BDNF-AS inhibits BDNF transcription by recruiting EZH2 to the BDNF promoter region and in that way plays an important role in the development of HD.

A recent study indicated that a subset of long ncRNAs, called activating long ncRNAs (ncRNA-a), is associated with Opitz–Kaveggia (also known as FG) syndrome, an X-linked intellectual disability syndrome characterized by various neuronal pathologies as well as developmental abnormalities. It was shown that the Mediator complex is recruited to ncRNA-a target genes via its MED12 subunit, and regulates their expression (**Figure 1**). Mediator complexes containing missense mutant MED12 proteins corresponding to FG syndrome fail to associate with ncRNA-a (Lai et al., 2013), which might explain how these Mediator mutations can cause disease. Mediator is an evolutionarily conserved multiprotein complex that controls transcription by RNA Polymerase II and acts as a key regulatory interface for the integration of activating and repressing signals at promoters and distal enhancers (Carlsten et al., 2013). The interaction of Mediator and long ncRNAs is shown to be essential for recruitment of the complex to the promoter of target genes and for the H3S10 kinase activity of the Mediator complex, which is involved in its activating properties (Meyer et al., 2008). Loss of Mediator-ncRNA interaction might be a contributing factor for the neurological pathologies in FG patients. Taken together, ncRNA-a could have a prominent role in gene activation and development of FG syndrome due to its interaction with the Mediator complex.

Evf-2 is a long ncRNA that is transcribed from an ultraconserved enhancer in the Dlx-5/Dlx-6 locus that is important for proper brain development. Evf-2 regulates transcription of this unit by interacting with an activating as well as with a repressing transcription factor. *In vivo*, Evf-2 forms a complex with the homeodomain-containing protein Dlx-2 to activate transcription of the Dlx5/6 enhancer (Feng et al., 2006). It also recruits the repressive methylation binding protein MECP2 to the same locus. Furthermore, *Evf2* prevents CpG methylation at the Dlx-5/Dlx-6 locus, suggesting that methylated CpG sites are not responsible for MECP2 recruitment (Berghoff et al., 2013). The relationship between recruitment of MECP2 and prevention of CpG methylation by Evf2 is not yet clear and needs to be further explored. Nevertheless, loss of function of Evf2 leads to a decrease in the number of GABAergic interneurons in the early postnatal mouse hippocampus and dentate gyrus. Malfunctions in GABAergic interneurons have been implicated in a number of neurological disorders including autism, schizophrenia and epilepsy (Kohtz and Berghoff, 2010). Thus, it has been speculated that Evf2 plays a role in the development of the described disorders. Additionally, malfunctions in GABAergic interneurons have been observed in Rett syndrome, an X-linked neurological disorder affecting about 1 in 10,000 females. Mecp2 knock-out mice as a model for Rett syndrome show extensive dysregulation of long ncRNAs (Petazzi et al., 2013). As Evf-2 appears to control the development of GABAergic interneurons, it is the subject of many studies that will hopefully help to better understand the disorders with malfunctions in these neurons and point to novel therapeutics.

**FIGURE 1 | […] disorders.** Long ncRNAs can regulate every level of gene expression. Shown is a summary of selected long ncRNA functions discussed in […] space limitations not all long ncRNAs discussed in the review are included.

Long ncRNA HTTAS\_v1 regulates the expression of huntingtin (HTT) and is potentially involved in the development of HD (Chung et al., 2011). HTT is a protein that has a central role in the development of HD, which is believed to be partially caused by trinucleotide repeat expansions in the gene coding for HTT. HTTAS\_v1 is transcribed antisense to HTT and one of its exons includes the repeat. Overexpression of HTTAS\_v1 leads to a reduction in HTT transcript levels whereas depletion leads to an increase in HTT transcript levels. This effect is dependent on the repeat length. Furthermore, transcript levels of HTTAS\_v1 are reduced in the frontal cortex of patients who suffer from HD, indicating that HTTAS\_v1 might be an important long ncRNA contributing to the development of this neurological disorder (Chung et al., 2011).

Long ncRNA SCAANT1 is implicated in a type of polyglutamine disorder, spinocerebellar ataxia type 7 (SCA7). Spinocerebellar ataxias are a group of neurological disorders affecting the cerebellum. SCA7 is caused by CAG repeat expansion in the ataxin-7 gene. Long ncRNA SCAANT1 is transcribed antisense to ataxin-7. Lack of SCAANT1 leads to an increase in ataxin-7 transcription, causing development of SCA7 in mice. Furthermore, proximal CTCF binding is required for SCAANT1 transcription. Thus, SCAANT1 acts as a repressor of ataxin-7 transcription in a CTCF-dependent manner and is a potential player in development of SCA7 (Sopher et al., 2011).

Long ncRNA 116HG has been shown to play a role in the development of Prader–Willi syndrome (PWS) (Powell et al., 2013). This syndrome is a neurological disorder caused by paternal deletions of some genes on chromosome 15, including the gene that codes for the long ncRNA 116HG. Mice lacking this transcript show most of the symptoms characteristic of PWS. Long ncRNA 116HG forms a cloud in the nuclei of both mouse and human neurons (Powell et al., 2013). The cloud is formed at the site of transcription of this long ncRNA, and the ncRNA co-purifies with RBBP5, a component of the mixed lineage leukemia (MLL1) activating chromatin remodeling complex. Since loss of this long ncRNA led to up-regulation of many genes, Powell and colleagues suggested that 116HG might act as a decoy for RBBP5 and in this way prevent it from activating transcription of these genes (Powell et al., 2013). Additionally, metabolic analyses suggested that this long ncRNA regulates diurnal energy expenditure of the brain. In conclusion, long ncRNA 116HG regulates the expression of many genes, potentially through interacting with RBBP5, and might help to balance energy consumption.

#### **LONG ncRNAs REGULATE PROCESSING OF mRNAs**

ATXN8OS is a long ncRNA localized in GABAergic interneurons (Moseley et al., 2006) that plays a significant role in the development of SCA8, a type of ataxia caused by repeat expansion in ATXN8OS and ATXN8. The ATXN8OS ncRNA shares a bidirectional promoter with ATXN8, which encodes a protein known to contribute to the development of SCA8. Both ATXN8OS and ATXN8 in SCA8 undergo a gain of function due to (CTG)n repeat expansions (Moseley et al., 2006; Daughters et al., 2009). Long ncRNA transcripts with trinucleotide expansion co-localize in GABAergic neurons with the muscleblind-like splicing regulator 1 (MBNL1) and cause changes in its localization and splicing regulatory activity. As a consequence, GABA-A transporter 4 RNA undergoes alternative splicing, leading to loss of GABAergic inhibition, characteristic of SCA8 (Sopher et al., 2011).

A potential contributor to the development of Alzheimer's disease (AD) is long ncRNA 17A (Massone et al., 2011). This long ncRNA is transcribed by RNA polymerase III (Pol III) and is an antisense transcript of the human G-protein-coupled receptor 51 gene (GPR51; Massone et al., 2011). Depending on alternative splicing events, this gene codes for a functional GABA B2 receptor or non-functional GABA R2. In a human neuroblastoma cell line, stable expression of long ncRNA 17A induced the production of non-functional alternative splice isoforms of GABA R2, leading to the abolishment of GABA B2 intracellular signaling and secretion of amyloid-β peptide, characteristic of AD (Kim et al., 2013). Similarly, in the cerebral cortex of AD patients 17A is upregulated and the functional GABA B2 receptor could not be detected, suggesting that 17A and abolishment of GABA B2 signaling might play a role in the development of AD (Massone et al., 2011).

It has been shown that long ncRNA Gomafu (MIAT, RNCR2) plays a role in retinal cell development, brain development and post-mitotic neuronal function (Tsuiji et al., 2011; Barry et al., 2013). It localizes to a specific subset of neurons in adult mice, including the CA1 region of the hippocampus and large cortical neurons. It is localized in the compartment of the nucleus enriched in splicing factors (Tsuiji et al., 2011; Barry et al., 2013). This non-coding RNA has a distinctive feature: tandem repeats of UACUAAC, a conserved intron branch point that binds to the SF1 splicing factor (Tsuiji et al., 2011). Gomafu also directly binds two additional splicing factors, QKI and SRSF1. Dysregulation of this long ncRNA leads to alternative splicing patterns of DISC1 and ERBB4 (**Figure 1**). These alternative splicing patterns are similar to those observed in schizophrenia. Furthermore, Gomafu is dysregulated in the cortex of schizophrenic subjects. Collectively these results indicate that Gomafu may contribute to the development of schizophrenia (Barry et al., 2013). In addition, Gomafu is upregulated in the region of the brain involved in behavior and addiction of cocaine and heroin users, suggesting that Gomafu might also have a role in behavioral abnormalities (Albertson et al., 2006).

With the great diversity of alternative splice forms in the human genome many more examples of long ncRNAs regulating alternative splicing of both mRNAs and other RNA species should be expected to be identified and characterized soon.

#### **LONG ncRNAs REGULATE mRNA STABILITY**

Another long ncRNA demonstrated to play a role in AD is BACE1-AS. This long ncRNA is transcribed antisense to β-secretase-1 protein (BACE1) and regulates BACE1 mRNA stability (Faghihi et al., 2008). BACE1 is an enzyme that generates amyloid-β, which clusters in amyloid plaques that are a histological hallmark of AD. Recently, a study in a mouse AD model revealed that in this clustered form amyloid-β triggers the erosion of synaptic connections between neurons, which are crucial for proper functioning of the brain and AD pathophysiology (Kim et al., 2013). Upon stress stimuli BACE1-AS is upregulated and increases BACE1 mRNA stability by duplexing with BACE1 mRNA, leading to the generation of additional BACE1 enzyme and amyloid-β (**Figure 1**; Faghihi et al., 2008). The levels of BACE1-AS are elevated in subjects with AD and its *in vivo* knock-down in mouse brain led to the downregulation of BACE1 protein levels and a reduction in amyloid-β synthesis and aggregation in the brain, signifying the importance of BACE1-AS for the development of AD (Modarresi et al., 2011).

This is an example of a long ncRNA that is reported to act without a protein partner, and thus represents an alternative view on the mechanism of long ncRNAs. This could be a more general property of a class of long ncRNAs that should be studied more extensively in future research.

#### **LONG ncRNAs REGULATE TRANSLATION**

BC1 in rats and BC200 in humans are two long ncRNAs that are compartmentalized in synaptodendrites as ribonucleoprotein particles contributing to the regulation of local protein synthesis. BC200 seems to be linked to AD development; patients suffering from AD show higher expression of BC200 in the affected area of their brain (Brodmann's area 9), compared to age-matched healthy controls. Furthermore, the levels of BC200 increase with the severity of AD in this area of the brain. Additionally, in advanced stages of AD BC200 is mislocalized to the perikaryon (Mus et al., 2007). BC200 has been suggested to modulate gene expression at the translational level by interacting with different proteins: fragile X mental retardation protein (a translational repressor), poly(A)-binding protein 1 (a translation initiation regulator), heterogeneous nuclear ribonucleoprotein A2 (involved in transport of mRNAs in neurons), and synaptotagmin binding cytoplasmic RNA interacting protein (also involved in mRNA transport and potentially in local protein synthesis) (Muddashetty et al., 2002; Muslimov et al., 2006; Duning et al., 2008). Overexpression, mislocalization, and interaction with proteins involved in local protein synthesis and trafficking in neurons suggest that BC200 is an important player in the development of AD.

A long ncRNA transcribed antisense to the mouse ubiquitin carboxy-terminal hydrolase L1 (*Uchl1*) gene can induce the translation of Uchl1. Human UCHL1 is a neuron-restricted protein that acts as a de-ubiquitinating enzyme, ubiquitin ligase or monoubiquitin stabilizer, and its inactivation was reported in both AD and Parkinson's disease (PD) patients. Overexpression of antisense Uchl1 led to an increase in the abundance of UCHL1 protein without affecting its mRNA levels. Only a partial overlap between the long ncRNA and mRNA is required for this activity. Uchl1 mRNA localizes predominantly in the cytoplasm whereas the antisense ncRNA is enriched in the nucleus of dopaminergic neurons. When dopaminergic cells are treated with an mTOR inhibitor, antisense Uchl1 relocalizes to the cytoplasm and triggers the binding of Uchl1 mRNA to polysomes, and an increase in UCHL1 protein level is observed (**Figure 1**) (Carrieri et al., 2012). Since mTOR1 inhibition protects dopaminergic neurons from apoptosis in genetic and neurochemical models of PD, it is possible that the UCHL1-ncRNA-mTOR1 interplay might be important for the development of PD.

# **OTHER LONG ncRNAs THAT ARE POTENTIALLY INVOLVED IN NEUROLOGICAL DISORDERS**

Many other long ncRNAs are suspected to be involved in neurological disorders. Some of them are: TUG1, which is upregulated in HD patients; PINK1-AS, which is potentially involved in the development of PD; Sox2OT, whose gene carries the gene for an important regulator of neurogenesis in an alternatively spliced intron and which might serve as a biomarker for AD since it is expressed exclusively in early stages of AD; Ube3a-AS, which has been implicated in Angelman's syndrome (a genetic disorder that causes developmental disabilities and neurological problems) since it was suggested that it might regulate the expression of Ube3a, which is mutated or deleted in this syndrome; ASFMR1, FMR4, and FMR6, long ncRNAs that are downregulated in neurons of patients suffering from fragile X syndrome (a genetic disorder that causes a range of developmental problems including learning disabilities and cognitive impairment) but not in healthy individuals and thus might play a role in development of this disorder; and DISC2, which might contribute to the development of schizophrenia since it is disrupted by a translocation in this disorder (Pastori and Wahlestedt, 2012; Fenoglio et al., 2013; Ng et al., 2013; Pastori et al., 2014). Additionally, Lipovich et al. (2013) identified eight human brain-specific long ncRNAs whose expression changes in an age-related manner.

The long ncRNA NEAT1\_2 has been shown to contribute to the development of amyotrophic lateral sclerosis (ALS), a motor neuron disease (Nishimoto et al., 2013). Among the proteins that are mutated in ALS and contribute to its development are two DNA/RNA-binding proteins: TAR DNA-binding protein-43 (TDP-43) and fused in sarcoma/translocated in liposarcoma (FUS/TLS; Lagier-Tourenne and Cleveland, 2009). Recently it was shown that both TDP-43 and FUS/TLS are bound by and co-localize with NEAT1\_2. This long ncRNA is essential for the formation of nuclear bodies called paraspeckles and was shown to be upregulated in human motor neurons in the early stage of ALS (Nishimoto et al., 2013). Thus, NEAT1\_2 might contribute to the early stage of ALS through its interaction with TDP-43 and FUS/TLS.

Long ncRNAs could also be involved in the development of HD, in which the long ncRNAs HAR1F and HAR1R are affected (Pollard et al., 2006). Human accelerated regions (HARs) are fast-evolving non-coding sequences in the human brain, often found in the proximity of neurodevelopmental genes like GATA3. It was suggested that they might participate in uniquely human brain functions (Pollard et al., 2006). Of these, the most dramatic accelerated changes were found in the HAR1 locus, which codes for the two long ncRNAs HAR1F and HAR1R (Pollard et al., 2006). The expression of both can be repressed by the RE1-silencing transcription factor (REST), which in HD pathologically translocates to the nucleus and represses important neuronal genes in neuronal cells (Johnson et al., 2010). Future studies are needed to shed light on the mechanism of the HAR1 long ncRNAs and their precise contribution to the development of HD.

# **PERSPECTIVES**

The repertoire of diverse functions of long ncRNAs has contributed to an increased understanding of gene regulation. Long ncRNAs are involved in brain functions in both normal and diseased states, adding a further layer of complexity to brain function. The number of long ncRNAs has been proposed to correlate with the complexity of the organism (Taft et al., 2007), and it is tempting to speculate that brain-specific long ncRNAs might be evolutionary innovations that participate in human brain function.

The observation that a long ncRNA is differentially expressed in the healthy vs. the diseased brain, or that its expression correlates with a protein known to be involved in brain disorders, could have various causes that are unrelated to the disease or that reflect unspecific side-effects. One way to study the functional relevance of long ncRNAs during brain development and in neurological disorders under physiological conditions is to generate mouse models in which specific long ncRNA genes are inactivated. Analysis of these mutant strains could demonstrate the distinct *in vivo* roles of these long ncRNAs during embryonic development and disease. Further investigation of long ncRNA mechanisms will help to better understand how the brain functions and how disorders develop, with the potential to advance drug development based on the manipulation of long ncRNA expression.

#### **ACKNOWLEDGMENTS**

Work in the authors' laboratories is supported by the German Ministry for Research and Education through the Alexander von Humboldt Foundation (UAØ).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 November 2013; accepted: 15 February 2014; published online: 04 March 2014.*

*Citation: Vučićević D, Schrewe H and Ørom UA (2014) Molecular mechanisms of long ncRNAs in neurological disorders. Front. Genet. 5:48. doi: 10.3389/fgene.2014.00048 This article was submitted to Non-Coding RNA, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Vučićević, Schrewe and Ørom. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
