AUTHOR=Carmona Rosario , Zafra Adoración , Seoane Pedro , Castro Antonio J. , Guerrero-Fernández Darío , Castillo-Castillo Trinidad , Medina-García Ana , Cánovas Francisco M. , Aldana-Montes José F. , Navas-Delgado Ismael , Alché Juan de Dios , Claros M. Gonzalo TITLE=ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome JOURNAL=Frontiers in Plant Science VOLUME=6 YEAR=2015 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2015.00625 DOI=10.3389/fpls.2015.00625 ISSN=1664-462X ABSTRACT=

Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads and 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from the TAIR and RefSeq databases to enable Linked Data integration. The result is a reproductive transcriptome comprising 72,846 contigs with an average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified, and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access to these results, which can also be downloaded. Retrieval mechanisms for sequences and transcript annotations are provided, and annotated enzymes can be located graphically in KEGG pathways. Finally, ReprOlive includes a semantic conceptualisation by means of the Resource Description Framework (RDF), allowing a Linked Data search that extracts the most up-to-date information related to enzymes, interactions, allergens, structures, and reactive oxygen species.