Edited by: Gully A. Burns, University of Southern California, USA
Reviewed by: Leon French, Centre for Addiction and Mental Health, Canada; Jessica A. Turner, Georgia State University, USA
*Correspondence: Erinç Gökdeniz; Arzucan Özgür; Reşit Canbeyli
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Identifying the relations among different regions of the brain is vital for a better understanding of how the brain functions. While a large number of studies have investigated the neuroanatomical and neurochemical connections among brain structures, their findings are scattered across many types of publications spanning many years. Text mining techniques make it possible to extract specific types of information from a large number of publications and to present a broader, if not necessarily exhaustive, picture. Using natural language processing techniques, the present paper aims to identify connectivity relations among brain regions in general and relations relevant to the paraventricular nucleus of the thalamus (PVT) in particular. We introduce a linguistically motivated approach based on patterns defined over the constituency and dependency parse trees of sentences. Besides detecting the presence of a relation between a pair of brain regions, the proposed method also identifies the directionality of the relation, which enables the creation and analysis of a directional brain region connectivity graph. The approach is evaluated on the manually annotated data sets of the WhiteText Project. In addition, as a case study, the method is applied to extract and analyze the connectivity graph of the PVT, an important brain region that is considered to influence many functions ranging from arousal, motivation, and drug-seeking behavior to attention. The resulting PVT connectivity graph suggests that the PVT may be a new target of research in mood assessment.
Many studies have been conducted to identify the relations among brain regions in various species, and this information is already available in the free text of the biomedical literature, albeit scattered across a large number of studies published over a sizable time period. Our aim is to propose a linguistically motivated approach that uses natural language processing (NLP) techniques to automatically extract connectivity relations among brain regions from publications. In the present study we target neuroanatomical connectivity among brain structures, to be extended in subsequent studies to neurochemical and functional relations. Once a map of connections has been generated, we will be in a position to automatically extract a brain region's relations and its effects on functions such as arousal, motivation, depression, and attention. As a case study, we focus on a specific brain region, the paraventricular nucleus of the thalamus (PVT), which belongs to the midline and intralaminar group of thalamic nuclei and has long been considered to have a non-specific effect on cortical arousal. Our main reason for choosing the PVT as a target of research is that recent studies have begun to attribute more specific functions to this group of thalamic nuclei because of their rich neuroanatomical and neurochemical projections (Hsu and Price,
Most previous studies on text mining in the biomedical domain have focused on extracting information about proteins and genes from scientific publications. Shared tasks such as BioCreative (Krallinger et al.,
Developing text mining methods in the neuroinformatics domain for identifying brain region entities and mining the neuroanatomical relations among them is a relatively new research topic, compared to the more widely studied areas of biomedical text mining focusing on genes, proteins, and diseases. Only a handful of studies have been conducted in neuroscience text mining so far, most of which adapt and extend the methods proposed in the well-studied area of protein-protein interaction extraction. In the context of the Neuroscholar system, which is one of the first studies tackling the use of advanced NLP methods for neuroscience data mining, Burns et al. (
In the present paper, we propose an NLP-based approach for neuroanatomical relation extraction from neuroscience publications. Unlike most previous neuroanatomical relation extraction studies, which relied on supervised machine learning methods originally proposed for protein-protein interaction extraction, we aim to develop a high-precision, knowledge-based, linguistically motivated approach designed specifically for the neuroscience domain. Different from the rule-based method proposed in Richardet et al. (
Two different corpora were used in this research. The first is the WhiteText corpus that contains 3205 abstracts manually annotated for brain region mentions and the interactions among them (French et al.,
The second corpus that we compiled and used was PVT specific. A list of 558 PVT-related publications was retrieved from PubMed on 14 August 2015 using the following query.
“
The PVT corpus is used in two different ways during the evaluation. The abstracts of 451 publications (for which the full text was not publicly available) and 107 publicly available full text publications constituted the first data set and provided the basis for our application on the PVT case study. Secondly, 14 of these full text papers were selected by neuroscience domain experts and fully annotated with brain region mentions and connectivity statements. These 14 papers were selected randomly from a set of publications, which were known to be PVT related and included review papers. As the annotation guideline, we applied three steps. First, all brain region entities mentioned in the articles were annotated without regard to connectivity. Then all types of relations including neuroanatomical, neurochemical, and functional connections were marked. Lastly, we identified and evaluated only the neuroanatomical relations at the sentence level, when the text specifically mentioned identifiable connectivity between brain structures. Table
These experiments confirm projections from Pa, Pt, and other midline nuclei to the amygdala. | Pa | → | Amygdala |
These experiments confirm projections from Pa, Pt, and other midline nuclei to the amygdala. | Pt | → | Amygdala |
In addition, we found that the aPVT was strongly innervated by the ventral subiculum but this projection largely did not involve the pPVT. | aPVT | ← | Ventral subiculum |
The paraventricular thalamus (PVT), a midline thalamic nucleus, receives dense innervations from lateral hypothalamic orexin neurons (Peyron et al., | PVT | ← | Hypothalamic orexin neurons |
We used a dictionary-based approach to identify the brain region entities that participate in neuroanatomical relations and normalized their mentions to canonical (unique) names. We constructed a dictionary of brain regions including their acronyms and synonyms, where an acronym is the abbreviation of the brain region entity and a synonym is a similar word or phrase used for the same brain region entity in text. A portion of the created dictionary with sample entries is shown in Table
Brain region | Acronym | Synonyms |
Parietal lobe | PL | Parietal cortex, parietal region, lobus parietalis |
Suprachiasmatic nucleus | SCN | Suprachiasmatic nuclei |
Cingulate gyrus | CgG | Cingular gyrus, cingulate area, cingulate region, Gyri cinguli, Gyrus cinguli |
Subthalamus | SbTh | Subthalamic region, ventral thalamus, thalamus ventralis |
Parabrachial nucleus | – | Parabrachial nuclei, parabrachial |
Paracentral nucleus | PC | Paracentral thalamic nucleus, nucleus paracentralis, paracentral nucleus of the thalamus, paracentral |
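The normalization step described above can be illustrated with a minimal sketch. It assumes the dictionary is held in memory as a mapping from each canonical name to its acronym and synonyms; the names `SAMPLE_ENTRIES`, `build_lookup`, and `normalize` are hypothetical, and the case-sensitive handling of acronyms is our assumption rather than a detail stated in the text.

```python
# Sample entries copied from the dictionary excerpt above:
# canonical name -> (acronym or None, list of synonyms).
SAMPLE_ENTRIES = {
    "Parietal lobe": ("PL", ["Parietal cortex", "parietal region", "lobus parietalis"]),
    "Suprachiasmatic nucleus": ("SCN", ["Suprachiasmatic nuclei"]),
    "Parabrachial nucleus": (None, ["Parabrachial nuclei", "parabrachial"]),
}


def build_lookup(entries):
    """Map every surface form (name, synonym, acronym) to its canonical name."""
    lookup = {}
    for canonical, (acronym, synonyms) in entries.items():
        lookup[canonical.lower()] = canonical
        for synonym in synonyms:
            lookup[synonym.lower()] = canonical
        if acronym:
            # Assumption: acronyms stay case-sensitive, because short strings
            # such as "PL" collide easily with ordinary lower-case words.
            lookup[acronym] = canonical
    return lookup


def normalize(mention, lookup):
    """Return the canonical name for a mention, or None if it is unknown."""
    return lookup.get(mention) or lookup.get(mention.lower())


LOOKUP = build_lookup(SAMPLE_ENTRIES)
```

With this lookup, both `normalize("SCN", LOOKUP)` and `normalize("suprachiasmatic nuclei", LOOKUP)` resolve to the same canonical entry, which is what prevents duplicate nodes in a connectivity graph.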
During the dictionary creation step, we initially gathered a dictionary of 892 brain regions and 562 acronyms from the NeuroNames ontology (Bowden and Dubach,
The resulting enriched dictionary contains 3044 brain region entities with their synonyms and acronyms. The brain region dictionary is publicly available as Supplementary Materials for future text mining studies.
We developed a linguistically motivated knowledge-based approach for neuroanatomical relation extraction. The workflow of the proposed approach is shown in Figure
The Stanford Core NLP tool (Manning et al.,
After preparing the data, the first phase of relation extraction was to scan the publications and extract the sentences that contained the predefined patterns. The extracted sentences at this step were the first candidates that might include neuroanatomical relations among brain regions.
We manually designed a set of patterns, which are strings of keywords that mostly reveal a relation, when there are two or more brain region entities in a sentence. Some of the patterns that were used in this research are “projection to, innervation of, receive input from, project from, efferent from.” For example, the following sentence contains a relation between the “dorsal midline thalamus” and “accumbens nucleus” brain regions signaled by the pattern “projection to.”
“
Neuroanatomical relations are in general signaled by pattern keywords. Since each keyword can take different prepositions (e.g., projection from, projection of, projection to) and different inflections (e.g., projects to, projecting to, projected to), regular expressions were used to cover the different usages of the patterns. As the regular expression for the pattern “project to” illustrates, the patterns are case insensitive and may contain additional words between their keywords (i.e., between “project” and “to”).
The list of designed patterns and the corresponding regular expressions are shown in Table
Pattern | Regular expression |
Innervate | (?i)innervat(e|es|ing){1} |
innervation of | (?i)innervation(s){0,1} of |
projection to | (?i)projection(s){0,1} to |
projection to from | (?i)projection(s){0,1} to ((\w+)\s){0,8} from |
projection of | (?i)projection(s){0,1} of |
projection target of | (?i)projection target(s){0,1} of |
projection from | (?i)projection(s){0,1} from |
projection from to | (?i)projection(s){0,1} from ((\w+)\s){0,8} to |
project to | (?i)project(ing|s|ed){0,1} ((\w)*){0,2}to |
project into | (?i)project(ing|s|ed){0,1} ((\w)*){0,2}into |
project from to | (?i)project(s|ed|ing){0,1} from ((\w+)\s){0,8} to |
receive input from | (?i)receiv(e|es|ing|ed){0,1} ((\w)*){0,4}input(s){0,1} ((\w)*){0,3}(from) |
receive fiber from | (?i)receiv(e|es|ing|ed){0,1} ((\w)*){0,4}fiber(s){0,1} ((\w)*){0,3}(from) |
receive innervation from | (?i)receiv(e|es|ing|ed){0,1} ((\w)*){0,4}innervation(s){0,1} ((\w)*){0,3}(from) |
receive [ae]fferent from | (?i)receiv(e|es|ing|ed){0,1} ((\w)*){0,4}[ae]fferent(s){0,1} ((\w)*){0,3}(from) |
traveling from to | (?i)travel(s|ling){0,1} ((\w)*){0,2}from ((\w)*){0,5}to |
exit through | (?i)exit(s|ing){0,1} ((\w)*)*through |
exit from | (?i)exit(s|ing){0,1} ((\w)*)*from |
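As an illustration of the candidate sentence extraction step, the sketch below compiles a small subset of the listed patterns and keeps only the sentences that contain at least one of them. The use of Python's `re` module and the function names `matching_patterns` and `candidate_sentences` are assumptions made for this example; the text does not state which regular expression engine was used.

```python
import re

# A subset of the patterns from the table above, written as Python raw strings.
PATTERNS = {
    "innervation of": r"(?i)innervation(s){0,1} of",
    "projection to": r"(?i)projection(s){0,1} to",
    "projection from": r"(?i)projection(s){0,1} from",
}
COMPILED = {name: re.compile(regex) for name, regex in PATTERNS.items()}


def matching_patterns(sentence):
    """Return the names of all patterns found anywhere in the sentence."""
    return [name for name, regex in COMPILED.items() if regex.search(sentence)]


def candidate_sentences(sentences):
    """Keep only sentences containing at least one pattern, with the hits."""
    candidates = []
    for sentence in sentences:
        hits = matching_patterns(sentence)
        if hits:
            candidates.append((sentence, hits))
    return candidates
```

Sentences returned by `candidate_sentences` are only candidates: whether they actually express a relation is decided by the syntactic analysis and dictionary matching steps.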
After generating the list of candidate sentences that might contain brain region relations, we performed a detailed syntactic analysis of each sentence. Each pattern has two dependents: an agent and a target. If both of these dependents included brain region entities, we considered that there was a relation between these entities. A sentence could yield more than one relation if its dependents included more than one brain region.
To identify whether a dependent is an agent or a target, we needed the directionality of the relation, and this information was derived directly from the patterns. For example, for patterns such as “receive input from, projection from, efferent from,” the text string that follows the pattern is likely to be the agent. On the other hand, for the “project into, innervate, terminate in” patterns, the text that follows the pattern is the target.
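The pattern-to-direction rule can be sketched as a simple lookup. The mapping below contains only the six patterns named in the paragraph above, and the names `FOLLOWING_ROLE` and `orient` are hypothetical; the full rule set used in the study is not reproduced here.

```python
# Role played by the text that FOLLOWS each pattern, taken from the examples
# in the paragraph above. This is a partial, illustrative mapping only.
FOLLOWING_ROLE = {
    "receive input from": "agent",
    "projection from": "agent",
    "efferent from": "agent",
    "project into": "target",
    "innervate": "target",
    "terminate in": "target",
}


def orient(pattern, following, other):
    """Return (agent, target) for the two dependents of a pattern.

    following: the dependent found right after the pattern keyword.
    other:     the second dependent found elsewhere in the sentence.
    """
    role = FOLLOWING_ROLE[pattern]
    if role == "agent":
        return following, other
    return other, following
```

For the example sentence "These experiments confirm projections from Pa, Pt, and other midline nuclei to the amygdala," the text following "projection from" is the agent, so `orient("projection from", "Pa", "Amygdala")` yields the pair (Pa, Amygdala).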
The Stanford Parser was used to syntactically parse the sentences and obtain their constituent elements (Klein and Manning,
In some sentences, the prepositional phrase (PP) following the detected NP modifies the NP and may contain candidate dependents for the relation. Therefore, if a detected NP was followed by a PP, which contains the keyword “including,” then it was also added as part of the candidate brain region text (dependent). An example sentence is provided below.
“
To find the first dependent (brain region candidate) that follows the pattern keyword, we used the constituency parser. On the other hand, for the second dependent, the text extraction phase was more complex. The second dependent can be found in different locations of the sentence. It can be at the beginning, right before the pattern, or close to the end of the sentence after the pattern. The dependency tree of a sentence can capture the long-distance relations among its words. We used the Stanford Dependency Parser (De Marneffe et al.,
A dependency was considered as
As the starting point for identifying the second dependent, when a pattern was found in a sentence, one of the dependency types below was searched for in the dependency tree. The pattern keyword in these types could be either the governor or the dependent. The descriptions of all the dependency types can be found in the Stanford Parser dependencies manual with sample sentences and dependency trees
Direct object (dobj)
Nominal subject (nsubj)
Passive nominal subject (nsubjpass)
Controlling subject (xsubj)
Noun Compound Modifier (nn)
Reduced non-finite verbal modifier (vmod)
We worked on these grammatical relations under three different groups according to the sentence structures as described in the following sub-sections.
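The starting-point search over the dependency tree can be sketched as a filter on dependency triples. The example assumes dependencies are available as (relation, governor, dependent) triples in word-position notation; the `Dependency` type, the function name, and the hand-written triples in the usage note are hypothetical and are not actual parser output.

```python
from typing import List, NamedTuple


class Dependency(NamedTuple):
    relation: str   # dependency type, e.g. "nsubj"
    governor: str   # e.g. "projects-2" (word-position notation)
    dependent: str  # e.g. "PVT-1"


# The dependency types listed above.
SEARCHED_TYPES = {"dobj", "nsubj", "nsubjpass", "xsubj", "nn", "vmod"}


def starting_dependencies(deps: List[Dependency], keyword: str) -> List[Dependency]:
    """Select dependencies of a searched type that involve the pattern
    keyword, whether it appears as the governor or as the dependent."""
    return [
        dep for dep in deps
        if dep.relation in SEARCHED_TYPES and keyword in (dep.governor, dep.dependent)
    ]


# Hand-written triples for the toy sentence "PVT projects to the amygdala".
TOY_DEPS = [
    Dependency("nsubj", "projects-2", "PVT-1"),
    Dependency("prep_to", "projects-2", "amygdala-5"),
    Dependency("det", "amygdala-5", "the-4"),
]
```

Calling `starting_dependencies(TOY_DEPS, "projects-2")` keeps only the nominal subject dependency, which is the entry point from which the group-specific rules described in the following sub-sections would continue.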
This rule set was applied for the pattern keywords that contain
If the pattern keyword was found as a dependent, then the Prepositional Modifier (
If the pattern keyword was found as a governor, all the relations that contained the dependent as a governor were selected. The dependents of these relations were retrieved as candidate brain regions. A portion of the dependency tree for a sample sentence, for which this rule applies, is presented in Figure
Sentence: “
Relations:
The candidate brain regions were returned in sorted order by their positions in the sentence:
the-12, external-13, lateral-14, parabrachial-15, subnucleus-16.
This specific case was an extension of the rule set described in the previous subsection for
Additionally, this rule was extended to consider the modifiers of the nominal subject. Each dependent retrieved from a
A portion of the dependency tree of the following sample sentence is shown in Figure
Sentence: “
Relations:
The candidate brain regions were returned in sorted order by their positions in the sentence:
anterograde-2, tracer-3, injection-4, dorsal-7, midline-8, thalamus-9.
This group of rules first found the
Sentence: “
Relations:
The candidate brain regions were returned in sorted order by their positions in the sentence:
orexin-29, neurons-30.
After the candidate generation phase (Section Candidate Generation Using NLP Techniques), the identified candidates were searched in the Brain Region Dictionary, in which a brain region (BR) was represented with its name, acronyms, and synonyms. A neuroanatomical relation was extracted, if at least two different brain region entities were matched in the dictionary, and one of them had the role of agent, whereas the other had the role of target. For the success of the dictionary matching process, we applied several steps as described below.
First we checked whether there was a full match between the agent/target and the dictionary entity (Step 1 in Table
1 | Text: “thalamus” | Thalamus | thalamus | Full Match |
2 | Text: | NAS | NAS | Full Match |
3 | Text: | dorsal midline thalamus | dorsal midline thalamus | Full Match for SCN |
3.a | Candidate Brain Regions: | No match for dorsal thalamus |
3.b | Candidate Brain Regions: | Partial match for thalamus |
After finding the brain regions in the dictionary, only the longest of the overlapping brain regions was selected. For example, if “thalamus” and “midline thalamus” are both matched in the dictionary for a candidate brain region, then we selected “midline thalamus” as the extracted brain region. “Thalamus” was not selected, since it overlaps with “midline thalamus,” which is a longer match.
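The longest-match rule can be illustrated with the following sketch, which assumes whitespace-tokenized candidate text and exact, case-insensitive token matching. The function names are hypothetical, and the greedy selection shown here is one plausible way to implement the rule, not necessarily the one used in the study.

```python
def find_matches(tokens, dictionary_names):
    """Return (start, end, name) for every dictionary name that occurs as a
    contiguous token span in tokens; end is exclusive."""
    lowered = [token.lower() for token in tokens]
    matches = []
    for name in dictionary_names:
        name_tokens = name.lower().split()
        width = len(name_tokens)
        for start in range(len(lowered) - width + 1):
            if lowered[start:start + width] == name_tokens:
                matches.append((start, start + width, name))
    return matches


def keep_longest(matches):
    """Greedily keep the longest spans, dropping any span that overlaps a
    span that has already been kept."""
    kept = []
    # Longest spans first; ties are broken by position in the sentence.
    for start, end, name in sorted(matches, key=lambda m: (m[0] - m[1], m[0])):
        if all(end <= k_start or start >= k_end for k_start, k_end, _ in kept):
            kept.append((start, end, name))
    return sorted(kept)
```

For the candidate text "the dorsal midline thalamus" and the dictionary names "thalamus" and "midline thalamus", both names are found, but only "midline thalamus" survives the overlap filter.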
As the last step of relation extraction, we determined whether each extracted brain region was a “full match” or a “partial match” when compared with the annotated data set. If an extracted brain region matched only a part of the brain region in the annotated sentence, it was considered a partial match. For example, assume that the application retrieved “thalamus” as a brain region and the manually annotated brain region text in the sentence was “dorsal midline thalamus.” In this case, the extracted brain region was considered a partial match and counted under the “Lenient” evaluation results in Section PVT case study, in which the extracted brain region may be equal to or part of the annotated brain region.
We used the precision, recall, and F-measure metrics to evaluate our relation extraction approach. The automatically extracted neuroanatomical connectivity relations (i.e., pairs of brain region entities) are compared with the manually annotated (gold standard) pairwise neuroanatomical relations. Precision is defined as the proportion of correctly retrieved neuroanatomical relations (i.e., true positives) to all the relations that the application retrieves (i.e., sum of true positives and false positives), whereas recall is defined as the proportion of correctly retrieved neuroanatomical relations (i.e., true positives) to all the neuroanatomical relations in the gold standard annotation (i.e., sum of true positives and false negatives). F-Measure is the harmonic mean of the precision and recall values.
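The three metrics defined above can be computed from the true positive, false positive, and false negative counts, as in the minimal sketch below. The function name is hypothetical, and returning zero when a denominator is zero is a convention chosen for this example.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F-measure) as fractions in [0, 1]."""
    retrieved = true_positives + false_positives   # everything the system extracted
    gold = true_positives + false_negatives        # everything in the gold standard
    precision = true_positives / retrieved if retrieved else 0.0
    recall = true_positives / gold if gold else 0.0
    total = precision + recall
    # Harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

For instance, with the counts reported for the WhiteText test set (360 extracted relations, 277 of them correct, 1898 gold standard relations), `precision_recall_f1(277, 83, 1898 - 277)` gives a precision of about 76.94% and a recall of about 14.59%.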
We used the WhiteText corpus in order to compare our results with the previous studies (French et al.,
The WhiteText corpus has been released as two data sets at different times. In French et al. (
For the PVT case study, we have two different evaluation sets.
In the first evaluation, connectivity relation extraction results are given for 14 full text publications, which are manually annotated by domain experts. Rather than using the manually annotated gold standard brain region mentions as done during the evaluation over the WhiteText corpus, we used our dictionary for identifying the brain regions that participate in the connectivity relations.
In the second evaluation, to provide automated extraction results on the PVT corpus of 558 publications, we ran our application on the abstracts of the 451 publications whose full text is not publicly available and on the 107 publicly available full-text publications. We then used the output of this evaluation to generate the connectivity graph.
The accuracy of our approach for finding the directionality of the connectivity relations was computed by considering the true positive relations extracted by our system. Accuracy was computed by calculating the proportion of true positive relations with correctly identified directionality to all true positive relations retrieved by our system.
Table
Method | Precision (%) | Recall (%) | F-measure (%) |
Linguistically Motivated Approach—2nd dataset (1828 abstracts) | 76.94 | 14.59 | 24.53 |
(French et al., |
|||
Shallow Linguistic Kernel (SLK) | 51.00 | 67.00 | 57.92 |
Linguistically Motivated Approach—1st dataset (1377 abstracts) | 75.60 | 17.31 | 28.17 |
(French et al., |
|||
Shallow Linguistic Kernel (SLK) | 50.30 | 71.10 | 58.30 |
(Richardet et al., |
|||
Kernel (SLK) | 60.00 | 68.00 | 64.00 |
Ruta Rules | 72.00 | 12.00 | 21.00 |
Filter–Kernel | 66.00 | 19.00 | 29.00 |
Kernel–Rules | 81.00 | 10.00 | 18.00 |
Filter–Kernel–Rules | 82.00 | 7.00 | 12.00 |
The WhiteText corpus test set (2nd data set) contained 1898 true interactions, which served as the gold standard. Our application extracted 360 relations, of which 277 were true positives and 83 were false positives. Overall, the precision on the test set was 76.94% with a recall of 14.59%. The only previous study that used the same data set is the study of French et al. (
Additionally, Table
Compared with semi-automated or fully automated machine learning techniques, our knowledge-based approach produced more precise results at the cost of missed relations. Therefore, comparing our approach with the Kernel results of French et al. (
A motivating factor for the present study was a bottom-up view of depression proposed by one of us (Canbeyli,
For the evaluation of the PVT case study, we used the 14 manually annotated full texts, which were PVT specific publications.
As the output of the Relation Extraction phase (Section Neuroanatomical Relation Extraction), we generated candidate relation pairs consisting of agents and targets. The brain region dictionary we created (Section Creation of a Brain Region Dictionary) was used to validate the existence of brain region entities in the texts of the agents and targets. Therefore, the comprehensiveness of the dictionary had a strong impact on the evaluation results.
The manually annotated data set of PVT from the 14 publications used in the present study contained 322 relations: 97 of these relations did not have any of our predefined patterns (Table
Using NLP techniques, our application extracted 161 relation candidates from the 225 “pattern-including” relations. When we compared each relation candidate with the annotated data set, 107 were full matches, 15 were partial matches, and 20 were incorrect predictions. We evaluated the remaining 19 relation candidates in two different ways. These 19 candidates contained both an agent and a target that matched brain region entities in the brain region dictionary, but the corresponding relations were absent from the manual annotation. Since the extracted relations involved correct brain regions, we also counted these candidates as full or partial matches. We report these results as NLP-based results in Table
Evaluation | Precision (%) | Recall (%) | F-measure (%) |
Strict (Full Match) | 66.43 | 33.23 | 44.30 |
Lenient (Full Match + Partial Match) | 75.78 | 37.89 | 50.52 |
NLP-based | 87.58 | 43.79 | 58.39 |
The following sentence contained three of these 19 relations. Our application retrieved the relation candidates “PVT”-“PFC,” “PVT”-“NAS,” and “PVT”-“AMG” and they are likely to refer to a relation. However, these relations were considered either too generic or ambiguous, and therefore, have not been manually annotated in the data set.
“
Actually, this is one of the core points that we would like to highlight with automated relation extraction. Using different techniques, we can automatically extract brain region relations, but this is still an input for further evaluation and domain knowledge is crucial to turn this input to valuable information. We consider this NLP-based evaluation as also valuable and share it in addition to the Strict and Lenient results. Table
Comparing the results on the WhiteText and PVT corpora led us to the following conclusions. First, recall was higher on the PVT corpus, mainly because a larger percentage of its sentences could be matched with the patterns. For the WhiteText corpus, the maximum recall that we could reach was 57.7%, whereas for the annotated PVT corpus it was 69.88%. Thus, the PVT corpus contained more relations aligned with the patterns.
Secondly, the precision values of the patterns were similar across the two data sets. Although the patterns were tuned based on the WhiteText corpus, they could effectively be applied to other data sets in this domain with precision levels of at least 70–75%.
Lastly, the total number of document-level unique relations was computed by eliminating duplicate relations occurring in the same document, so that a pair of brain regions was counted only once per document. Of the 322 relations in the 14 annotated papers, 237 were unique at the document level. Only 7 of these 237 relations appeared in the abstracts of the publications, which means that only 3% of the relations were available in the abstracts within this corpus. Using full-text publications instead of abstracts therefore makes considerably more relations available for extraction. A strength of our system is that it achieved a comparable level of success on full-text documents and on abstracts.
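The document-level deduplication described above can be sketched as follows. The sketch assumes each extracted relation is represented as a (document ID, agent, target) tuple and treats the pair as ordered, so that relations in opposite directions stay distinct; both the representation and the function name are assumptions made for illustration.

```python
def unique_document_relations(relations):
    """Collapse repeated (agent, target) pairs within the same document.

    relations: iterable of (document_id, agent, target) tuples.
    Returns the unique tuples in first-seen order.
    """
    seen = set()
    unique = []
    for relation in relations:
        if relation not in seen:
            seen.add(relation)
            unique.append(relation)
    return unique
```

The same pair extracted from two different documents is kept twice, since uniqueness is enforced per document rather than across the corpus.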
The PubMed IDs of the 14 annotated PVT papers and the annotated sentences are shared as Supplementary Materials. Considering that some of the publications are not publicly available, the publications are not fully provided.
We ran our application on the data set of 558 publications (451 abstracts and 107 full-text publications) and extracted 811 relations involving 343 different brain regions. Further analysis of the relations showed that the PVT was the target of 75 relations and the source of 92 relations. Table
Brain region | Relations as source | Relations as target | Total |
PVT | 92 | 75 | 167 |
Locus coeruleus | 39 | 23 | 62 |
Nucleus accumbens | 8 | 47 | 55 |
Suprachiasmatic nucleus | 30 | 18 | 48 |
Amygdala | 10 | 29 | 39 |
PVT | Nucleus accumbens | 23 |
PVT | Prefrontal cortex | 13 |
Suprachiasmatic nucleus | PVT | 10 |
PVT | Amygdala | 8 |
PVT | Medial prefrontal cortex | 6 |
In Figure
While creating the graph, the agents and the targets were matched with the unique entities in the dictionary.
Various versions of the connectivity graph (with the arrows showing the direction or with different network analysis styles) are given as Supplementary Materials.
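As a rough illustration of how a directional connectivity graph and its summary counts can be derived from extracted relations, the sketch below counts directed edges and how often each region appears as a source or a target. It assumes relations are (agent, target) pairs already normalized to canonical names; the function name and the toy data in the usage note are hypothetical and are not results from the study.

```python
from collections import Counter


def summarize_graph(relations):
    """Count directed edges and per-region source/target frequencies.

    relations: iterable of (agent, target) pairs with canonical names.
    Returns (edge_counts, source_counts, target_counts).
    """
    relations = list(relations)  # allow generators to be traversed repeatedly
    edge_counts = Counter(relations)
    source_counts = Counter(agent for agent, _ in relations)
    target_counts = Counter(target for _, target in relations)
    return edge_counts, source_counts, target_counts


# Toy input for illustration only.
TOY_RELATIONS = [
    ("PVT", "Nucleus accumbens"),
    ("PVT", "Nucleus accumbens"),
    ("Suprachiasmatic nucleus", "PVT"),
]
```

With the toy input, the edge from the PVT to the nucleus accumbens has weight 2, and the PVT appears twice as a source and once as a target. Edge weights of this kind can then be passed to any graph visualization tool.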
One contribution of our research relative to existing work is that it determines the direction of the relations.
As mentioned in the Relation Decision section, we defined a rule for each pattern that determines the direction of the relation. When testing on the WhiteText and PVT corpora, we output the directionality information in addition to the agents and targets. The accuracy of the directionality prediction approach is shown in Table
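Corpus | Gold standard relations | True positives | Correct directions | Accuracy (%) |
WhiteText Corpus (2nd data set) | 1898 | 277 | 277 | 100.00 |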
PVT Corpus (14 annotated publications) | 322 | 122 | 119 | 97.54 |
From the second dataset of the WhiteText corpus, we obtained 277 true positive relations out of 360 extracted relations. For these 277 correctly extracted relations, the accuracy of the directions was 100%, which was also validated by one of the authors (RC).
For the annotated PVT corpus, 122 out of 322 relations were retrieved. When we evaluated the directionality of these relations, 119 out of 122 were predicted correctly which corresponds to an accuracy of 97.54%. As shown in Table
A major aim of the present study was to provide a new approach in text mining to chart out neuroanatomical connections of a specific brain structure. We have presented a linguistically motivated approach to extract neuroanatomical connectivity relations from scientific publications by using NLP techniques. Our approach leverages the constituency and dependency parse trees of sentences and defines the agents and the targets by also providing the directionality of the relations.
The strength of our approach comes from the patterns and rules that are defined over the parse trees of the sentences. Patterns were selected mainly according to how reliably each one signals a relation. We use the patterns to identify candidate sentences for further processing and relation extraction. A limitation of our approach is that relations can only be extracted from sentences that match one of our predefined patterns. On the other hand, whenever a pattern is found in a sentence, a relation extracted after further processing is very likely to be correct. We therefore expected high precision and low recall from the present study. We set a target precision of at least 60% for each pattern, and as a consequence, the maximum recall that our application could reach was approximately 70% (on the PVT data set). Researchers may choose a different trade-off to suit their own evaluations. In this study, our goal was to design a high-precision system so that few false positive relations, which could lead to incorrect interpretations, are included in the brain region connectivity graph.
Most previous studies on connectivity extraction among brain regions from text used machine learning based methods originally proposed for extracting protein-protein interactions. French et al. (
Additionally, by using the predefined patterns to find the agent and the target, we were able to provide a feature missing from prior work on relation extraction: the directionality of the relation. Based on the grammatical structure of the sentences and the pattern usages, we identified the direction of the relations between brain regions, and the overall accuracy of the extracted directions was more than 97%.
In the PVT case study, we used a dictionary-based approach to extract the brain regions from publications. Brain region entities are not named in a unique and standardized way in the neuroscience literature: each brain region has several different names, and the corresponding abbreviations may vary. Using brain region mentions directly, without normalizing them to canonical brain region names, would result in redundant entities (nodes) referring to the same brain region in the connectivity graph. By using a dictionary, we accepted that some brain regions in the texts might be missed, but in return the connectivity graph benefits from canonical names for the brain regions.
A decision point for us was whether to use the existing ontologies or to create our own dictionary. Before constructing the dictionary, we investigated the existing brain ontologies. Brain Architecture Management System (BAMS) (Bota and Swanson,
During the relation extraction phase, we faced several difficulties. One was related to the WhiteText corpus. This manually annotated corpus was considered as gold standard for the first evaluation phase of our research. Since this corpus is enhanced with the abbreviation expansion algorithm, we also needed to use the same approach. Schwartz and Hearst Abbreviation Expansion Algorithm (Schwartz and Hearst,
An additional aim of the present study was to provide by means of a connectivity graph an overview of the neuroanatomical connectivity relations of PVT that may suggest potentially new functions for the midline thalamic structure. As demonstrated in Figure
In the light of the vast connectivity uncovered by our present study, we hope that there may be more interest in delineating neuroanatomical subcircuits involving the PVT as potential substrates for various functions. Toward that goal, we hereby propose in outline form a PVT circuitry that we hope to elucidate in a future article that may be underlying a mood modulatory mechanism. Briefly, our analysis of PVT connections has uncovered a strong connectivity between the PVT and several structures known to be involved in mood and depression in both humans and animals. As demonstrated in Tables
In our study, we have focused on automated connectivity relation extraction of brain regions in the neuroscience domain. Hence, our defined patterns and rules might not be generic enough to be used in other domains such as Protein-Protein and Gene-Disease interactions. This is considered as a possible future work. Similarly to previous studies on brain region connectivity extraction (French et al.,
EG carried out the computational studies, performed the implementations of the algorithms, and participated in the design of the study, analysis of the results, and drafting of the manuscript. AÖ participated in the design of the study, analysis of the results, and drafting of the manuscript. RC participated in the design of the study, analysis of the results, annotation of the data, and drafting of the manuscript. All authors read and approved the manuscript.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The authors thank Leon French and his co-workers for creating and sharing the manually annotated WhiteText corpus. We would also like to thank Dr. Pınar Öz for her support during the pattern definition/selection phases and sharing her knowledge on PVT. This work has been partially supported by Marie Curie FP7-Reintegration-Grants within the 7th European Community Framework Programme.
The Supplementary Material for this article can be found online at:
2 Supplementary Materials can be found at:
3 Stanford Parser Dependencies Manual can be found at:
4 The manually annotated directions for the true positive relations extracted from the WhiteText and PVT corpora are available as supplementary files as well as in Github repository: