Text mining of Journal of Comparative Neurology

Supplement to Using text mining to link journal articles to neuroanatomical databases.
French L, Pavlidis P pubmed
The journal of comparative neurology 2012 Jun 1;520(8):1772-83

Abstract

The electronic linking of neuroscience information, including data embedded in the primary literature, would permit powerful queries and analyses driven by structured databases. This task would be facilitated by automated procedures which can identify biological concepts in journals. Here we apply an approach for automatically mapping formal identifiers of neuroanatomical regions to text found in journal abstracts, and apply it to a large body of abstracts from the Journal of Comparative Neurology. The analyses yield over one hundred thousand brain region mentions which we map to 8,225 brain region concepts in multiple organisms. Based on the analysis of a manually annotated corpus, we estimate mentions are mapped at 95% precision and 63% recall. Our results provide insights into the patterns of publication on brain regions and species of study in the Journal, but also point to important challenges in the standardization of neuroanatomical nomenclatures. We find that many terms in the formal terminologies never appear in a JCN abstract, while conversely, many terms authors use are not reflected in the terminologies. To improve the terminologies we deposited 136 unrecognized brain regions into the Neuroscience Lexicon (NeuroLex). The training corpus, lexicons, normalizations, evaluations and annotated journal abstracts are freely available at http://www.chibi.ubc.ca/WhiteText/.

 

Supplemental Materials