Information-Theoretic Evaluation for Computational Biomedical Ontologies


The first term, cMP, represents navigation across the edges of the graph, and the second term, (1 - c)v, represents the probability of jumping to any vertex. The damping factor c weights the combination of these terms; we used the default damping factor of 0. In the personalized PageRank (PPR) algorithm, probability mass is concentrated on a set of entries in the vector v, biasing the jumps towards certain vertices [48]. The relatedness of a pair of concepts is defined as the cosine of the angle between their PPR vectors.
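The following is a minimal sketch of this relatedness computation, not the authors' implementation. It runs a power iteration for P = cMP + (1 - c)v on a small undirected graph and compares PPR vectors by cosine. The toy graph, the function names, and the damping value 0.85 (the conventional PageRank default, assumed here because the value is truncated in the text above) are all illustrative assumptions.

```python
import math

def personalized_pagerank(adj, seeds, c=0.85, iters=100):
    """Power iteration for P = c*M*P + (1 - c)*v on an undirected graph.

    adj   : dict mapping each vertex to a list of its neighbours
    seeds : set of vertices that receive the jump probability mass in v
    c     : damping factor (0.85 is an assumed, conventional value)
    """
    nodes = sorted(adj)
    # v concentrates the jump probability on the seed vertices.
    v = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(v)
    for _ in range(iters):
        nxt = {n: (1.0 - c) * v[n] for n in nodes}
        for n in nodes:
            # Each vertex spreads its mass evenly over its edges.
            share = c * p[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        p = nxt
    return [p[n] for n in nodes]

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy concept graph (every vertex has at least one neighbour).
graph = {
    "disease": ["heart_disease", "lung_disease"],
    "heart_disease": ["disease", "myocardial_infarction", "angina"],
    "lung_disease": ["disease", "pneumonia"],
    "myocardial_infarction": ["heart_disease"],
    "angina": ["heart_disease"],
    "pneumonia": ["lung_disease"],
}

ppr_mi = personalized_pagerank(graph, {"myocardial_infarction"})
ppr_angina = personalized_pagerank(graph, {"angina"})
ppr_pneumonia = personalized_pagerank(graph, {"pneumonia"})

print("MI vs angina:   ", cosine(ppr_mi, ppr_angina))
print("MI vs pneumonia:", cosine(ppr_mi, ppr_pneumonia))
```

On this toy graph the two sibling concepts receive a higher relatedness score than the more distant pair. Each concept needs its own PPR run, which is why the cost grows quickly with graph size.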

For each knowledge source, we constructed a taxonomy for use with semantic similarity measures; computed the depth and intrinsic IC of each concept; and implemented semantic similarity measures. To evaluate the PPR, we constructed two types of undirected concept graphs: graphs that used only taxonomic relationships (PPR-taxonomy), and graphs that used all relationships from the respective knowledge sources (PPR-all).
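As a rough illustration of the depth and intrinsic IC step, the sketch below processes a toy taxonomy given as a child-to-parents map. It assumes one common intrinsic IC formulation (Seco et al.: IC(c) = 1 - log(hypo(c) + 1) / log(N), where hypo(c) is the number of descendants of c and N is the number of concepts); this section does not state which formulation was used, so treat the formula, the concept names, and the function names as assumptions.

```python
import math

def build_children(parents):
    """Invert a child -> parents map into a parent -> children map."""
    children = {c: set() for c in parents}
    for child, ps in parents.items():
        for p in ps:
            children[p].add(child)
    return children

def descendants(children, concept):
    """All concepts reachable from `concept` by following child links."""
    seen, stack = set(), list(children[concept])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen

def intrinsic_ic(parents):
    """Seco-style intrinsic IC: leaves get 1.0, the root gets 0.0."""
    children = build_children(parents)
    n = len(parents)
    return {
        c: 1.0 - math.log(len(descendants(children, c)) + 1) / math.log(n)
        for c in parents
    }

def depths(parents):
    """Depth = number of edges on the shortest path to a root (root = 0)."""
    memo = {}

    def depth(c):
        if c not in memo:
            ps = parents[c]
            memo[c] = 0 if not ps else 1 + min(depth(p) for p in ps)
        return memo[c]

    return {c: depth(c) for c in parents}

# Hypothetical toy taxonomy: each concept lists its direct parents.
taxonomy = {
    "disease": [],
    "heart_disease": ["disease"],
    "lung_disease": ["disease"],
    "myocardial_infarction": ["heart_disease"],
    "angina": ["heart_disease"],
    "pneumonia": ["lung_disease"],
}

ic = intrinsic_ic(taxonomy)
depth = depths(taxonomy)
for concept in taxonomy:
    print(concept, depth[concept], round(ic[concept], 3))
```

No corpus is involved: the IC of a concept depends only on how many concepts it subsumes, so more general concepts receive lower values.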

One major advantage of the PPR method is its ability to leverage non-taxonomic relationships to compute concept relatedness. Evaluating the PPR on both types of concept graphs allows us to quantify the contribution of non-taxonomic relationships to the computation of concept relatedness. Refer to Additional file 1: Appendix 1 for a detailed description of concept graph construction. We evaluated measures on the Pedersen, Mayo, and UMN similarity and relatedness benchmarks [13, 19, 41].

The baseline contains frequencies from over 20 million citations [49]. Several concepts from the Pedersen benchmark are missing from the MeSH vocabulary. Refer to Additional file 1: Appendix 1 for a detailed listing of the term-to-concept mappings used for these benchmarks. The path finding measures LCH and Path are both monotonically decreasing functions of the shortest path between concepts; they therefore produce the same relative ranks and are identical for the purposes of evaluating their correlation. We applied the Fisher r-to-z transformation to test the significance of differences in correlation between different measures and concept graphs, and to compare our results to previously published results obtained using distributional measures.

The null hypothesis is that there is no difference in correlation between different measures. The statistical power (the probability of rejecting the null hypothesis when it is in fact false) is higher on the larger UMN benchmarks.
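A minimal sketch of such a test follows, assuming the textbook form of the Fisher r-to-z comparison for two independent correlations. The text does not specify the exact variant used (for example, an adjustment for correlations measured on the same pairs), and the correlations and benchmark sizes below are hypothetical, not values from the study.

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two correlations.

    Assumes the two correlations come from independent samples of sizes
    n1 and n2. Returns (z statistic, two-sided p-value).
    """
    z1 = math.atanh(r1)  # Fisher r-to-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical numbers: the same gap in correlation on a small and a
# large benchmark.
z_small, p_small = compare_correlations(0.45, 30, 0.30, 30)
z_large, p_large = compare_correlations(0.45, 500, 0.30, 500)

print("small benchmark: z = %.2f, p = %.3f" % (z_small, p_small))
print("large benchmark: z = %.2f, p = %.3f" % (z_large, p_large))
```

The same difference in correlation falls short of significance at the 0.05 level with 30 pairs but reaches it with 500, which illustrates why statistical power is higher on larger benchmarks.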


We used R version 2. We released all code and scripts required to reproduce our results as open source. The taxonomies include only those concepts that partake in taxonomic relationships. We also include the results of previous evaluations, where they are comparable. Refer to Additional file 2: Appendix 2 for a listing of the significance of differences between all pairs of measures, and between concept graphs. In general, intrinsic IC based measures outperformed path finding measures. On the larger Mayo and UMN reference standards, intrinsic IC based measures significantly outperformed path finding measures.

The intrinsic IC based LCH measure achieved the best performance, but the improvement relative to other intrinsic IC based measures was usually not statistically significant (see Additional file 2: Appendix 2 for p-values). In general, the PPR measure computed with all relationships (PPR-all) achieved poor performance, and was significantly outperformed by path and intrinsic IC based measures.


The PPR measure computed with taxonomic relationships (PPR-taxonomy) significantly outperformed path-based measures on the Mayo reference standard, and significantly outperformed intrinsic IC based measures on the Pedersen coders reference standard. On larger reference standards, this relationship was reversed: intrinsic IC based measures significantly outperformed PPR-taxonomy based measures on the Mayo and UMN datasets.

Increasing the concept graph size improved the performance of both path finding and intrinsic IC based measures. Pakhomov et al. The context vector utilized a co-occurrence matrix derived from , EMR inpatient reports and had correlations of 0. Liu et al. There was no significant difference in correlation between the intrinsic and corpus IC based measures on any reference standard. The system we developed is open source and written in the platform independent Java language.

The system allows the declarative definition of concept graphs or taxonomies and stores these graphs in a binary format.

For taxonomies, it computes the depth and intrinsic information content of each node. We provide a publicly available web service to compute semantic similarity measures. Notable aspects of our system include the ability to compute both intrinsic IC and corpus IC based measures, and the ability to compute similarity measures from a wide range of biomedical knowledge sources. The time and computational resources needed to generate concept graphs vary with graph size. Computing the intrinsic information content is the most computationally and memory intensive step in preparing a taxonomy.

Once created, the concept graph can be loaded and used to compute similarity measures. The time and resources needed to load the concept graph depend on its size; loading the taxonomy for the entire UMLS required 30 seconds and 1 GB of memory. All computations were performed on a bit Ubuntu 10 Linux workstation with dual quad-core 3.

Computing relatedness via the personalized PageRank algorithm is computationally intensive, and its cost increases with concept graph size.


Intrinsic IC based measures in general outperformed path based measures; in some cases, these differences were significant. Intrinsic IC and path based measures compute similarity as a function of the distance between concepts in a taxonomy. IC based measures achieve higher performance than path based measures by weighting taxonomic links based on concept specificity. The personalized PageRank algorithm achieved state of the art performance on general language semantic similarity tasks, but did not outperform simpler knowledge based methods on these benchmarks.
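To make the contrast between path based and IC based measures concrete, the sketch below compares a path measure (1 / number of nodes on the shortest path) with Lin's IC based measure (2 * IC(lcs) / (IC(c1) + IC(c2))) on a toy single-parent taxonomy. The taxonomy, the IC values, and the function names are illustrative assumptions; the measures evaluated in the study may differ in detail.

```python
def ancestors(parent, concept):
    """Return the concept followed by its ancestors, nearest first."""
    chain = [concept]
    while parent[concept] is not None:
        concept = parent[concept]
        chain.append(concept)
    return chain

def lcs_and_edges(parent, a, b):
    """Least common subsumer of a and b, and the edge count between them."""
    chain_a, chain_b = ancestors(parent, a), ancestors(parent, b)
    for i, node in enumerate(chain_a):
        if node in chain_b:
            return node, i + chain_b.index(node)
    raise ValueError("concepts do not share a root")

def path_similarity(parent, a, b):
    """Path measure: inverse of the number of nodes on the shortest path."""
    _, edges = lcs_and_edges(parent, a, b)
    return 1.0 / (edges + 1)

def lin_similarity(parent, ic, a, b):
    """Lin measure: shared information relative to total information."""
    lcs, _ = lcs_and_edges(parent, a, b)
    total = ic[a] + ic[b]
    return 2.0 * ic[lcs] / total if total > 0 else 0.0

# Hypothetical single-parent taxonomy and illustrative IC values.
parent = {
    "disease": None,
    "heart_disease": "disease",
    "lung_disease": "disease",
    "myocardial_infarction": "heart_disease",
    "angina": "heart_disease",
    "pneumonia": "lung_disease",
}
ic = {
    "disease": 0.0,
    "heart_disease": 0.39,
    "lung_disease": 0.61,
    "myocardial_infarction": 1.0,
    "angina": 1.0,
    "pneumonia": 1.0,
}

general_pair = ("heart_disease", "lung_disease")
specific_pair = ("myocardial_infarction", "angina")

print("path:", path_similarity(parent, *general_pair),
      path_similarity(parent, *specific_pair))
print("lin: ", lin_similarity(parent, ic, *general_pair),
      lin_similarity(parent, ic, *specific_pair))
```

Both pairs are two edges apart, so the path measure scores them identically. The Lin measure separates them, because the specific pair shares an informative subsumer while the general pair shares only the root.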

Furthermore, PPR is orders of magnitude more computationally intensive than simpler semantic similarity measures, and may be impractical for some applications. In contrast to other knowledge based similarity measures, PPR can utilize non-taxonomic relationships to compute concept relatedness. The UMLS contains many types of non-taxonomic relationships. Our results suggest that knowledge based measures can outperform distributional measures. Knowledge based measures are also more practical than distributional measures, as they do not require a corpus from which word co-occurrence or concept frequencies must be estimated.


Knowledge based measures significantly and meaningfully outperformed distributional vector based measures on the larger UMN benchmarks. One limitation of our study is that we compared knowledge based methods to previously published distributional vector based measures: we cannot exclude the possibility that differences in the UMLS version used may have biased results. However, our reasons for not implementing context vector measures reflect exactly the limitations of these measures: a large clinical corpus is not available to us; it is not clear if publicly available corpora such as MEDLINE abstracts are suitable for this purpose; and the processing of large corpora is computationally intensive.


Distributional vector based measures in the biomedical domain may suffer from imbalance and sparseness due to limited corpus sizes [23, 33]. Use of a larger clinical corpus may rectify these issues, and improve the performance of vector based measures relative to knowledge based measures.

Even if performance could be improved with a large corpus, it is not clear what practical consequences this would have, as many applications of semantic similarity measures lack access to large clinical corpora. Our evaluation showed no significant differences between corpus IC and intrinsic IC based measures. These results suggest that, given the ease with which IC can be estimated from a taxonomy, intrinsic IC based measures are a practical alternative to corpus IC based measures.

One limitation of our study is that we only evaluated corpus IC based measures with MeSH using concept frequencies estimated from a biomedical corpus. However, for many applications, computing corpus IC may not be practical: in addition to the lack of availability of large clinical corpora, the estimation of concept frequencies requires an annotated corpus. Automated concept annotation methods may be confounded by textual ambiguity, and manual concept annotation may be impractical for large corpora [14]. Strengths of our study include the evaluation of a wide range of measures using multiple benchmarks and knowledge sources, and the assessment of the statistical significance of differences between measures and across knowledge sources.

On the smaller Pedersen physicians benchmark, distributional vector based measures significantly outperformed knowledge based measures. In contrast, on the larger UMN benchmark, intrinsic IC based measures significantly outperformed path finding and distributional vector based measures. These findings suggest that future evaluations of semantic similarity and relatedness measures in the biomedical domain should utilize larger benchmarks to ensure the reliability of results. We are currently evaluating semantic similarity measures on word sense disambiguation and document classification tasks.


We evaluated knowledge based semantic similarity measures using different biomedical knowledge sources, and we compared the accuracy of these measures against benchmarks of semantic similarity and relatedness. We found that intrinsic IC based measures achieved the best performance across a wide range of benchmarks and knowledge sources; that intrinsic IC based measures performed as well as or better than distributional measures; and that measures based on the UMLS achieved significantly higher accuracy than those based on smaller knowledge sources such as MeSH or SNOMED CT.

In Handbook on Ontologies. International Handbooks on Information Systems. Edited by: Staab S, Studer R. Berlin Heidelberg: Springer; — In Proceedings of the 29th European conference on IR research.

Rome, Italy: Springer; — Boulder, Colorado: Association for Computational Linguistics; — Aseervatham S, Bennani Y: Semi-structured document categorization with a semantic kernel. Pattern Recogn , — J Biomed Inform , — Special Issue of Multimedia Semantics. Sahami M, Heilman TD: A web-based kernel function for measuring the similarity of short text snippets. Edited by: Gelbukh A. Heidelberg: Springer Berlin; — In Proceedings of Human Language Technologies.

J Biomed Inform , 77— J Biomed Inform , 44 1 — LREC Computational Linguistics , 13— Morgan Kaufmann Publishers Inc; — Patwardhan S: Using WordNet-based context vectors to estimate the semantic relatedness of concepts. Proceedings of the EACL , 1—8. Lesk M: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone.

Proceedings of the 5th annual international conference on Systems documentation. Lin D: Automatic retrieval and clustering of similar words. In Proceedings of the 17th international conference on Computational linguistics - Volume 2. J Am Med Inform Assoc , ee J Am Med Inform Assoc , — J Biomed Inform , 44 2 — Wu Z, Palmer M: Verbs semantics and lexical selection. In Proceedings of the 32nd annual meeting on Association for Computational Linguistics.

NLTK Toolkit. Bioinformatics , — Apache UIMA.