The effect of word sense disambiguation accuracy on literature based discovery

Preiss, J ORCID: https://orcid.org/0000-0002-2158-5832 and Stevenson, M 2016, 'The effect of word sense disambiguation accuracy on literature based discovery', BMC Medical Informatics and Decision Making, 16 (Sup. 1), p. 57.

PDF - Published Version
Available under License Creative Commons Attribution 4.0.


Abstract

Background: The volume of research published in the biomedical domain has increasingly led to researchers focussing on specific areas of interest and to connections between findings being missed. Literature based discovery (LBD) attempts to address this problem by searching for previously unnoticed connections between published information (also known as “hidden knowledge”). A common approach is to identify hidden knowledge via shared linking terms. However, biomedical documents are highly ambiguous, which can lead LBD systems to overgenerate hidden knowledge by hypothesising connections through different meanings of linking terms. Word Sense Disambiguation (WSD) aims to resolve ambiguities in text by identifying the meaning of ambiguous terms. This study explores the effect of WSD accuracy on LBD performance.

Methods: An existing LBD system is employed and four approaches to WSD of biomedical documents are integrated with it. The accuracy of each WSD approach is determined by comparing its output against a standard benchmark. Evaluation of the LBD output is carried out using a timeslicing approach, where hidden knowledge is generated from articles published prior to a certain cutoff date and a gold standard is extracted from publications after the cutoff date.

Results: WSD accuracy varies depending on the approach used. The connection between the performance of the LBD and WSD systems is analysed to reveal a correlation between WSD accuracy and LBD performance.

Conclusion: This study reveals that LBD performance is sensitive to WSD accuracy. It is therefore concluded that WSD has the potential to improve the output of LBD systems by reducing the amount of spurious hidden knowledge that is generated. It is also suggested that further improvements in WSD accuracy have the potential to improve LBD accuracy.
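The linking-term approach and the timeslicing evaluation described in the abstract can be illustrated with a minimal sketch. This is not the LBD system used in the paper: it assumes a simple document-level co-occurrence model, and every function name, term, and year below is hypothetical. It shows how an ambiguous linking term ("cold") can produce a spurious connection, and how replacing it with sense-specific terms removes that connection.

```python
from itertools import combinations


def cooccurrence_pairs(documents):
    """Return the set of sorted term pairs that co-occur in at least one document."""
    pairs = set()
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            pairs.add((a, b))
    return pairs


def hidden_knowledge(documents):
    """Propose A-C pairs that share a linking term B but never co-occur directly."""
    known = cooccurrence_pairs(documents)
    neighbours = {}
    for a, b in known:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    candidates = set()
    # Each term acts as a potential linking term for every pair of its neighbours.
    for linked in neighbours.values():
        for a, c in combinations(sorted(linked), 2):
            if (a, c) not in known:
                candidates.add((a, c))
    return candidates


def timeslice_evaluate(dated_documents, cutoff_year):
    """Generate hidden knowledge before the cutoff; score it against later co-occurrences."""
    before = [terms for year, terms in dated_documents if year < cutoff_year]
    after = [terms for year, terms in dated_documents if year >= cutoff_year]
    # Gold standard: pairs that first co-occur after the cutoff.
    gold = cooccurrence_pairs(after) - cooccurrence_pairs(before)
    proposed = hidden_knowledge(before)
    hits = proposed & gold
    precision = len(hits) / len(proposed) if proposed else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return precision, recall


# Toy corpus (invented): "cold" is ambiguous between low temperature and the common cold.
ambiguous_corpus = [
    (2000, ["fish_oil", "blood_viscosity"]),
    (2001, ["blood_viscosity", "raynaud"]),
    (2001, ["raynaud", "cold"]),   # cold = low temperature
    (2002, ["cold", "zinc"]),      # cold = common cold
    (2005, ["fish_oil", "raynaud"]),
]

# The same corpus after an idealised (perfectly accurate) WSD step.
disambiguated_corpus = [
    (2000, ["fish_oil", "blood_viscosity"]),
    (2001, ["blood_viscosity", "raynaud"]),
    (2001, ["raynaud", "cold_temperature"]),
    (2002, ["common_cold", "zinc"]),
    (2005, ["fish_oil", "raynaud"]),
]

precision_ambiguous, recall_ambiguous = timeslice_evaluate(ambiguous_corpus, 2004)
precision_wsd, recall_wsd = timeslice_evaluate(disambiguated_corpus, 2004)
```

In this toy setting the ambiguous corpus proposes a raynaud-zinc connection through the two senses of "cold"; after disambiguation that candidate disappears, so precision rises while recall is unchanged. A less accurate WSD step would remove only some of these spurious links, which is one way to picture the sensitivity the abstract reports.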

Item Type: Article
Schools: Schools > School of Computing, Science and Engineering
Journal or Publication Title: BMC Medical Informatics and Decision Making
Publisher: BioMed Central
ISSN: 1472-6947
Funders: Engineering and Physical Sciences Research Council (EPSRC)
Depositing User: J Preiss
Date Deposited: 11 Nov 2020 08:29
Last Modified: 11 Nov 2020 08:45
URI: http://usir.salford.ac.uk/id/eprint/58771
