Muggleton, SH, Bryant, CH ORCID: https://orcid.org/0000-0002-9002-8343 and Srinivasan, A 2000, 'Measuring performance when positives are rare: relative advantage versus predictive accuracy - a biological case-study', in: Machine learning: ECML 2000: 11th European conference on machine learning, Barcelona, Catalonia, Spain, May 31-June 2 2000, Lecture notes in computer science (1810), Springer, Berlin / Heidelberg, Germany, pp. 300-312.
Abstract
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.
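The abstract's closing argument can be illustrated with a small numeric sketch. The confusion-matrix counts below are hypothetical and are not taken from the paper. The `enrichment` function is only an assumed stand-in for a filter-versus-random-selection comparison (hit rate among sequences the model accepts, divided by the hit rate of random selection); it is not the paper's definition of Relative Advantage, which the abstract does not give.

```python
def accuracy(tp, fp, tn, fn):
    """Predictive accuracy: fraction of all examples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def enrichment(tp, fp, tn, fn):
    """Assumed illustrative ratio (NOT the paper's RA definition):
    hit rate among accepted examples divided by the base rate of positives."""
    total = tp + fp + tn + fn
    base_rate = (tp + fn) / total   # chance of a positive under random selection
    precision = tp / (tp + fp)      # chance of a positive among accepted examples
    return precision / base_rate


# Hypothetical data: 100,000 sequences, of which only 50 are positives.
# Both models reject the same number of negatives but differ in positives covered.
models = {
    "weak":   dict(tp=5,  fp=100, tn=99_850, fn=45),
    "strong": dict(tp=40, fp=100, tn=99_850, fn=10),
}

for name, counts in models.items():
    print(f"{name}: accuracy={accuracy(**counts):.5f}, "
          f"enrichment={enrichment(**counts):.1f}")
```

Under these made-up counts the two models differ by less than 0.001 in accuracy, because the abundant negatives dominate the score, while the ratio-style measure separates them by a wide margin.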
Item Type: | Book Section |
---|---|
Editors: | de Mántaras, RL and Plaza, E |
Additional Information: | Paper originally presented at the 11th European Conference on Machine Learning, Barcelona, Catalonia, Spain, May 31 – June 2, 2000, and published in its proceedings. |
Themes: | Subjects / Themes > Q Science > QA Mathematics > QA075 Electronic computers. Computer science; Subjects / Themes > Q Science > QH Natural history > QH301 Biology; Subjects outside of the University Themes |
Schools: | Schools > School of Computing, Science and Engineering; Schools > School of Computing, Science and Engineering > Salford Innovation Research Centre |
Publisher: | Springer |
Refereed: | Yes |
Series Name: | Lecture notes in computer science |
ISBN: | 9783540676027 |
Depositing User: | Dr Chris H. Bryant |
Date Deposited: | 17 Feb 2009 12:16 |
Last Modified: | 16 Feb 2022 08:16 |
URI: | https://usir.salford.ac.uk/id/eprint/1764 |