Learning Chomsky-like grammars for biological sequence families
Muggleton, SH, Bryant, CH and Srinivasan, A 2000, 'Learning Chomsky-like grammars for biological sequence families', in: Proceedings of the 17th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, CA, pp. 631-638.
Abstract
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). As far as these authors are aware, this is both the first biological grammar learnt using ILP and the first real-world scientific application of the positive-only learning framework of CProgol. Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. The highest RA was achieved by a model which includes grammar-derived features. This RA is significantly higher than the best RA achieved without the use of the grammar-derived features.
| Item Type: | Book Section |
|---|---|
| Editors: | Langley, P |
| Themes: | Subjects / Themes > Q Science > QA Mathematics > QA075 Electronic computers. Computer science; Subjects / Themes > Q Science > QH Natural history > QH301 Biology; Subjects outside of the University Themes |
| Schools: | Colleges and Schools > College of Science & Technology; Colleges and Schools > College of Science & Technology > School of Computing, Science and Engineering; Colleges and Schools > College of Science & Technology > School of Computing, Science and Engineering > Data Mining and Pattern Recognition Research Centre |
| Publisher: | Morgan Kaufmann |
| Refereed: | Yes |
| ISBN: | 1-55860-707-2 |
| Depositing User: | Dr Chris H. Bryant |
| Date Deposited: | 16 Feb 2009 16:05 |
| Last Modified: | 27 Sep 2011 12:32 |
| URI: | http://usir.salford.ac.uk/id/eprint/1763 |