A new approach to conceptual document indexing

Barresi, S, Nefti-Meziani, S and Rezgui, Y 2007, A new approach to conceptual document indexing, in: Eighth International Conference on Intelligent Text Processing and Computational Linguistics, Special Session (CICLing 2007), 18-24 February 2007, Mexico City, Mexico.

Full text not available from this repository.

Abstract

This paper presents a new conceptual indexing technique intended to overcome the major problems resulting from the use of Term Frequency (TF) based approaches. To resolve the semantic problems related to TF approaches, the proposed technique disambiguates the words contained in a document and creates a list of superordinates (hypernyms) based on an external knowledge source. To reduce the dimension of the document vector, the final set of index values is created by extracting from the list of hypernyms a set of common concepts, each shared by multiple related words. Subsequently, a weight is assigned to each concept index by considering its position in the knowledge source's hierarchical tree (i.e. its distance from the substituted words) and its number of occurrences. By applying the proposed technique, we were able to disambiguate words within different contexts, extrapolate concepts from documents, assign appropriate normalised weights, and significantly reduce the vector dimension.
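
The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy hierarchy `HYPERNYM`, the function names `hypernym_chain` and `concept_index`, the `max_depth` and `min_words` parameters, and the weighting rule (each occurrence contributes 1/distance to its nearest shared concept, followed by normalisation) are all assumptions made for illustration. The sketch also assumes the words have already been disambiguated, so the disambiguation step itself is not shown.

```python
from collections import defaultdict

# Hypothetical toy knowledge source: word or concept -> its direct hypernym.
HYPERNYM = {
    "dog": "canine", "wolf": "canine", "canine": "carnivore",
    "cat": "feline", "feline": "carnivore", "carnivore": "animal",
    "car": "vehicle", "truck": "vehicle", "vehicle": "artifact",
}


def hypernym_chain(word, max_depth=3):
    """Return [(hypernym, distance), ...] walking up the hierarchy from word."""
    chain = []
    node = word
    for distance in range(1, max_depth + 1):
        node = HYPERNYM.get(node)
        if node is None:
            break
        chain.append((node, distance))
    return chain


def concept_index(words, max_depth=3, min_words=2):
    """Map already-disambiguated words to normalised concept weights.

    Assumed rule: a concept is "common" if at least min_words distinct
    words reach it; each word occurrence is substituted by its nearest
    common concept and contributes 1/distance to that concept's score.
    """
    chains = {w: hypernym_chain(w, max_depth) for w in set(words)}

    # Which distinct words does each candidate concept cover?
    support = defaultdict(set)
    for word, chain in chains.items():
        for concept, _ in chain:
            support[concept].add(word)

    # Every occurrence counts, so repeated words raise the concept score.
    score = defaultdict(float)
    for word in words:
        for concept, distance in chains[word]:
            if len(support[concept]) >= min_words:
                score[concept] += 1.0 / distance
                break  # substitute by the nearest shared concept only

    total = sum(score.values())
    return {c: s / total for c, s in score.items()} if total else {}


weights = concept_index(["dog", "wolf", "dog", "car", "truck", "cat"])
for concept, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {weight:.3f}")
```

In this toy run, five distinct words collapse into three concept indices, and the concept that is both closest to its words and most frequently reached receives the largest weight. A real system would replace `HYPERNYM` with a lexical knowledge source and would need an explicit word sense disambiguation step before the lookup.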

Item Type: Conference or Workshop Item (Paper)
Themes: Subjects / Themes > Q Science > QA Mathematics > QA075 Electronic computers. Computer science
Subjects outside of the University Themes
Schools: Colleges and Schools > College of Science & Technology
Colleges and Schools > College of Science & Technology > School of Computing, Science and Engineering
Colleges and Schools > College of Science & Technology > School of Computing, Science and Engineering > CASE Control & Systems Engineering Research Centre
Colleges and Schools > College of Science & Technology > School of the Built Environment > Centre for Information Technology in Construction
Journal or Publication Title: Computer Science (ENC), 2009 Mexican International Conference
Publisher: IEEE Xplore
Refereed: Yes
Related URLs:
Funders: Conference Publishing Services of IEEE Computer Society
Depositing User: H Kenna
Date Deposited: 08 Jan 2009 17:58
Last Modified: 19 Jan 2014 22:45
URI: http://usir.salford.ac.uk/id/eprint/971
