Mater. Phys. Mech. (MPM)
No 3, Vol. 9, 2010, pages 246-250

DOCUMENT CLASSIFICATION USING WEIGHTED ONTOLOGY

Asta Bevainytė and Linas Būtėnas

Abstract

The main aim of this paper is to present a document comparison and classification model based on a weighted ontology. To this end, we created a classification system based on three algorithms. Tests were performed to measure several aspects of the system: (i) the quality of comparison of documents in the Lithuanian language; (ii) the optimal size of the ontology; and (iii) the parts of speech of the words used to create the ontology. The final results show 96% correct classification and suggest that terms from all the main parts of speech in the text should be used. The proposed document classification model can be used to search and classify Lithuanian-language texts more efficiently than keyword-based systems.
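
As a rough illustration of the general idea of classifying a document against a weighted ontology, the sketch below scores a text against per-category term weights using cosine similarity. It is not the method of the paper: the abstract does not specify the three algorithms, and the names `ONTOLOGY`, `tokenize`, `cosine` and `classify`, the toy English terms, and all weights are hypothetical. A system for Lithuanian would most likely also need lemmatization, because the language is highly inflected.

```python
import math
from collections import Counter

# Hypothetical weighted ontology: category -> {term: weight}.
# Terms and weights are invented for illustration only.
ONTOLOGY = {
    "sports": {"match": 1.0, "team": 0.8, "goal": 0.6},
    "finance": {"bank": 1.0, "loan": 0.8, "rate": 0.6},
}

PUNCTUATION = ".,;:!?"


def tokenize(text):
    """Lowercase the text and split it into words, stripping punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, ontology=ONTOLOGY):
    """Return the best-scoring category and the score for every category."""
    counts = Counter(tokenize(text))
    scores = {cat: cosine(counts, weights) for cat, weights in ontology.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    label, scores = classify("The team scored a goal in the match.")
    print(label, scores)
```

In this sketch every token counts equally before weighting; restricting the ontology to particular parts of speech, or changing its size, would correspond to changing which terms appear in `ONTOLOGY`.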
