Redho Aidil, Iqrom and Tri Basuki, Kurniawan (2023) Multi-Label Text Classification for Indonesian Language IT Journal with K-Nearest Neighbors (KNN). Journal of Data Science, 2023 (05). pp. 1-9. ISSN 2805-5160
Text: jods2023_05.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
Classification is the process of finding a model or function that describes and distinguishes data classes, with the aim of predicting the category of an object whose label is unknown. Text document classification is one of its many forms. Classifying documents by label category is an essential component of a retrieval system, since it helps the system return better and more accurate information. Existing research mostly performs single-label classification of text documents; multi-label classification of IT journals is rare, especially for the Indonesian language. This research therefore addresses multi-label text classification using the K-Nearest Neighbors (KNN) method with a One-vs-Rest classifier approach: each document is classified according to its k nearest (most similar) documents, and the multiple labels are predicted with the One-vs-Rest classifier. Training and testing use a dataset of 500 Indonesian IT journals. The tests give good results, with an accuracy of 84% and a Hamming loss of 0.076.
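The approach described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy documents, the label names, the bag-of-words cosine similarity, the majority-vote threshold, and the function names `vectorize`, `cosine_similarity`, `predict_labels`, `hamming_loss`, and `subset_accuracy` are all assumptions made for illustration. The abstract also does not say how accuracy was computed; exact-match (subset) accuracy is assumed here.

```python
import math
from collections import Counter

# Toy training set, made up for illustration (not the paper's 500-document dataset).
TRAIN_DOCS = [
    "jaringan komputer keamanan firewall",
    "keamanan data enkripsi kriptografi",
    "jaringan komputer router protokol",
    "pembelajaran mesin klasifikasi data",
    "klasifikasi teks pembelajaran mesin",
]
# Hypothetical label sets; a document may carry more than one label.
TRAIN_LABELS = [
    {"network", "security"},
    {"security"},
    {"network"},
    {"ai"},
    {"ai"},
]
ALL_LABELS = ["ai", "network", "security"]


def vectorize(text):
    """Bag-of-words term counts (an assumed, deliberately simple representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_labels(query, train_docs, train_labels, all_labels, k=3):
    """One-vs-rest KNN: each label gets its own yes/no vote among the k nearest documents."""
    q = vectorize(query)
    vectors = [vectorize(doc) for doc in train_docs]
    ranked = sorted(
        range(len(train_docs)),
        key=lambda i: cosine_similarity(q, vectors[i]),
        reverse=True,
    )
    neighbours = ranked[:k]
    predicted = set()
    for label in all_labels:
        votes = sum(1 for i in neighbours if label in train_labels[i])
        if votes > k / 2:  # strict majority of neighbours carry this label
            predicted.add(label)
    return predicted


def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (document, label) pairs that are predicted incorrectly."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(all_labels))


def subset_accuracy(true_sets, pred_sets):
    """Fraction of documents whose predicted label set matches exactly."""
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact / len(true_sets)
```

With this toy data, `predict_labels("keamanan jaringan komputer", TRAIN_DOCS, TRAIN_LABELS, ALL_LABELS, k=3)` returns `{"network", "security"}`, because two of the three nearest documents carry each of those labels. Hamming loss counts individual label mistakes, so a prediction that misses one of several labels is penalised less than it is under exact-match accuracy.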
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Classification, k-nearest neighbours, one vs. rest classifier, single label, multi-label |
Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Depositing User: | Unnamed user with email masilah.mansor@newinti.edu.my |
Date Deposited: | 18 Aug 2023 08:55 |
Last Modified: | 18 Aug 2023 08:55 |
URI: | http://eprints.intimal.edu.my/id/eprint/1779 |