Detailed Analysis of Extrinsic Plagiarism Detection System Using Machine Learning Approach (Naive Bayes and SVM)
Abstract
This report proposes a detailed-analysis method for a plagiarism detection system based on a machine learning approach. Naive Bayes and Support Vector Machine (SVM) are used as the learning algorithms. The learning features are word similarity, fingerprint similarity, latent semantic analysis (LSA) similarity, and word pairs. These features were selected to draw on the state-of-the-art detailed-analysis methods (word similarity, fingerprinting, and LSA) and so combine the strengths of each method in detecting plagiarism. Several experiments were conducted to test how well the proposed method detects different kinds of plagiarism. The test data contain cases of literal plagiarism, partial literal plagiarism, paraphrased plagiarism, plagiarism with changed sentence structure, and translated plagiarism. The test data also contain non-plagiarism cases, both on different topics and on the same topic. In the experiments, SVM achieved an average accuracy of 92.86% (reaching 95.71% without the word similarity feature), while Naive Bayes achieved an average accuracy of 54.29% (reaching 84.29% without the word pair features).
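As an illustration of how two of the features named above might be computed for a suspicious/source passage pair, here is a minimal sketch. It is not the paper's implementation: the abstract does not define the features, so Jaccard overlap for word similarity, CRC32-hashed word trigrams for fingerprints, and all function names are assumptions. The LSA similarity and word pair features are omitted. Feature vectors of this kind would then be passed to a classifier such as Naive Bayes or SVM.

```python
import re
import zlib


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def word_similarity(suspicious, source):
    """Assumed definition: Jaccard overlap of the two word sets."""
    return jaccard(tokenize(suspicious), tokenize(source))


def fingerprints(text, k=3):
    """Assumed definition: set of CRC32 hashes of all word k-grams."""
    tokens = tokenize(text)
    grams = (" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1))
    return {zlib.crc32(g.encode("utf-8")) for g in grams}


def fingerprint_similarity(suspicious, source, k=3):
    """Jaccard overlap of the two fingerprint sets."""
    return jaccard(fingerprints(suspicious, k), fingerprints(source, k))


def feature_vector(suspicious, source):
    """Two-element feature vector for one passage pair (hypothetical layout)."""
    return [
        word_similarity(suspicious, source),
        fingerprint_similarity(suspicious, source),
    ]
```

In this sketch the two features respond differently to reordering: a passage whose words are kept but rearranged still scores high on word similarity while its fingerprint similarity drops, which is one reason combining several features could help a classifier separate the plagiarism types listed in the abstract.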
DOI: http://doi.org/10.11591/ijeecs.v12.i11.pp7884-7894
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).