An empirical evaluation of phrase-based statistical machine translation for Indonesia slang-word translator

Kyrie Cettyara Eleison, Sari Uli Inggrid Hutahaean, Sarah Christine Tampubolon, Teamsar Muliadi Panggabean, Ike Fitriyaningsih

Abstract


The use of slang (non-standard language) is increasing, especially on social media. Because not everyone understands slang, it reduces the level of understanding in communication. The purpose of this work is to develop a slang-word translator. A second objective is to find the minimum number of sentences and the BiLingual Evaluation Understudy (BLEU) score that can serve as benchmarks for judging a translation understandable. The approach is phrase-based statistical machine translation (PBSMT), which is suitable for low-resource languages, with a dataset of 100,000 sentences taken from the comment sections of several online political news portals. The comments were manually translated to produce a parallel corpus of non-standard and standard language. Sample sentences from the dataset were then distributed in questionnaires to measure how well humans understand the translation results. The implementation achieves a BLEU score of 64, and the minimum number of sentences needed for an understandable machine translation is 500. The questionnaire responses indicate that humans can understand the sentences produced by the translation machine.
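
As a rough illustration of the metric named above, the following is a minimal sketch of sentence-level BLEU on a 0-100 scale. It assumes a single reference per sentence, whitespace tokenization, uniform weights over 1- to 4-grams, and no smoothing; the function names `ngrams` and `bleu` are hypothetical, and this is not necessarily how the authors computed their reported score, which would normally be a corpus-level figure produced by a standard toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU (0-100), one reference, uniform weights, no smoothing.

    Illustrative assumption: tokens are split on whitespace.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

Under these assumptions, a candidate identical to its reference scores 100, and a candidate sharing no 4-gram with the reference scores 0, which is why smoothed or corpus-level variants are usually preferred for short sentences.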

Keywords


BLEU score; Language; Machine translation; Phrase-based statistical machine translation; Questionnaire

DOI: http://doi.org/10.11591/ijeecs.v25.i3.pp1803-1813

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
