ViHateT5 with LoRA: efficient Vietnamese toxic news classification on social media

Tran Duc Duong, Hai Hoan Do

Abstract


We propose an efficient transformer-based approach for detecting toxic or misleading news on Vietnamese social media. Motivated by the societal harm of viral misinformation in Vietnam, we fine-tune a Vietnamese T5 model (ViHateT5) on a new dataset of 2,962 social-media news snippets labeled as toxic or non-toxic. We use low-rank adaptation (LoRA) to inject trainable low-rank matrices into ViHateT5, which yields high accuracy with few additional trainable parameters. Our model achieves 97.5% macro-F1 on a held-out test set, 2.7 points higher than a PhoBERT baseline. By combining Vietnamese data with a parameter-efficient method, we demonstrate a practical pipeline for low-resource fake-news detection. These results suggest that transformer pretraining on social-media text can capture the subtle cues of deceptive or defamatory news. Limitations: the model is trained on a single labeled dataset and may not generalize to other domains; future work should evaluate its fairness and biases in deployment.
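To illustrate why LoRA adds so few parameters, the following is a minimal sketch of the low-rank update in plain Python. It is not the authors' implementation: the function names, the 768 x 768 layer size, and the rank of 8 are hypothetical values chosen for illustration, since the abstract does not state the adapter configuration. The sketch assumes the standard LoRA formulation, in which a frozen weight W is augmented as W + (alpha / r) * B A, with only A and B trained.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix.

    Full fine-tuning trains every entry of the d_out x d_in matrix;
    LoRA trains only B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


# Hypothetical layer size and rank, not taken from the paper.
full, lora = lora_param_counts(768, 768, 8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
```

With B initialized to zeros, the effective weight equals the frozen W, so training starts from the behavior of the pretrained model and only the small A and B matrices are updated.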

Keywords


LoRA finetuning; Natural language processing; Social media classification; Toxic news detection; Transformer models



DOI: http://doi.org/10.11591/ijeecs.v42.i1.pp123-130



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
