Predicting likelihood of fraud among financial distressed firms in Malaysia using textual analysis

Marziana Madah Marzuki, Syerina Azlin Md Nasir, Siti Fadilah Mat Zain, Nik Siti Madihah Nik Mangsor


This paper analyzes and predicts fraud patterns among failed companies in Malaysia by applying textual analysis to the management discussion and analysis (MD&A) section of annual reports. The dataset is subjected to text clustering to group companies with similar financial characteristics. The clustering process involves data conversion, collation, and summarization into a structured format, followed by text pre-processing to cleanse the dataset. RapidMiner Studio was used to extract the data for the study. The documents are then clustered using both K-means and latent Dirichlet allocation (LDA). In a sample of 22 failed companies in 2020, financially distressed companies exhibit prominent financial negativity and use litigious financial terms in their MD&A sections. These linguistic traits are closely associated with seven distinct characteristics of fraudulent firms. These preliminary findings provide compelling evidence that financial pressure may serve as a triggering factor for fraudulent activities within companies.
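The pre-processing and K-means steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' RapidMiner workflow: the stop-word list, the four toy MD&A-style sentences, the farthest-point seeding, and every function name are assumptions made for illustration only, and the LDA step is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "for",
              "is", "was", "were", "our", "its", "by", "with", "we"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, list of L2-normalised TF-IDF vectors)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy stand-ins for MD&A excerpts (invented for this sketch, not study data).
docs = [
    "Revenue declined and the group recorded a net loss and impairment loss",
    "Net loss widened as revenue declined; impairment of assets recorded",
    "The company faces litigation and a court claim from creditors",
    "Creditors filed a court claim; litigation against the company continues",
]

vocab, vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
print(labels)
```

On these toy sentences the two loss-related excerpts share one cluster and the two litigation-related excerpts share the other, because the two groups have no vocabulary in common. Real MD&A text would need a fuller stop-word list, stemming or lemmatization, and a principled choice of the number of clusters.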


Annual report; Financial distressed firms; Fraud; Text clustering; Topic modeling


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
