A relational background knowledge boosting based topic model for Chinese poems
Abstract
Classical Chinese poetry has become increasingly popular in recent years, and modeling its topics is a promising area of research. Chinese poems are characteristically short, and traditional topic models perform poorly on short texts because of text sparsity. Topic models therefore need to be adapted to the scenario of classical Chinese poetry. In this paper, we propose a relational background knowledge boosting based topic model (RBKBTM) to overcome the text sparsity of Chinese poems. We incorporate background information into the model, which expands the text content from a semantic perspective. The background knowledge is assembled using word embeddings and TextRank and is then fed into the core computing process, after which a new sampling formula is derived. The proposed model was evaluated on three different tasks using three different datasets. The results demonstrate that the incorporated background knowledge effectively overcomes text sparsity, improving the performance and effectiveness of the topic model.
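To make the expansion step concrete, the sketch below illustrates one plausible way such background knowledge could be built for a short poem: candidate words are drawn from a word-embedding vocabulary by cosine similarity to the poem's words, ranked with a TextRank-style power iteration over an embedding-similarity graph, and the top-ranked words are appended as pseudo-content before topic sampling. The toy embeddings, the expand_with_background helper, and all parameter choices are hypothetical illustrations under stated assumptions, not the paper's exact formulation, and the derived sampling formula itself is not reproduced here.

```python
import numpy as np

# Toy embeddings standing in for vectors trained on a large classical-Chinese corpus
# (hypothetical values; in practice they would come from word2vec or a similar model).
embeddings = {
    "moon":     np.array([0.9, 0.1, 0.0]),
    "night":    np.array([0.8, 0.2, 0.1]),
    "frost":    np.array([0.7, 0.1, 0.3]),
    "homesick": np.array([0.1, 0.9, 0.2]),
    "autumn":   np.array([0.6, 0.3, 0.4]),
    "wine":     np.array([0.2, 0.7, 0.5]),
}

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def expand_with_background(poem_words, vocab, top_k=3, damping=0.85, iters=50):
    """Collect candidate background words similar to the poem, rank them with a
    TextRank-style power iteration over an embedding-similarity graph, and return
    the top-k words to append to the short document before topic sampling."""
    candidates = [w for w in vocab if w not in poem_words]
    if not candidates:
        return []
    # Relevance of each candidate: best cosine similarity to any poem word.
    relevance = np.array([
        max(cosine(vocab[c], vocab[p]) for p in poem_words if p in vocab)
        for c in candidates
    ])
    # Weighted similarity graph among the candidates themselves.
    n = len(candidates)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i, j] = max(cosine(vocab[candidates[i]], vocab[candidates[j]]), 0.0)
    # TextRank-style iteration, seeded by relevance to the poem.
    out_sum = W.sum(axis=1) + 1e-12
    scores = relevance.copy()
    for _ in range(iters):
        scores = (1 - damping) * relevance + damping * (W.T @ (scores / out_sum))
    order = np.argsort(-scores)[:top_k]
    return [candidates[i] for i in order]

poem = ["moon", "frost"]                      # a short poem as a bag of (translated) words
background = expand_with_background(poem, embeddings)
print("expanded document:", poem + background)  # expanded text fed to the topic model
```

Under these assumptions, the expanded word list plays the role of the semantically enriched document that the Gibbs sampler would consume; the actual weighting of background words inside the sampling formula follows the paper's derivation rather than this sketch.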
Keywords
Gibbs sampling; Latent Dirichlet allocation; Short text; TextRank; Topic model
DOI: http://doi.org/10.11591/ijeecs.v35.i2.pp1227-1243