Emotion recognition from Burmese speech based on fused features and deep learning method
Abstract
Speech emotion classification is challenging for the Burmese language, which is low-resource and has been the subject of little prior research on this topic. To address these problems, a novel feature extraction approach for Burmese is proposed, and a Burmese speech emotion corpus called BMISEC is built to compensate for the lack of resources. The strengths of four feature extraction methods are fused into a single robust feature: a novel text-tone feature, local binary pattern, mel-frequency cepstral coefficients, and discrete wavelet transform. To improve performance, a deep learning method called DenseNet-Emotion is used for classification, with a support vector machine in DenseNet’s classifier layer. To demonstrate the robustness of the proposed system, three types of experiments are carried out on the TensorFlow framework: an ablation study, experiments with three publicly available datasets, and comparisons of the proposed method with previous research methods. The results show that feature fusion outperforms any single feature in emotion recognition, that performance on BMISEC is better than on the other datasets, and that the proposed method outperforms the previous research methods. The proposed method achieves an accuracy of 88.388% after 50 epochs.
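The abstract does not state how the four features are combined. As a rough illustration only, the sketch below assumes the simplest form of feature-level fusion: each feature vector is standardized on its own and the results are concatenated. The function names (zscore, fuse_features) and all numeric values are hypothetical placeholders, not the authors' implementation or data.

```python
from math import sqrt


def zscore(vec):
    """Scale one feature vector to zero mean and unit variance."""
    n = len(vec)
    mean = sum(vec) / n
    std = sqrt(sum((x - mean) ** 2 for x in vec) / n)
    if std == 0.0:
        # A constant vector carries no variation; map it to zeros.
        return [0.0 for _ in vec]
    return [(x - mean) / std for x in vec]


def fuse_features(*feature_vectors):
    """Assumed fusion rule: standardize each vector, then concatenate."""
    fused = []
    for vec in feature_vectors:
        fused.extend(zscore(vec))
    return fused


# Hypothetical placeholder values standing in for the four feature types.
text_tone = [1.0, 0.0, 2.0]
lbp = [12.0, 7.0, 3.0, 10.0]
mfcc = [-3.1, 0.4, 2.2, 1.5, -0.9]
dwt = [0.02, 0.15]

fused = fuse_features(text_tone, lbp, mfcc, dwt)
print(len(fused))  # 14 = 3 + 4 + 5 + 2
```

Standardizing each vector before concatenation keeps a feature with a large numeric range from dominating the fused vector; whether the paper applies any such scaling is not stated in the abstract.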
Keywords
DenseNet-emotion; Discrete wavelet transform; Local binary pattern; Mel-frequency cepstral coefficient; Text-tone feature
DOI: http://doi.org/10.11591/ijeecs.v35.i2.pp888-897
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).