Classification of voice pathologies using one dimensional feature vector and two dimensional scalogram
Abstract
Most research on voice pathology addresses only binary classification, that is, distinguishing normal from pathological voices. The current work also considers multiclass classification. The paper compares machine learning (ML) techniques based on one-dimensional (1D) feature vectors with a deep learning (DL) model based on two-dimensional (2D) scalogram images, for both binary and multiclass classification of voice pathology. The multiclass task assigns the voice signal to one of four categories: healthy, hyperkinetic dysphonia, hypokinetic dysphonia, and reflux laryngitis. The 1D feature vectors extracted from the speech signal, namely mel-frequency cepstral coefficients (MFCC) and pitch, are evaluated with several ML techniques: K-nearest neighbor (KNN), Naïve Bayes, and discriminant analysis (DA). In the second approach, time-frequency scalograms derived using three different wavelets, i.e., analytical Morlet (amor), Bump, and Morse, are used to train a pretrained GoogleNet architecture, a widely used DL model. Experimental results show that the 2D scalogram image based DL model performs better for binary (96.05%) and multiclass (89.8%) classification of voice pathology than the 1D feature vector based ML techniques.
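To make the notion of a time-frequency scalogram concrete, the following is a minimal sketch in plain Python of a continuous wavelet transform magnitude computed with a complex Morlet wavelet. It is not the authors' implementation: the function name `morlet_scalogram`, the centre-frequency parameter `w0 = 6`, the 1/scale normalisation, the truncation at four scales, and the synthetic 200 Hz test tone are all assumptions chosen for illustration, and the Bump and Morse wavelets named in the abstract are not covered.

```python
import cmath
import math


def morlet_scalogram(signal, fs, freqs, w0=6.0):
    """Return one row of |CWT| values per analysis frequency in `freqs`.

    Illustrative assumptions: complex Morlet wavelet (constant factor
    omitted), 1/scale normalisation, wavelet truncated at +/-4 scales,
    and zero padding at the signal edges.
    """
    n = len(signal)
    rows = []
    for f in freqs:
        scale = w0 / (2.0 * math.pi * f)         # scale (seconds) whose centre frequency is f
        half = int(math.ceil(4.0 * scale * fs))  # half-width of the truncated wavelet, in samples
        # Conjugated, scaled wavelet sampled at offsets k / fs.
        kernel = []
        for k in range(-half, half + 1):
            u = (k / fs) / scale
            kernel.append(cmath.exp(-1j * w0 * u) * math.exp(-0.5 * u * u) / (scale * fs))
        row = []
        for b in range(n):
            acc = 0j
            for j, k in enumerate(range(-half, half + 1)):
                idx = b + k
                if 0 <= idx < n:                 # samples outside the signal count as zero
                    acc += signal[idx] * kernel[j]
            row.append(abs(acc))
        rows.append(row)
    return rows


# Synthetic example: a 200 Hz tone sampled at 2 kHz (hypothetical test data).
fs = 2000
tone = [math.sin(2.0 * math.pi * 200.0 * i / fs) for i in range(512)]
freqs = [50.0, 100.0, 200.0, 400.0]
scalogram = morlet_scalogram(tone, fs, freqs)

# The row with the largest magnitude at the middle sample marks the dominant frequency.
mid = len(tone) // 2
peak_freq = freqs[max(range(len(freqs)), key=lambda r: scalogram[r][mid])]
print(peak_freq)
```

In a pipeline like the one described, each such magnitude array would be rendered as an image and resized to the network's input size before training. That step is omitted here.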
Keywords
Analytical Morlet (amor); Bump; Discriminant analysis; K-nearest neighbor; Morse; Naïve Bayes
DOI: http://doi.org/10.11591/ijeecs.v40.i2.pp654-666

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).