Mouaz Bezoui


Arabic dialects differ substantially from Modern Standard Arabic (MSA) and from each other in phonology, morphology, lexical choice, and syntax. In this paper, we describe a speech recognition system that, given a sample of a speaker's speech, automatically identifies the speaker's gender, the emphatic letter pronounced, and the diacritic of that emphatic letter. First, we examine the performance of a single classifier, the hidden Markov model (HMM), applied to the samples of our data corpus. We then evaluate the proposed KNN-DT approach, a hybridization of two classifiers: decision tree (DT) and k-nearest neighbor (KNN). Both models are first applied individually, directly on the data corpus, to recognize the emphatic letter of the sound, its diacritic, and the gender of each speaker.


Automatic Speech Recognition; Machine Learning; Natural Language Processing


DOI: http://doi.org/10.11591/ijeecs.v21.i1.pp%25p

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
