A Kind of Visual Speech Feature with the Geometric and Local Inner Texture Description

Xibin Jia, Yanfeng Sun


In this paper, we propose a joint feature that combines geometric parameters and color moments to represent speaking-mouth frames for image-based visual speech synthesis systems. Based on the FDP points around the mouth area, the geometric feature is obtained by computing Euclidean distances that describe the width of the speaking mouth, the heights of the outer and inner lips, and the distances between them. The color moment component of the joint feature is computed from the texture of the region between the upper and lower inner lips and describes the visibility state of the teeth. By analyzing, on the samples, how well each component of the RGB and HSV color spaces agrees with teeth visibility, we found that the green and blue components describe changes in teeth visibility well. The experiments show that the proposed joint feature provides an effective basis for categorizing different speaking states, especially in terms of lip shape and teeth visibility. The clustering results are evaluated by analyzing parameters derived from the silhouette function. The analysis shows that, compared with the geometric-only feature and PCA, the proposed feature, which combines shape and local inner-lip texture cues, performs better at improving the similarity between samples within clusters. In future work, more expressive features based on shape and local texture information should be explored to increase the proportion of similar samples within clusters and thereby improve the descriptive ability for speaking mouths.
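As a rough illustration of how such a joint feature could be assembled, the following Python sketch concatenates Euclidean distances between mouth landmarks with color moments of the green and blue channels of the inner-lip region. It is only a sketch under stated assumptions: the landmark names (`left_corner`, `inner_top`, and so on), the particular set of five distances, and the use of three moments per channel (mean, standard deviation, and the cube root of the third central moment) are hypothetical choices made for this example and are not specified in the abstract.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two 2-D points given as (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def color_moments(values):
    """First three color moments of one channel (assumed definition):
    mean, standard deviation, and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, std, skew]


def joint_feature(points, inner_pixels):
    """Build a joint feature vector.

    points: dict of hypothetical landmark names -> (x, y) coordinates.
    inner_pixels: list of (r, g, b) tuples sampled between the inner lips.
    """
    geometric = [
        euclidean(points["left_corner"], points["right_corner"]),   # mouth width
        euclidean(points["outer_top"], points["outer_bottom"]),     # outer lip height
        euclidean(points["inner_top"], points["inner_bottom"]),     # inner lip height
        euclidean(points["outer_top"], points["inner_top"]),        # upper outer-to-inner distance
        euclidean(points["inner_bottom"], points["outer_bottom"]),  # lower inner-to-outer distance
    ]
    # Only green and blue are used, following the abstract's observation
    # that these components track teeth visibility well.
    green = [px[1] for px in inner_pixels]
    blue = [px[2] for px in inner_pixels]
    return geometric + color_moments(green) + color_moments(blue)
```

Calling `joint_feature(points, inner_pixels)` with six landmarks and a list of RGB samples returns an 11-element vector (five distances followed by three moments each for green and blue). In practice the geometric and color parts would likely need to be normalized to comparable scales before clustering; that step is omitted here.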


DOI: http://dx.doi.org/10.11591/telkomnika.v11i2.2047

