Speech Emotion Recognition Using Unsupervised Feature Selection Algorithms
Abstract
Combining different speech features is a common practice to improve the accuracy of Speech Emotion Recognition (SER). However, this can sharply increase processing time, and features that contribute little to emotion recognition often lead to incorrect predictions, substantially reducing the accuracy of the SER system. Hence, there is a need to select a feature set that contributes significantly to emotion recognition. This paper presents the use of the Feature Selection with Adaptive Structure Learning (FSASL) and Unsupervised Feature Selection with Ordinal Locality (UFSOL) algorithms for feature dimension reduction to improve SER performance with a reduced feature dimension. A novel Subset Feature Selection (SuFS) algorithm is proposed to further reduce the feature dimension and achieve comparable or better accuracy when used together with the FSASL and UFSOL algorithms. The 1582-dimensional INTERSPEECH 2010 Paralinguistic Challenge feature set, 20 Gammatone Cepstral Coefficients, and a Support Vector Machine classifier with 10-Fold Cross-Validation and Hold-Out Validation are considered in this work. The EMO-DB and IEMOCAP databases are used to evaluate the performance of the proposed SER system in terms of classification accuracy and computational time. The result analysis shows that the proposed SER system outperforms the existing ones.
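The following is a minimal sketch of the evaluation protocol outlined above: unsupervised feature selection on the acoustic features, followed by an SVM classifier assessed with both 10-fold cross-validation and a hold-out split. It assumes scikit-learn; since FSASL, UFSOL, and SuFS are not available there, a simple variance-based unsupervised selector stands in for them, and `load_features()` is a hypothetical loader for the extracted IS10 + GTCC features and emotion labels.

```python
# Sketch only: VarianceThreshold is a stand-in for FSASL/UFSOL/SuFS,
# and load_features() is a hypothetical placeholder for real feature extraction.
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def load_features():
    # Placeholder: substitute the per-utterance feature matrix
    # (1582 IS10 Paralinguistic features + 20 GTCCs) and emotion labels
    # extracted from EMO-DB or IEMOCAP.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 1602))
    y = rng.integers(0, 4, size=200)
    return X, y


X, y = load_features()

pipeline = make_pipeline(
    VarianceThreshold(threshold=0.9),  # unsupervised selection step (stand-in)
    StandardScaler(),                  # scale retained features for the SVM
    SVC(kernel="rbf", C=1.0),
)

# 10-fold cross-validation accuracy.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_acc = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
print(f"10-fold CV accuracy: {cv_acc.mean():.3f} +/- {cv_acc.std():.3f}")

# Hold-out validation (e.g., an 80/20 stratified split).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
holdout_acc = pipeline.fit(X_tr, y_tr).score(X_te, y_te)
print(f"Hold-out accuracy: {holdout_acc:.3f}")
```

Swapping the stand-in selector for the paper's FSASL, UFSOL, or SuFS implementation only changes the first pipeline step; the cross-validation and hold-out evaluation remain the same.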