Research Article

Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features

Volume: 9 Number: 1 March 15, 2026

Abstract

Speech emotion recognition (SER) is a key challenge in affective computing, where subtle emotional cues are often embedded not in the linguistic content of speech but in voice-related acoustic features. This study proposes a machine learning approach that leverages statistical descriptors of Mel-Frequency Cepstral Coefficients (MFCCs) to capture the central tendencies of voice signals for multiclass emotion classification. Raw voice recordings from the Toronto Emotional Speech Set (TESS) were processed into nine statistical features, of which six were retained after correlation-based filtering to reduce redundancy and improve generalization. Several classifiers were evaluated, with the Support Vector Machine (SVM) achieving the best performance: 84% accuracy, 83% macro-recall, and 83% macro-F1. The improvements after hyperparameter tuning were statistically significant (McNemar’s test, p = 1.606 × 10⁻²⁰), underscoring the importance of systematic optimization. A comparative analysis revealed that correlation-based feature selection outperformed PCA and LDA in preserving the discriminative power of the SVM. Compared with related works that employ deep learning or multi-dataset setups, the proposed framework offers competitive performance while maintaining greater interpretability and computational efficiency. These findings support the hypothesis that compact, voice-centered statistical features, when optimized, form a reliable basis for robust and efficient emotion recognition systems.

Keywords

Supporting Institution

Center of Excellence Human Centric Engineering (HUMIC), Telkom University, and the National Research and Innovation Agency (BRIN)

Project Number

Decree Number 61/II.7/HK/2024 dated 24 December 2024 and Agreement/Contract Numbers 47/IV/KS/02/2025 and 052/SAM4/PPM/2025 with Telkom University dated 21 February 2025

Ethical Statement

This study was conducted in full compliance with established scientific and ethical standards. All referenced materials have been properly acknowledged and cited in the bibliography.

Thanks

This research was supported by the RIIM LPDP Grant and BRIN under Grant Number 61/II.7/HK/2024 dated 24 December 2024 and Agreement/Contract Numbers 47/IV/KS/02/2025 and 052/SAM4/PPM/2025 with Telkom University dated 21 February 2025. The authors would also like to express their sincere gratitude to Telkom University for its institutional support, as well as to Madrasah Aliyah Swasta Teknologi Informasi Berlian and other partners who prefer to remain anonymous for their assistance in providing the research site and respondents.

References

  1. A. Koduru, H. Valiveti, and A. Budati, “Feature extraction algorithms to improve the speech emotion recognition rate,” International Journal of Speech Technology, vol. 23, pp. 45–55, 2020. doi: 10.1007/s10772-020-09672-4
  2. P. Foggia, A. Greco, A. Roberto, A. Saggese, and M. Vento, “Identity, gender, age, and emotion recognition from speaker voice with multi-task deep networks for cognitive robotics,” Cognitive Computation, vol. 16, no. 5, pp. 2713–2723, 2024. doi: 10.1007/s12559-023-10241-5
  3. D. Keltner, D. Sauter, J. Tracy, and A. Cowen, “Emotional expression: Advances in basic emotion theory,” Journal of Nonverbal Behavior, vol. 43, pp. 133–160, 2019. doi: 10.1007/s10919-019-00293-3
  4. L. F. Weyher, “Re-reading sociology via the emotions: Karl Marx's theory of human nature and estrangement,” Sociological Perspectives, vol. 55, no. 2, pp. 341–363, 2012. doi: 10.1525/sop.2012.55.2.341
  5. X. Zhu, C. Guo, H. Feng, Y. Huang, Y. Feng, X. Wang, and R. Wang, “A review of key technologies for emotion analysis using multimodal information,” Cognitive Computation, vol. 16, no. 4, pp. 1504–1530, 2024. doi: 10.1007/s12559-024-10287-z
  6. H. Aouani and Y. B. Ayed, “Speech emotion recognition with deep learning,” in Proc. 24th Int. Conf. Knowledge-Based and Intelligent Information & Engineering Systems (KES 2020), Procedia Computer Science, vol. 176, pp. 251–260, 2020. doi: 10.1016/j.procs.2020.08.027
  7. A. K. Pagidirayi and A. Bhuma, “Speech emotion recognition using machine learning techniques,” Revue d'Intelligence Artificielle, vol. 36, no. 2, pp. 271–278, 2022. doi: 10.18280/ria.360211
  8. G. Ajay, M. Siddhesh, S. Mukul, and C. Supriya, “Speech based emotion recognition using machine learning,” International Research Journal of Engineering and Technology, vol. 8, no. 4, pp. 3289–3295, 2021.

Details

Primary Language

English

Subjects

Software Engineering (Other)

Journal Section

Research Article

Early Pub Date

March 15, 2026

Publication Date

March 15, 2026

Submission Date

July 4, 2025

Acceptance Date

October 6, 2025

Published in Issue

Year 2026 Volume: 9 Number: 1

APA
Gunawan, P. H., Rosita, Y. D., Satwika, Y. W., Wijaya, R., Wirayuda, T. A. B., Halida, A. N., Jarin, A., Ramadhan, I., Maulana, I. A., & Kurniawan, W. Y. (2026). Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features. Sakarya University Journal of Computer and Information Sciences, 9(1), 21-33. https://doi.org/10.35377/saucis.1728490
AMA
1.Gunawan PH, Rosita YD, Satwika YW, et al. Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features. SAUCIS. 2026;9(1):21-33. doi:10.35377/saucis.1728490
Chicago
Gunawan, Putu Harry, Yesy Diah Rosita, Yohana Wuri Satwika, et al. 2026. “Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features”. Sakarya University Journal of Computer and Information Sciences 9 (1): 21-33. https://doi.org/10.35377/saucis.1728490.
EndNote
Gunawan PH, Rosita YD, Satwika YW, Wijaya R, Wirayuda TAB, Halida AN, Jarin A, Ramadhan I, Maulana IA, Kurniawan WY (March 1, 2026) Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features. Sakarya University Journal of Computer and Information Sciences 9 1 21–33.
IEEE
[1]P. H. Gunawan et al., “Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features”, SAUCIS, vol. 9, no. 1, pp. 21–33, Mar. 2026, doi: 10.35377/saucis.1728490.
ISNAD
Gunawan, Putu Harry - Rosita, Yesy Diah - Satwika, Yohana Wuri - Wijaya, Rifki - Wirayuda, Tjokorda Agung B. - Halida, Arfin Nurma - Jarin, Asril - Ramadhan, Insan - Maulana, Irgi Ahmad - Kurniawan, Wandi Yusuf. “Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features”. Sakarya University Journal of Computer and Information Sciences 9/1 (March 1, 2026): 21-33. https://doi.org/10.35377/saucis.1728490.
JAMA
1.Gunawan PH, Rosita YD, Satwika YW, Wijaya R, Wirayuda TAB, Halida AN, Jarin A, Ramadhan I, Maulana IA, Kurniawan WY. Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features. SAUCIS. 2026;9:21–33.
MLA
Gunawan, Putu Harry, et al. “Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features”. Sakarya University Journal of Computer and Information Sciences, vol. 9, no. 1, Mar. 2026, pp. 21-33, doi:10.35377/saucis.1728490.
Vancouver
1.Putu Harry Gunawan, Yesy Diah Rosita, Yohana Wuri Satwika, Rifki Wijaya, Tjokorda Agung B. Wirayuda, Arfin Nurma Halida, Asril Jarin, Insan Ramadhan, Irgi Ahmad Maulana, Wandi Yusuf Kurniawan. Machine Learning-Based Emotion Classification from Voice Signals Using MFCC Central Tendency Features. SAUCIS. 2026 Mar. 1;9(1):21-33. doi:10.35377/saucis.1728490

 


The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License