Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination

Timur Düzenli; Nalan Özkurt

Year 2011, Volume: 11 Issue: 1, 1355 - 1362, 28.03.2012

Abstract

Speech/music discrimination systems have been gaining importance in intelligent audio retrieval algorithms because of the increasing size of the multimedia sources in our daily lives. This study proposes a speech/music discrimination system that utilizes the advantages of the wavelet transform. In addition, the performance of the discrete wavelet transform and the dual-tree wavelet transform is compared with that of the conventional time, frequency and cepstral domain features used in speech/music discrimination. Speech and music samples collected from common databases, CD recordings and internet radio stations were classified by artificial neural networks using different feature sets. Principal component analysis was applied to eliminate correlated features before the classification stage. Considering the number of vanishing moments and orthogonality, the best performance among the members of the Daubechies family was obtained with the Daubechies8 wavelet. According to the results, the proposed feature set outperforms the traditional ones.

Keywords: Speech/music discrimination, Discrete wavelet transform, Dual-tree wavelet transform, Daubechies mother wavelet.
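
The wavelet feature extraction stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the exact features, so mean subband energy is assumed here as one common choice, and the short 4-tap Daubechies-2 filter stands in for the Daubechies8 wavelet reported as best. The names `dwt_step` and `subband_energies` are hypothetical, and the principal component analysis and neural network stages are omitted.

```python
import math

# Daubechies-2 (4-tap) low-pass analysis filter. This is an assumed stand-in
# for the longer Daubechies8 filter named in the abstract.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# Quadrature-mirror high-pass filter: g[k] = (-1)^k * h[N-1-k].
HIGH = [((-1) ** k) * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def dwt_step(signal):
    """One DWT level with periodic extension; returns (approx, detail)."""
    n = len(signal)
    if n % 2 != 0:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(0, n, 2):
        # Filter and downsample by two in a single pass.
        approx.append(sum(LOW[k] * signal[(i + k) % n] for k in range(len(LOW))))
        detail.append(sum(HIGH[k] * signal[(i + k) % n] for k in range(len(HIGH))))
    return approx, detail


def subband_energies(signal, levels):
    """Mean energy of each detail band, then of the final approximation band.

    Hypothetical feature vector: the length must be divisible by 2**levels.
    """
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = dwt_step(current)
        features.append(sum(x * x for x in detail) / len(detail))
    features.append(sum(x * x for x in current) / len(current))
    return features
```

For a frame that alternates between +1 and -1, all of the energy falls into the first (highest-frequency) detail band, while a constant frame puts all of it into the final approximation band. Feature vectors of this kind, computed per audio frame, are what a decorrelation step and a classifier would then operate on.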

Nalan Özkurt received her B.S., M.S. and Ph.D. degrees in Electrical Engineering from Dokuz Eylul University in 1994, 1998 and 2004, respectively. She is currently an assistant professor in the Department of Electrical Engineering at Yaşar University. Her research interests are wavelets, nonlinear static and dynamical systems, and chaos. She is a member of the Association of Electrical and Electronic Engineers of Turkey.

Timur Düzenli received his B.S. in 2007 and his M.S. in 2010, both in Electronics Engineering, from Dokuz Eylul University. He is currently a Ph.D. student in the same department. His research interests are wavelets, time-frequency analysis, and digital communication systems.

Details

Primary Language	English
Section	Articles
Authors	Timur Düzenli, Nalan Özkurt
Publication Date	March 28, 2012
Published in Issue	Year 2011 Volume: 11 Issue: 1

Cite

APA	Düzenli, T., & Özkurt, N. (2012). Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering, 11(1), 1355-1362.
AMA	Düzenli T, Özkurt N. Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering. March 2012;11(1):1355-1362.
Chicago	Düzenli, Timur, and Nalan Özkurt. “Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”. IU-Journal of Electrical & Electronics Engineering 11, no. 1 (March 2012): 1355-62.
EndNote	Düzenli T, Özkurt N (01 March 2012) Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering 11 1 1355–1362.
IEEE	T. Düzenli and N. Özkurt, “Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”, IU-Journal of Electrical & Electronics Engineering, vol. 11, no. 1, pp. 1355–1362, 2012.
ISNAD	Düzenli, Timur - Özkurt, Nalan. “Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”. IU-Journal of Electrical & Electronics Engineering 11/1 (March 2012), 1355-1362.
JAMA	Düzenli T, Özkurt N. Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering. 2012;11:1355–1362.
MLA	Düzenli, Timur, and Nalan Özkurt. “Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”. IU-Journal of Electrical & Electronics Engineering, vol. 11, no. 1, 2012, pp. 1355-62.
Vancouver	Düzenli T, Özkurt N. Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering. 2012;11(1):1355-62.
