Multi-Objective Heuristic Feature Selection for Speech-Based Multilingual Emotion Recognition

Christina Brester 1 , Eugene Semenkin 1  and Maxim Sidorov 2
  • 1 Institute of Computer Science and Telecommunications, Reshetnev Siberian State Aerospace University, Krasnoyarsky rabochy Av. 31, 660037, Krasnoyarsk, Russian Federation
  • 2 Institute of Communications Engineering, Ulm University, Albert Einstein-Allee 43, 89081, Ulm, Germany


When conventional feature selection methods are not effective enough, alternative algorithmic schemes can be used. In this paper we propose an evolutionary feature selection technique based on a two-criterion optimization model. To mitigate the drawbacks of the genetic algorithms applied as optimizers, we design a parallel multicriteria heuristic procedure based on an island model. The performance of the proposed approach was investigated on the speech-based emotion recognition problem, one of the central tasks in human-machine communication. The experiments involved several multilingual corpora (German, English and Japanese). According to the results obtained, a high level of emotion recognition was achieved, with up to a 12.97% relative improvement over the best F-score value obtained on the full set of attributes.
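To make the two-criterion idea concrete, the following is a minimal Python sketch of Pareto filtering over candidate feature subsets, with a simple exchange step between islands. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not name the two criteria, so the sketch assumes both are minimised and takes them to be classification error and number of selected features. The ring-shaped migration, the function names and all numeric values are likewise hypothetical.

```python
from typing import List, Tuple

# A candidate is (feature_mask, objectives); both objectives are minimised.
# ASSUMPTION: objectives = (classification error, number of selected features).
# The abstract does not specify the criteria; these are illustrative only.
Objectives = Tuple[float, float]
Candidate = Tuple[Tuple[int, ...], Objectives]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates whose objective vectors no other candidate dominates."""
    return [
        c for c in population
        if not any(dominates(other[1], c[1]) for other in population)
    ]


def migrate(islands: List[List[Candidate]]) -> List[List[Candidate]]:
    """ASSUMED ring migration: each island receives the non-dominated
    candidates of the previous island (the first receives from the last)."""
    fronts = [non_dominated(island) for island in islands]
    return [island + fronts[i - 1] for i, island in enumerate(islands)]


# Hypothetical candidates: a 0/1 mask over four features and made-up objective values.
population: List[Candidate] = [
    ((1, 1, 1, 1), (0.30, 4)),
    ((1, 0, 1, 0), (0.25, 2)),
    ((0, 0, 1, 0), (0.40, 1)),
    ((1, 1, 0, 0), (0.35, 2)),
]

front = non_dominated(population)
merged = migrate([population[:2], population[2:]])
```

In this toy population, the full feature set and the mask (1, 1, 0, 0) are both dominated by (1, 0, 1, 0), which has lower error with no more features, so `front` holds only the two remaining trade-off candidates. A real optimizer would add selection, crossover and mutation inside each island between migration steps.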



