"Ensemble of convolutional neural networks to improve animal audio clas" by Loris Nanni, Yandre M.G. Costa et al.

College of Business

Title

Ensemble of convolutional neural networks to improve animal audio classification

Authors

Loris Nanni
Yandre M.G. Costa
Rafael L. Aguiar
Rafael B. Mangolin
Sheryl Brahnam, Missouri State University
Carlos N. Silla

Abstract

In this work, we present an ensemble for automated audio classification that fuses different types of features extracted from audio files. These features are evaluated, compared, and fused with the goal of producing better classification accuracy than other state-of-the-art approaches without ad hoc parameter optimization. We present an ensemble of classifiers that performs competitively on different types of animal audio datasets using the same set of classifiers and parameter settings. To produce this general-purpose ensemble, we ran a large number of experiments that fine-tuned pretrained convolutional neural networks (CNNs) for different audio classification tasks (bird, bat, and whale audio datasets). Six different CNNs were tested, compared, and combined. Moreover, a further CNN, trained from scratch, was tested and combined with the fine-tuned CNNs. To the best of our knowledge, this is the largest study on CNNs in animal audio classification. Our results show that several CNNs can be fine-tuned and fused for robust and generalizable audio classification. Finally, the ensemble of CNNs is combined with handcrafted texture descriptors obtained from spectrograms for further improvement of performance. The MATLAB code used in our experiments will be provided to other researchers for future comparisons at https://github.com/LorisNanni.

Department(s)

Information Technology and Cybersecurity

Document Type

Article

DOI

https://doi.org/10.1186/s13636-020-00175-3

Rights Information

© 2020 the Authors. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original authors and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

Keywords

Audio classification, Deep learning, Ensemble of classifiers, Handcrafted features, Pattern recognition, Texture

Publication Date

12-1-2020

Recommended Citation

Nanni, Loris, Yandre MG Costa, Rafael L. Aguiar, Rafael B. Mangolin, Sheryl Brahnam, and Carlos N. Silla. "Ensemble of convolutional neural networks to improve animal audio classification." EURASIP Journal on Audio, Speech, and Music Processing 2020 (2020): 1-14.

Journal Title

EURASIP Journal on Audio, Speech, and Music Processing
