Combining visual and acoustic features for bird species classification


This paper describes a novel approach to automatic bird species classification. The proposed strategy is based on features extracted from the textural content of spectrogram images of bird vocalizations. We show how several texture descriptors can be used to represent the spectrograms. The following approaches are tested here on spectrograms for the first time: Local Ternary Phase Quantization, Heterogeneous Auto-Similarities of Characteristics, and an ensemble of variants of Local Binary Pattern Histogram Fourier. Combining this set of descriptors greatly increases classification performance and markedly improves on previous ensembles of texture descriptors used to describe spectrograms. A further improvement is obtained when the texture descriptors are combined with acoustic features. SVM classifiers are used in the classification step, and final results are computed using 10-fold cross-validation. For a fair comparison with other methods in the literature, the experiments are performed on a benchmark database of 46 bird species used for this classification task. The best accuracy obtained is about 94.5%. The MATLAB code and the database used in the experiments are publicly available to other researchers for future comparisons.
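
To make the idea of a texture descriptor computed on a spectrogram concrete, the following is a minimal sketch of the basic 3x3 Local Binary Pattern histogram, written in plain Python. It is not the authors' MATLAB code and it is not one of the descriptors evaluated in the paper (those are more elaborate, such as the Local Binary Pattern Histogram Fourier variants). The function name `lbp_histogram`, the neighbor ordering, and the toy input are assumptions made for illustration only.

```python
def lbp_histogram(image):
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a list of equal-length rows of numbers, for example
    spectrogram magnitudes. Border pixels are skipped because they
    lack a full 8-pixel neighborhood.
    """
    # Neighbor offsets (row, col), clockwise from the top-left pixel.
    # This ordering is an illustrative choice, not taken from the paper.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as large as the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    # Normalize so that patches of different sizes are comparable.
    return [h / total for h in hist] if total else hist


# Hypothetical 3x3 "spectrogram" patch with a single interior pixel.
toy = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
hist = lbp_histogram(toy)
```

In a pipeline like the one the abstract outlines, a histogram of this kind would serve as the feature vector passed to a classifier such as an SVM. How the texture and acoustic features are actually combined is not stated in the abstract, so this sketch leaves that step out.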


Information Technology and Cybersecurity

Document Type

Conference Proceeding



Publication Date


Journal Title

Proceedings - 2016 IEEE 28th International Conference on Tools with Artificial Intelligence, ICTAI 2016