Rethinking environmental sound classification using convolutional neural networks: optimized parameter tuning of single feature extraction
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Springer Nature, 2021 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/90215/7/90215_Rethinking%20environmental%20sound%20classification%20using%20convolutional%20neural%20networks_SCOPUS.pdf http://irep.iium.edu.my/90215/8/90215_Rethinking%20environmental%20sound%20classification%20using%20convolutional%20neural%20networks.pdf http://irep.iium.edu.my/90215/ https://link.springer.com/article/10.1007/s00521-021-06091-7 https://doi.org/10.1007/s00521-021-06091-7 |
Summary: | The classification of environmental sounds is important for emerging applications such as automatic audio surveillance, audio forensics, and robot navigation. Existing techniques combine multiple features and stack many CNN layers (very deep learning) to reach the desired accuracy. Instead of using many features and going deeper by stacking resource-intensive layers, this paper proposes a novel technique that uses only a single feature, namely the Mel-Frequency Cepstral Coefficients (MFCC), and just three CNN layers. We demonstrate that such a simple network can considerably outperform several conventional and deep learning-based algorithms. Through careful empirical fine-tuning of the data-input parameters, we report a model that is significantly less complex in architecture yet achieves an accuracy of 95.59% on the UrbanSound8k dataset, similar to state-of-the-art deep models. We conjecture that our accurate, lightweight model is an excellent environmental sound recognizer for resource-constrained embedded platforms. |
---|