dc.description.abstract |
This project investigates deep learning algorithms for enhancing noisy audio in cases
where traditional digital signal processing (DSP) techniques fail. We also classified
enhanced industrial sounds and compared the results with those obtained on
non-enhanced industrial audio.
For sound enhancement, we used the magnitude spectrum of the audio. To account for
temporal and spatial features, we investigated four deep learning architectures on
speech datasets in order to select the most suitable one for enhancing industrial
sounds. The architectures included feed-forward, convolutional, and recurrent neural
networks. We trained the models on pairs of noisy and clean recordings.
The trained model acted as a filter for background noise. To assess enhancement
performance, we measured noise reduction, speech distortion, and perceptual estimates of
speech quality. The experimental results show that convolutional and recurrent
layers improved the performance of the models.
For the classification of audio clips, we used Mel spectrogram features.
We investigated several deep learning architectures for this task: fully
convolutional neural networks, as well as ResNet50 and EfficientNet applied through
transfer learning. To measure model performance, we used precision, recall, and
F1-score. The experimental results showed that, for most architectures, classification
of enhanced audio did not improve on the results obtained with non-enhanced audio. |
en_US |