DEMON style neural networks front-end features for passive sonar classification

Sangmin Lee; Jaeyoung Hwang; Yoonchang Han; Donmoon Lee; Do Kyung Shin; Seung Hwan Kim; Young Dae Kim

doi:10.7776/ASK.2025.44.2.085

2025 Vol.44, Issue 2

Research Article

31 March 2025. pp. 85-93

Abstract

This study proposes a novel neural network front-end feature grounded in conventional sonar signal processing. It simplifies extraction of the Detection of Envelope Modulation on Noise (DEMON)gram, a representation used in passive sonar signal processing, by implementing it as two consecutive Short-Time Fourier Transform (STFT) operations. This converts the 1-dimensional sonar signal into a 2-dimensional feature that effectively captures the frequency modulation characteristics of cavitation generated by propellers. When combined with the Mel spectrogram-based features conventionally used in audio classification, this DEMONgram-based front-end feature yields higher classification performance. Experimental results on the ShipsEar dataset show that the proposed method achieves an accuracy of 81.0%, an improvement of 5.8 percentage points over conventional Mel spectrogram-based features, demonstrating its effectiveness for passive sonar signal classification.
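The two-stage STFT idea described in the abstract can be illustrated with a minimal NumPy sketch. It is one plausible reading, not the authors' implementation: the abstract does not say how the output of the first STFT is fed to the second, so averaging the magnitude over frequency to form a broadband envelope is an assumption, as are the function names `stft_magnitude` and `demongram`, every parameter value, and the synthetic test signal (white noise amplitude-modulated at 10 Hz as a stand-in for propeller cavitation noise).

```python
import numpy as np


def stft_magnitude(x, n_fft, hop):
    """Magnitude STFT of a 1-D signal; returns shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T


def demongram(x, sr, n_fft1=256, hop1=64, n_fft2=128, hop2=32):
    """Hypothetical two-stage STFT sketch of a DEMONgram (illustrative parameters)."""
    # Stage 1: first STFT. Averaging its magnitude over frequency (an assumed step)
    # gives a broadband envelope sampled at the frame rate sr / hop1.
    spec = stft_magnitude(x, n_fft1, hop1)
    envelope = spec.mean(axis=0)
    envelope = envelope - envelope.mean()  # remove DC so it does not mask modulation lines
    # Stage 2: second STFT over the envelope gives modulation frequency versus time.
    env_rate = sr / hop1
    demon = stft_magnitude(envelope, n_fft2, hop2)
    mod_freqs = np.fft.rfftfreq(n_fft2, d=1.0 / env_rate)
    return demon, mod_freqs


# Usage on synthetic data: 8 s of white noise amplitude-modulated at 10 Hz.
sr = 8000
t = np.arange(8 * sr) / sr
rng = np.random.default_rng(0)
x = (1.0 + 0.8 * np.cos(2 * np.pi * 10.0 * t)) * rng.standard_normal(t.size)
demon, mod_freqs = demongram(x, sr)
# Strongest modulation line, ignoring the DC bin.
peak_hz = mod_freqs[1 + np.argmax(demon[1:].mean(axis=1))]
print(demon.shape, round(float(peak_hz), 2))
```

The output is a 2-dimensional array (modulation frequency by time) that could be passed to a convolutional classifier in the same way as a Mel spectrogram. With these settings the modulation-frequency resolution is roughly 1 Hz, so the strongest line should fall in the bin nearest 10 Hz.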

Keywords

Passive sonar

Neural networks

Detection of Envelope Modulation on Noise (DEMON) analysis

Vessel classification

Information

Publisher: The Acoustical Society of Korea
Publisher (Ko): 한국음향학회
Journal Title: The Journal of the Acoustical Society of Korea
Journal Title (Ko): 한국음향학회지
Volume: 44
No: 2
Pages: 85-93
Received Date: 2024-11-20
Revised Date: 2025-01-03
Accepted Date: 2025-02-17
DOI: https://doi.org/10.7776/ASK.2025.44.2.085

The Journal of the Acoustical Society of Korea, ISSN: 1225-4428 (Print), 2287-3775 (Online), 한국음향학회
