All Issue

2021 Vol.40, Issue 5 Preview Page

Research Article

30 September 2021. pp. 479-487
Abstract
References
1
T. Virtanen, M. D. Plumbley, and D. Ellis, Computational Analysis of Sound Scenes and Events (Springer, Heidelberg, 2018), Chap. 1. 10.1007/978-3-319-63450-0
2
J. P. Bello, C. Silva, O. Nov, R. L. Dubois, A. Arora, J. Salamon, C. Mydlarz, and H. Doraiswamy, "SONYC: A system for monitoring, analyzing and mitigating urban noise pollution," Commun. ACM. 62, 68-77 (2019). 10.1145/3224204
3
K. Drossos, S. Adavanne, and T. Virtanen, "Automated audio captioning with recurrent neural networks," Proc. IEEE WASPAA. 374-378 (2017). 10.1109/WASPAA.2017.8170058
4
Y. Zigel, D. Litvak, and I. Gannot, "A method for automatic fall detection of elderly people using floor vibrations and sound -Proof of concept on human mimicking doll falls," IEEE Trans. Biomed. Eng. 56, 2858-2867 (2009). 10.1109/TBME.2009.203017119709955
5
A. Temko and C. Nadeu, "Acoustic event detection in meeting-room environments," Pattern Recognit. Lett. 30, 1281-1288 (2009). 10.1016/j.patrec.2009.06.009
6
A. Mesaros, T. Heittola, A. Eronen, and T. Virtanen, "Acoustic event detection in real life recordings," Proc. EUSIPCO. 1267-1271 (2010).
7
T. Heittola, A. Mesaros, A. Eronen, and T. Virtanen, "Context-dependent sound event detection," EURASIP J. Audio, Speech, and Music Process. 2013, 1-13 (2013). 10.1186/1687-4722-2013-1
8
E. Cakir, T. Heittola, H. Huttunen, and T. Virtanen, "Polyphonic sound event detection using multi label deep neural networks," Proc. IJCNN. 1-7 (2015). 10.1109/IJCNN.2015.728062424356781
9
H. Zhang, I. McLoughlin, and Y. Song, "Robust sound event recognition using convolutional neural networks," Proc. IEEE ICASSP. 559-563 (2015). 10.1109/ICASSP.2015.7178031
10
H. Phan, L. Hertel, M. Maass, and A. Mertins, "Robust audio event recognition with 1-max pooling convolutional neural networks," Proc. Interspeech, 3653-3657 (2016). 10.21437/Interspeech.2016-123
11
G. Parascandolo, H. Huttunen, and T. Virtanen, "Recurrent neural networks for polyphonic sound event detection in real life recordings," Proc. IEEE ICASSP. 6440-6444 (2016). 10.1109/ICASSP.2016.7472917
12
E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen, and T. Virtanen, "Convolutional recurrent neural networks for polyphonic sound event detection," IEEE/ACM Trans. on Audio, Speech, Lang. Process. 25, 1291-1303 (2017). 10.1109/TASLP.2017.2690575
13
S. Adavanne, P. Pertila, and T. Virtanen, "Sound event detection using spatial features and convolutional recurrent neural network," Proc. IEEE ICASSP. 771- 775 (2017). 10.1109/ICASSP.2017.7952260
14
N. Turpault, R. Serizel, A. Shah, and J. Salamon, "Sound event detection in domestic environments with weakly labeled data and soundscape synthesis," Proc. Workshop on DCASE. 253-257 (2019). 10.33682/006b-jx26
15
N. Turpault, R. Serizel, S. Wisdom, H. Erdogan, J. R. Hershey, E. Fonseca, P. Seetharaman, and J. Salamon, "Sound event detection and separation: A benchmark on DESED synthetic soundscapes," Proc. IEEE ICASSP. 840-844 (2021). 10.1109/ICASSP39728.2021.9414789
16
D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange, and M. D. Plumbley, "Detection and classification of acoustic scenes and events," IEEE Trans. Multimedia, 17, 1733-1746 (2015). 10.1109/TMM.2015.2428998
17
N. K. Kim and H. K. Kim, "Polyphonic sound event detection based on residual convolutional recurrent neural network with semi-supervised loss function," IEEE Access, 9, 7564-7575 (2021). 10.1109/ACCESS.2020.3048675
18
Q. Xie, M.-T. Luong, E. Hovy, and Q. V. Le, "Self- training with noisy student improves ImageNet classification," Proc. IEEE/CVF CVPR. 10687-10698 (2020). 10.1109/CVPR42600.2020.01070
19
D. S. Park, W. Chan, Y. Zhang, C.-C. Chiu, B. Zoph, E. D. Cubuk, and Q. V. Le, "Specaugment: A simple data augmentation method for automatic speech recognition," arXiv preprint, arXiv:1904.08779 (2019). 10.21437/Interspeech.2019-2680
20
H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez- Paz, "Mixup: Beyond empirical risk minimization," arXiv preprint, arXiv:1710.09412 (2017).
21
L. Delphin-Poulat and C. Plapous, "Mean teacher with data augmentation for DCASE 2019 Task 4," DCASE 2019 Challenge, Tech. Rep., 2019.
22
S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," Proc. ECCV. 3-19 (2018). 10.1007/978-3-030-01234-2_1
23
A. Mesaros, T. Heittola, and T. Virtanen, "Metrics for polyphonic sound event detection," Appl. Sci. 6, 162- 178 (2016). 10.3390/app6060162
24
K. Miyazaki, T. Komatsu, T. Hayashi, S. Watanabe, T. Toda, and K. Takeda "Convolution augmented transformer for semi-supervised sound event detection," Proc. Workshop on DCASE. 100-104 (2020).
Information
  • Publisher :The Acoustical Society Of Korea
  • Publisher(Ko) :한국음향학회
  • Journal Title :The Journal of the Acoustical Society of Korea
  • Journal Title(Ko) :한국음향학회지
  • Volume : 40
  • No :5
  • Pages :479-487
  • Received Date :2021. 07. 16
  • Accepted Date : 2021. 08. 17