Underwater Detection: A Brief Survey and a New Multitask Dataset
Yu Wei 1,2, Yi Wang 1,*, Baofeng Zhu 1, Chi Lin 1, Dan Wu 1, Xinwei Xue 1, and Ruili Wang 3,4
1 School of Software Technology, Dalian University of Technology, Dalian 116620, China
2 Harbin Boiler Co., Ltd, Harbin, 150000, China
3 School of Mathematical and Computational Sciences, Massey University, Auckland 0632, New Zealand
4 School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
* Correspondence: dlutwangyi@dlut.edu.cn
Received: 27 June 2023
Accepted: 25 April 2024
Published: 25 December 2024
Abstract: Underwater detection poses significant challenges due to the unique characteristics of the underwater environment, such as light attenuation, scattering, water turbidity, and the presence of small or camouflaged objects. To gain a clearer understanding of these challenges, we first review two common detection tasks: object detection (OD) and salient object detection (SOD). Next, we examine the difficulties of adapting existing OD and SOD techniques to underwater settings. Additionally, we introduce a new Underwater Object Multitask (UOMT) dataset, complete with benchmarks. This survey, along with the proposed dataset, aims to provide valuable resources to researchers and practitioners to develop more effective techniques to address the challenges of underwater detection. The UOMT dataset and benchmarks are available at https://github.com/yiwangtz/UOMT.
Keywords: object detection; salient object detection; underwater image enhancement; underwater dataset

1. Introduction
Marine resources are extremely valuable to humans. Underwater detection has broad applications in many areas, such as oceanography, marine navigation, and fish farming [1]. However, the complexities of underwater environments and ecosystems present formidable obstacles to effective underwater detection. For example, underwater scenes exhibit considerable variability in lighting conditions, depending on factors such as depth and water quality [2−3]. Diverse organisms are characterized by intricate and multifaceted shapes, and some have evolved camouflage mechanisms. Furthermore, these organisms are often obscured or partially buried by sand and gravel [4]. Addressing these challenges therefore calls for the integration of related technologies in computer vision and artificial intelligence.
In this context, Object Detection (OD) [5−7] and Salient Object Detection (SOD) [8−10] are crucial to understanding and analyzing underwater scenes. OD is designed to detect and classify objects accurately while providing their precise spatial locations. SOD focuses on the localization and segmentation of salient objects that conform to the human visual system (HVS) [11]. Several studies and surveys have been conducted in the fields of underwater OD and SOD, highlighting the significance of these tasks in underwater image analysis [12−19].
Our work begins with a brief overview of recent breakthroughs and advances within the OD and SOD fields. Different from existing surveys on these two topics [5−10, 12−14, 20−21], we classify OD and SOD models into two primary categories: (i) Convolutional Neural Networks (CNNs) [22] based approaches and (ii) Transformer [23] based approaches. Furthermore, we delve into the unique challenges posed by underwater detection tasks, as well as the strengths and weaknesses of these approaches.
A comprehensive dataset entitled Underwater Object Multitask (UOMT) is also proposed, which contains various underwater scenes with more than 7K instances encompassing three organism types. UOMT provides COCO-format [24] labels for OD and binary segmentation masks for SOD, as illustrated in Figure 1. Evaluations of state-of-the-art OD and SOD models are conducted on the UOMT dataset.
The rest of this article is organized as follows. Section II briefly reviews OD methods. Section III briefly reviews SOD methods. Section IV explains the challenges of underwater OD and SOD. Section V describes the evaluation metrics for OD and SOD. Section VI describes the proposed UOMT dataset. Section VII presents OD and SOD benchmarks on the UOMT dataset through quantitative and qualitative experiments. Section VIII discusses the future development of underwater OD and SOD. Section IX summarizes the main points of this work.
2. Review of Object Detection (OD)
Object detection (OD) endeavors to accurately detect and classify objects, denoting their positions by rectangular boxes [25]. Object detection encompasses various applications, including face detection, instance segmentation, autonomous driving, surveillance systems, and sports analytics [5].
In recent years, many object detection surveys have been conducted. In 2019, Jiao et al. [5] conducted a comparative analysis of different deep learning-based OD methods; they also introduced some commonly used public datasets, analyzed their characteristics, and highlighted their strengths and weaknesses. In 2020, Wu et al. [6] summarized techniques useful in OD, such as attention-based models [26−28], end-to-end models [29−30], depthwise separable convolutions [31−32], etc. This survey also presented challenges and requirements for practical applications, including detecting small objects, object tracking, real-time performance, and other related topics. Li et al. [7] extensively explored various aspects of OD in images captured by optical sensors or satellites; the survey also proposed an optical remote sensing image dataset with benchmarks. Padilla et al. [12] provided a detailed and comprehensive exposition of the performance evaluation metrics commonly used in object detection, covering the principles, advantages, and disadvantages of different indicators and their applicability to different application scenarios. In 2022, Cheng et al. [13] provided a comprehensive study and summary of the problem of small object detection. In 2023, Zou et al. [14] systematically reviewed the development of OD over the past 20 years, offering valuable insights into the field.
Our review of OD categorizes its methods into CNN-based and Transformer-based models. Table 1 lists the models discussed in this section.
No. | Year | Method | Backbone | Stage | Anchor |
1 | 2014 | R-CNN [33] | AlexNet | Two-Stage | AB |
2 | 2014 | SPP-Net [34] | ZF-5 | Two-Stage | AB |
3 | 2015 | Fast R-CNN [26] | VGG-16 | Two-Stage | AB |
4 | 2015 | Faster R-CNN [27] | VGG-16 | Two-Stage | AB |
5 | 2016 | R-FCN [35] | ResNet-101 | Two-Stage | AB |
6 | 2017 | Mask-RCNN [28] | ResNeXt-101 | Two-Stage | AB |
7 | 2015 | YOLOv1 [36] | GoogleNet | One-Stage | AB |
8 | 2016 | SSD [32] | VGG-16 | One-Stage | AB |
9 | 2017 | YOLOv2 [37] | DarkNet-19 | One-Stage | AB |
10 | 2018 | RefineDet [38] | VGG-16 | One-Stage | AB |
11 | 2018 | YOLOv3 [39] | DarkNet-53 | One-Stage | AB |
12 | 2019 | EfficientDet [31] | Efficient-B2 | One-Stage | AB |
13 | 2020 | YOLOv4 [40] | CSPDarkNet-53 | One-Stage | AB |
14 | 2021 | YOLOv5 [41] | CSPDarkNet-53 | One-Stage | AB |
15 | 2021 | YOLOF [42] | ResNet-101 | One-Stage | AB |
16 | 2021 | PP-YOLOv2 [43] | ResNet-101 | One-Stage | AB |
17 | 2021 | Deformable-DETR [44] | Transformer | One-Stage | AB |
18 | 2022 | YOLOv7 [45] | ELAN | One-Stage | AB |
19 | 2018 | CornerNet [46] | Hourglass-104 | One-Stage | AF |
20 | 2020 | CircleNet [47] | ResNet-50 | One-Stage | AF |
21 | 2021 | YOLOX [48] | DarkNet-53 | One-Stage | AF |
22 | 2022 | YOLOv6 [49] | EfficientRep | One-Stage | AF |
23 | 2022 | PP-YOLOE [50] | CSPResNet | One-Stage | AF |
24 | 2023 | YOLOv8 [29] | CSPDarkNet | One-Stage | AF |
25 | 2022 | DAB-DETR [51] | Transformer | One-Stage | AB |
26 | 2022 | DINO [52] | SwinL/ResNet50 | One-Stage | AB |
27 | 2023 | Mask DINO [53] | SwinL/ResNet50 | One-Stage | AB |
28 | 2023 | Grounding DINO [54] | Transformer | One-Stage | AB |
29 | 2020 | DETR [55] | Transformer | One-Stage | AF |
30 | 2021 | YOLOS [56] | Transformer | One-Stage | AF |
31 | 2021 | YOLOR [57] | CSPDarkNet-53 | One-Stage | AF |
32 | 2022 | Detic [58] | Swin Transformer | One-Stage | AF |
33 | 2022 | DN-DETR [59] | Transformer | One-Stage | AF |
2.1. CNN-based Object Detection Models
Convolutional Neural Networks (CNNs) [22] have had a profound influence on object detection [14] through mechanisms such as ReLU activation, Dropout, anchors, and GPU acceleration. Among them, the anchor mechanism [27] is a key technique in object detection. In the following, we discuss CNN-based OD models in two groups: anchor-based detectors and anchor-free detectors.
1) Anchor-Based Approaches: The anchor mechanism [27] first generates a series of predefined bounding boxes (anchors) at different locations in an image; each anchor is then matched to a ground-truth object during detection. Many OD models have since been built on anchors. These methods can be further categorized into two-stage and one-stage detection approaches.
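To make the mechanism concrete, the following is a minimal NumPy sketch of anchor generation over a feature-map grid; the stride, scales, and aspect ratios are illustrative values, not settings taken from any of the surveyed detectors.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Generate feat_h * feat_w * len(scales) * len(ratios) anchors
    in (x1, y1, x2, y2) image coordinates."""
    # Base anchor shapes (w, h) for every scale/ratio combination.
    shapes = []
    for s in scales:
        for r in ratios:
            shapes.append((s * np.sqrt(1.0 / r), s * np.sqrt(r)))
    shapes = np.array(shapes)                      # (A, 2)

    # Anchor centers: one per feature-map cell, mapped back to the image.
    ys, xs = np.meshgrid(np.arange(feat_h), np.arange(feat_w), indexing="ij")
    centers = np.stack([(xs + 0.5) * stride, (ys + 0.5) * stride], axis=-1)
    centers = centers.reshape(-1, 1, 2)            # (H*W, 1, 2)

    half = shapes.reshape(1, -1, 2) / 2.0          # (1, A, 2)
    anchors = np.concatenate([centers - half, centers + half], axis=-1)
    return anchors.reshape(-1, 4)                  # (H*W*A, 4)

anchors = generate_anchors(feat_h=38, feat_w=50)   # e.g., a 600x800 input, stride 16
print(anchors.shape)                               # (17100, 4)
```

During training, each anchor is typically assigned to the ground-truth box with which it has the highest IoU, which is exactly the matching step described above.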
Two-Stage Detectors: Girshick et al. [33] proposed R-CNN in 2014, which revolutionized OD by combining region proposals with CNN features. R-CNN converts OD into a two-step process, candidate region extraction and classification, and exhibits remarkable progress compared to conventional OD methods. He et al. [34] introduced SPP-Net, a significant OD breakthrough. SPP-Net converts an input image of arbitrary dimensions into a feature map of consistent size, enabling holistic image comprehension; nonetheless, it cannot be trained end-to-end. In 2015, Girshick introduced Fast R-CNN [26], and Ren et al. proposed Faster R-CNN [27]. Fast R-CNN uses the entire image as input, avoiding redundant feature computations, and employs a multitask loss function to integrate classification and regression tasks, improving OD efficiency and accuracy. However, it still relies on a region proposal algorithm to generate candidate regions, which limits speed and compromises the accuracy of the selected regions. Faster R-CNN therefore generates candidate regions with a Region Proposal Network (RPN) instead of a selective search algorithm. Dai et al. [35] proposed R-FCN, which replaces the region of interest (ROI) pooling operation with a fully convolutional network and can be trained end-to-end. In 2017, Mask R-CNN [28] extended Faster R-CNN with a mask prediction branch, further improving detection accuracy.
One-Stage Detectors: In 2015, Redmon et al. [36] proposed YOLOv1, a one-stage detector with faster speed and better real-time performance than two-stage detectors. This method divides an image into several grids, each predicting bounding boxes and category probabilities. However, it has some disadvantages, such as lower detection accuracy, poor detection of small objects, and difficulty with densely packed or occluded objects. In 2016, Liu et al. [32] proposed SSD, which completes the detection process in a single forward pass. In 2017, YOLOv2 [37] was introduced, improving YOLOv1 with increased detection speed and accuracy. Various excellent methods emerged after 2017. RefineDet [38] and YOLOv3 [39] were proposed in 2018. RefineDet is the first real-time method to achieve detection accuracy greater than 80% on PASCAL VOC 2007 [60]. In YOLOv3, performance was further enhanced by a more efficient backbone network, multiple anchors, and spatial pyramid pooling. In 2019, Tan et al. [31] introduced EfficientDet, incorporating a compound scaling technique to improve detection accuracy and computational efficiency.
In 2020, Bochkovskiy et al. [40] proposed YOLOv4, and YOLOv5 [41] was released. YOLOv4 introduced mosaic data augmentation, the CSPDarkNet-53 backbone, and an improved bounding-box regression loss. YOLOv5 features hyperparameter optimization, integrated experiment tracking, and automatic export to popular formats.
In 2021, YOLOF [42] was proposed, which used a Dilated Encoder and Uniform Matching that improved detection speed and accuracy without FPN [61]. PP-YOLOv2 [43] is an efficient real-time object detection method that uses a compact foundational architecture. It introduced an adaptive weighted loss function and adaptive label smoothing techniques to better handle objects of different sizes and difficulties. PP-YOLOv2 significantly improved detection performance while maintaining faster speed and practicality.
In 2022, YOLOv7 [45] was introduced using the Extended Efficient Layer Aggregation Network (E-ELAN). YOLOv7 also added auxiliary tasks, such as pose estimation, and surpassed the previous state of the art for real-time applications. DAB-DETR [51] uses box coordinates as queries in the Transformer decoder and performs soft ROI pooling.
Anchor-based object detection approaches have certain limitations that can affect training and inference. Objects with unusual sizes or aspect ratios may be missed or falsely detected because the predefined anchors cannot cover them well. In addition, overlapping anchors can yield multiple overlapping detection boxes, increasing post-processing difficulty and computational effort.
2) Anchor-Free Approaches: In 2018, the concept of anchor-free detection was introduced [5], eliminating the need for predefined bounding boxes. This makes models simpler, faster to train and infer, and more adaptable to object shapes and sizes. Law and Deng [62] proposed CornerNet in 2018, a CNN-based anchor-free detector that detects an object's bounding box as a pair of keypoints, the top-left and bottom-right corners. CornerNet also proposed an effective corner pooling operation that captures boundary information. In 2019, NAS-FPN [61] introduced a Neural Architecture Search (NAS) technique that automatically searches the structure of a neural network to obtain a better feature pyramid network, combining the advantages of NAS and FPN to find the optimal feature pyramid structure. That year, it achieved first place in the COCO Object Detection Challenge [24]. In 2020, CircleNet [47] was developed, which treats an object as a circular region and detects it by predicting the center and radius of the circle; CircleNet detects spherical biomedical objects accurately. In 2021, YOLOX [48] was proposed as an anchor-free version of YOLO, with a simpler design but better performance. YOLOv6 [49] was developed in 2022 and has been used in many autonomous delivery robots. YOLOv8 [29] improves on YOLOv5 by decoupling the classification and regression heads and using Distribution Focal Loss.
Anchor-free methods, however, require more training data and longer training times to achieve the same accuracy as anchor-based detectors, and cluttered images may further hinder Anchor-Free Detectors.
2.2. Transformer-based Object Detection Models
In 2020, DETR [55] (DEtection TRansformer) first applied the Transformer [23] to object detection. Unlike traditional object detection methods, DETR needs neither prior boxes nor anchor points; instead, it directly predicts the positions and classes of objects in the image by casting object detection as a set prediction problem solved with bipartite matching. YOLOR [57] integrates explicit and implicit knowledge in a unified network to learn a general representation capable of performing multiple tasks without increasing inference cost. YOLOS [56] adapts a plain Vision Transformer to detection, showing that object detection can be performed without convolutional layers.
In 2022, Detic [58] proposed using image classification datasets to train the classification head of the detector. A strong end-to-end object detector, DINO [52], was developed based on DN-DETR [59], DAB-DETR [51], and Deformable-DETR [44]. DINO improves over previous DETR-like models in performance and efficiency by using contrastive denoising training, a look-forward-twice scheme for box prediction, and mixed query selection for anchor initialization.
In 2023, a unified object detection and segmentation framework, Mask DINO [53], was developed. Mask DINO extends DINO by adding a mask prediction branch that supports all image segmentation tasks (instance, panoptic, and semantic). In the same year, another improved DINO model, Grounding DINO [54], was proposed; it can detect specified targets based on text descriptions.
3. Review of Salient Object Detection (SOD)
Salient object detection (SOD) [87] is a computer vision task that aims to identify visually distinctive objects or regions within an image. This task is crucial in various applications, including image editing, visual tracking, image retrieval, and scene understanding [9−10].
Several surveys have focused on salient object detection in the last few years. For example, Borji et al. [8] reviewed CNN-based SOD methods in 2019. In 2020, Kumar et al. [9] investigated weakly supervised/pseudo-supervised and adversarial learning approaches. In 2021, Zhou et al. [88] reviewed RGB-D-based SOD models, as well as related benchmark datasets, covering both traditional and deep learning methods. In 2022, Fu et al. [20] provided a review and benchmarks for light-field SOD. In the same year, Zhou et al. [21] summarized the latest SOD models and pointed out that different implementation details may affect performance.
In this section, our review focuses on SOD using CNNs and Transformers. Table 2 shows the models presented in this section.
No. | Year | Method | Backbone | Network Architecture | Features |
1 | 2016 | ICANet [63] | - | - | Using the locate-by-exemplar strategy. |
2 | 2016 | ELD [64] | VGG-16 | Encoder-decoder | Combination of high- and low-level features. |
3 | 2017 | FIN [65] | VGG-16 | - | The first weakly-supervised learning SOD model. |
4 | 2018 | PAGR [66] | VGG-19 | - | Adding attention mechanisms to the network. |
5 | 2019 | CPD [67] | VGG-16 | Encoder-decoder | A new cascaded partial decoder is proposed. |
6 | 2019 | PoolNet [68] | VGG-16/ResNet-50 | Encoder-decoder | Pooling-based techniques to supplement advanced semantic information. |
7 | 2019 | BASNet [69] | ResNet-34 | Encoder-decoder | Prediction-refinement architecture and new hybrid loss function. |
8 | 2019 | EGNet [70] | VGG-16/ResNet-50 | Encoder-decoder | Making full use of edge information. |
9 | 2020 | LDF [71] | ResNet-50 | Encoder-decoder | A label decoupling framework is proposed. |
10 | 2020 | MINet [72] | VGG-16/ResNet-50 | Encoder-decoder | Aggregate interaction modules are proposed. |
11 | 2020 | UCNet [73] | VGG-16 | Encoder-decoder | Learning from the data annotation process to use uncertainty for detection. |
12 | 2020 | SAC [74] | ResNet-101 | - | Integration of local and global image contexts within, around, and outside of salient objects. |
13 | 2021 | PA-KRN [75] | ResNet-50 | Encoder-decoder | A progressive strategy to simulate the restoration mechanism of the human visual system. |
14 | 2021 | SGL-KRN [75] | ResNet-50 | Encoder-decoder | Efficient and lightweight PA-KRN. |
15 | 2021 | HQSOD [76] | ResNet-50 | Encoder-decoder | SOD in high resolution scenes. |
16 | 2021 | DCN [77] | ResNet-50 | Encoder-decoder | A multitasking network. |
17 | 2022 | EDN [78] | ResNet-50 | Encoder-decoder | Extreme downsampling to locate and segment objects. |
18 | 2022 | TNet [79] | VGG16/ResNet50 | Encoder-decoder | SOD by using thermal infrared images. |
19 | 2022 | TRACER [80] | ResNet50+EfficientNet | Encoder-decoder | Attention-guided tracing modules. |
20 | 2023 | MENet [81] | ResNet-50 | Encoder-decoder | Multi-enhancement and iterative refinement. |
21 | 2021 | SwinNet [82] | Swin | Encoder-decoder | A new cross-modal fusion model is proposed. |
22 | 2021 | EBMG [83] | ViT/U-Net | Encoder-decoder | Energy-based latent variable prior models and a generative vision Transformer network. |
23 | 2022 | SelfReformer [84] | PVT | Encoder-decoder | Combines a vision Transformer with a self-refinement mechanism. |
24 | 2022 | PGNet [85] | ResNet+Swin | Encoder-decoder | Combines CNN and Transformer backbones and grafts features from the Transformer branch onto the CNN branch. |
25 | 2023 | ICON-P [86] | PVT | Encoder-decoder | Integrity cognition network based on integrity learning. |
26 | 2023 | ICON-S [86] | Swin | Encoder-decoder | Integrity cognition network based on integrity learning. |
3.1. CNN-based SOD Approaches
Deep learning was first applied to SOD in 2014 [89]. In many SOD models, CNNs are used to extract high-level features from images and combine local and global contextual information. These methods capture the semantic information and salient features of complex scenes more effectively than traditional methods. In 2015, FCN [90] revolutionized the field with pixel-level semantic segmentation, and many SOD models have since been built on FCNs. Liu et al. [91] used multiscale deep features to express saliency comparisons and prior knowledge. He et al. [63] developed an exemplar-driven top-down saliency detection model that employs a deep association network to learn similarities between exemplars and images. Lee et al. [64] used encoded distance maps, high-level features, and an encoder-decoder structure to generate saliency maps. Wang et al. [65] proposed a weakly supervised SOD model, which reduces the cost of manual annotation. Zhang et al. [66] proposed a progressive attention-guided mechanism to improve prediction quality.
Many excellent SOD algorithms have been proposed since 2019. Most of them employ VGG [92] or ResNet [93] as the backbone network, such as CPD [67], PoolNet [68], BASNet [69], EGNet [70], etc. CPD [67] employs a bidirectional feature pyramid network and a feedback optimization module to generate high-quality saliency maps. PoolNet [68] proposed a Global Guidance Module (GGM) and a Feature Aggregation Module (FAM) based on pooling techniques that combine features from multiple layers at different scales; contextual information about the image and edge information are also considered. BASNet [69] is a boundary-aware method that uses a prediction-refinement architecture and a hybrid loss function to improve the precision of boundary delineation in saliency maps. EGNet [70] extracts the features of salient objects through progressive fusion, integrating local edge information with global location information and utilizing complementary features to detect objects at various resolutions.
The representative models proposed in 2020 were LDF [71], MINet [72], and UCNet [73]. LDF [71] decouples the ground-truth (GT) map into a body map and a detail map, and then learns features for the interior and boundary regions separately in a two-branch decoder. UCNet [73] improves the generalization and robustness of the model by introducing latent variables to represent the uncertainty of the input data. SAC [74] adaptively propagates and aggregates image context features with different attenuation factors through a spatially attenuated context (SAC) module and attention mechanisms.
The representative models proposed in 2021 were PA-KRN [75], HQSOD [76], and DCN [77]. PA-KRN [75] introduced a progressive approach that replicates the restoration process of the human visual system, combining a coarse localization module with a fine segmentation module. HQSOD [76] extended SOD to high-resolution images with a low-resolution saliency classification network (LRSCN) that captures semantics at low resolution and a high-resolution refinement network (HRRN) that refines the saliency values of pixels in uncertain regions. DCN [77] first proposed a multitask network to predict saliency maps, edges, and skeleton maps simultaneously, and then designed cross-task and cross-layer aggregation modules to integrate multi-level, multi-task features into the final results.
In 2022, EDN [78] effectively exploited multiscale learning while improving the collaboration between high- and low-level features. TNet [79] used thermal infrared images to obtain effective object localization and integrity information for the RGB decoder features by controlling the interaction between RGB and thermal images. Using attention-guided tracing modules, Lee et al. [80] proposed TRACER, which eliminates multi-decoder structures and minimizes the number of learnable parameters.
In 2023, Zhuge et al. [86] proposed ICON-R, the ResNet-based variant of ICON, which extracts multiscale feature maps and strengthens prediction integrity through feature aggregation and integrity enhancement modules. Wang et al. [81] proposed MENet, which progressively enhances the perception of complex targets from the perspective of pixels, regions, and objects in images.
3.2. Transformer-based SOD Approaches
Since the Transformer was applied to vision tasks, the accuracy of SOD has improved greatly [10]. The Vision Transformer (ViT) [94], proposed by the Google Brain team in 2020, was one of the first models to apply the Transformer to computer vision; it is an image classification model based entirely on self-attention mechanisms.
In 2021, Liu et al. [95] proposed a pure Transformer-based model for SOD in RGB and RGB-D. The model receives image segments as input and employs the Transformer architecture to disseminate global contextual information among the image segments. The model does not require convolution operations and captures long-range dependencies to improve saliency detection performance. The paper also presents an RGB-D saliency detection dataset for evaluating the model's generalization capability. Zhang et al. [83] proposed EBMG, an energy-based latent variable prior model that defines the distribution of latent variables by an energy function, then samples the latent variables by Markov chain Monte Carlo methods and uses them to optimize the output of the Vision Transformer network.
The Swin Transformer mentioned in the previous section has also been applied to SOD. In 2022, Liu et al. [82] proposed SwinNet, which achieves accurate detection using multimodal information (RGB-D and RGB-T) and local/global interaction mechanisms. SwinNet also provides a pre-trained edge-aware module to better exploit edge information, achieving better edge retention and refinement. Yun et al. proposed SelfReformer [84], a Transformer-based self-refinement network that utilizes global and local contextual information to improve the completeness of saliency maps. Xie et al. proposed PGNet [85], which uses Transformer and CNN branches to extract features from images of different resolutions; a Cross-Model Grafting Module (CMGM) grafts Transformer features onto the CNN branch to recover broken detailed knowledge, and an Attention Guided Loss (AGL) explicitly supervises the attention matrix generated by CMGM, helping the network interact with the attention produced by the other branch. In 2023, Zhuge et al. proposed ICON [86], which has three key components to achieve integral SOD, namely the aggregation of diverse features, the enhancement of the integrity channel, and part-whole verification. ICON-P employs PVT [96] as the backbone, and ICON-S uses Swin [97].
Given the success of Transformer-based models in salient object detection, it is likely that more models based on this architecture will be developed in the future.
4. Underwater Detection
Accurately detecting and segmenting objects in underwater scenes is more challenging than on land [16−17]. We elaborate on underwater (salient) object detection research from the aspects of image enhancement-based OD/SOD, small object detection, and underwater datasets in the following.
4.1. Underwater Image Enhancement
For underwater OD and SOD, image enhancement is essential to improve object visibility in the water. Due to factors such as light attenuation, scattering, and water turbidity, underwater images often suffer from poor visibility and degraded image quality. To address these issues, various image enhancement techniques are applied, including contrast enhancement, color correction, and noise reduction, to improve the visual quality of underwater images [15−16]. By improving image visibility, the OD and SOD algorithms can more accurately detect and locate underwater objects.
Early underwater image enhancement methods relied on traditional land image enhancement techniques. For example, an adaptive threshold Sobel operator [98] can enhance underwater images by extracting boundaries. In [99], a method to analyze aquatic imagery was proposed by combining maximum RGB with grayscale. First, the maximum RGB value of each underwater image pixel is used as a white reference. Normalization removes the color bias from underwater images. In [100], an adaptive global histogram stretching algorithm was proposed to eliminate low contrast and color loss in underwater scenes.
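As an illustration of the max-RGB idea described above, the sketch below rescales each channel using the image's maximum channel responses as a white reference. It is a simplified interpretation of the approach in [99], not the authors' exact algorithm.

```python
import numpy as np

def max_rgb_white_balance(img):
    """Simplified max-RGB white balancing: each channel is rescaled so that
    its brightest response acts as the white reference, reducing the
    blue/green color cast typical of underwater images.

    img: HxWx3 uint8 RGB image. Returns a uint8 image of the same shape.
    """
    img = img.astype(np.float32)
    # Per-channel maxima serve as the (assumed) white reference.
    white = img.reshape(-1, 3).max(axis=0)          # shape (3,)
    white = np.maximum(white, 1e-6)                 # avoid division by zero
    balanced = img * (255.0 / white)                # stretch each channel to [0, 255]
    return np.clip(balanced, 0, 255).astype(np.uint8)
```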
Currently, deep learning techniques are used to enhance underwater images for underwater detection tasks. These methods effectively improve the quality of underwater images by addressing challenges such as poor visibility, color distortion, and image degradation. For example, in [101], the authors use Water-Net [102] to address low contrast, color distortion, and blurring in underwater images. In [103], the authors propose an improved spatial transformation network that adaptively enriches image features based on perspective transformation, alleviating the limitations of underwater object images taken from different angles. In [104], the authors develop a perceptual underwater image enhancement model based on two physical priors: a detection module first provides feedback to the enhancement model, guiding it to generate visually satisfactory or detection-friendly images. In [105], the authors propose a channel sharpening attention mechanism that emphasizes the image channels relevant to the target species and suppresses irrelevant channels. In [106], the authors propose a color conversion method that transforms underwater images into a more natural color space and improves object detection accuracy. In general, deep learning-based underwater image enhancement techniques demonstrate their value in optimizing underwater detection tasks.
4.2. Small Object Detection
Small objects [107] are more challenging to detect for both OD and SOD due to two main factors.
Resolution and scale: When the resolution of an image is limited, either due to low-quality sensors or distance from the object, the details and fine-grained features of small objects may not be adequately captured. As a consequence, information is lost, making it more difficult for the OD and SOD methods to detect and localize these objects accurately.
Context and occlusion: Small objects are more susceptible to occlusion by larger objects or environmental factors. This occlusion hinders their visibility and makes it difficult for OD and SOD algorithms to differentiate them from the background or larger objects.
To overcome these challenges, researchers in OD and SOD have explored various techniques. For example, in [108], a technique was presented to detect edges in multiple directions, providing a comprehensive representation of object boundaries. In [107], the authors leveraged the YOLOv4 architecture and incorporated multiscale feature aggregation to improve detection accuracy; this work also fused MobileNet-V2 [109] with depthwise separable convolution [110] to significantly reduce network parameters and model size. In [111], the authors added depthwise separable convolution to the YOLOv4 backbone network with a 152×152 feature map to improve small object detection, and incorporated a spatial pyramid pooling module to increase model complexity and improve detection accuracy. Transformer-YOLOv5 [112] replaced the prediction head of YOLOv5 with a Transformer module, increasing its detection capability across scales and in dense environments.
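For reference, a depthwise separable convolution, as used in the lightweight detectors mentioned above, factorizes a standard convolution into a per-channel (depthwise) convolution followed by a 1×1 (pointwise) convolution. The PyTorch sketch below is a generic illustration rather than the exact block from [110] or [111].

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one filter per input channel (groups = in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 64, 152, 152)            # e.g., a 152x152 feature map
y = DepthwiseSeparableConv(64, 128)(x)
print(y.shape)                              # torch.Size([1, 128, 152, 152])
```

Compared with a standard 3×3 convolution, this factorization uses far fewer parameters and multiply-adds, which is why it is attractive for real-time underwater detectors.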
4.3. Underwater Detection Datasets
Given the complexity and dynamics of aquatic ecosystems, datasets for detecting objects in underwater scenes are often scarce and limited. This poses difficulties and challenges for algorithm design and performance evaluation of underwater OD and SOD. There have been efforts to develop such datasets. Table 3 lists some underwater detection datasets. We discuss a few of them in the following.
Year | Dataset | Image Number | Category Number | Annotation | Task | Content |
2012 | Fish4Knowledge [113] | 27,370 | 23 | Bounding Box | UOD | fishes |
2019 | Brackish [120] | 14,518 | 6 | Bounding Box | UOD | big fish, small fish, jellyfish, crabs, etc. |
2019 | Marine Litter [121] | 5,720 | 3 | Bounding Box | UOD | plastic waste, man-made targets, organisms |
2019 | MUED [122] | 8,600 | 430 | Bounding Box | UOD | seafloor objects |
2020 | URPC2020-DL [114] | 8,975 | 4 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops, starfish |
2020 | RUIE-UHTS [118] | 300 | 3 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops |
2020 | UWD [115] | 10,000 | 4 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops, starfish |
2020 | TrashCan [123] | 7,212 | 22 | Bounding Box | UOD | seabed garbage, flora and fauna, etc. |
2020 | SUIM [124] | 1,635 | 8 | Bounding Box | UOD | fish, coral, plants, people, debris, etc. |
2020 | UIEB [102] | 950 | 8 | Bounding Box | UOD | all kinds of corals and marine life, etc. |
2021 | URPC2021 [114] | 10,000 | 4 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops, starfish |
2021 | DUO [117] | 7,782 | 4 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops, starfish |
2021 | UODD [105] | 3,194 | 3 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops |
2022 | UDD [116] | 2,227 | 3 | Bounding Box | UOD | sea cucumbers, sea urchins, scallops |
2017 | OUC-Vision [125] | 4,400 | - | Bounding Box | USOD | 220 individual objects with four pose variations |
2019 | MUED [122] | 8,600 | - | Bounding Box | USOD | 430 underwater objects |
2020 | UFO-120 [126] | 1,620 | - | Pixel-Wise labels | USOD | multiple locations having different water types |
2020 | USOD [127] | 300 | - | Pixel-Wise labels | USOD | various underwater objects |
2022 | USOD10K [119] | 10,225 | 70 | Pixel-Wise labels | USOD | various underwater objects |
The Fish4Knowledge (F4K) dataset [113] is an extensive collection of fish images, as shown in Figure 2. It contains 27,370 images of 23 fish species collected from various locations and depths. Each image is carefully annotated, providing accurate and detailed labels for training and evaluation. However, the dataset has an unbalanced distribution across species and variable image quality.
The URPC dataset [114] comes from the Underwater Robot Picking Contest, which has been held every year since 2017. The URPC2020 dataset consists of 6,575 training images and 2,400 testing images with a high resolution of 3,840×2,160. For URPC2021, 7,600 images are used for training and 2,400 for testing. Examples from URPC2021 are shown in Figure 3.
The UWD [115], UDD [116], and DUO [117] datasets are based on the URPC datasets. UWD has 10,000 images of four species: sea cucumbers, sea urchins, scallops, and starfish, with no specific division into training and testing sets. UDD is an underwater marine pasture object detection dataset consisting of 2,227 images in three categories: sea cucumbers, sea urchins, and scallops; some examples are shown in Figure 4. DUO has 7,782 images with more accurate and diverse annotations for sea cucumbers, sea urchins, scallops, and starfish.
The UODD dataset [105] contains 3,194 images drawn from the RUIE [118] dataset and annotated in MS COCO format [24]. It presents diverse underwater scenes, e.g., low contrast, multiple objects, large objects, and small objects. Examples are shown in Figure 5.
The USOD10K dataset [119] is the first large-scale underwater SOD dataset, notable for its diversity, complexity, and scalability. It contains 10,255 underwater images of seventy classes in various underwater scenes, as shown in Figure 6. Depth and boundary GT maps are also included in this dataset.
5. Evaluation Metrics
It is essential to choose evaluation metrics that align with the research objectives to obtain meaningful insights [128]. In the following, we detail the metrics used in the OD and SOD tasks.
5.1. OD Metrics
Following are some of the most common evaluation metrics for object detection.
Precision & Recall [129]: Precision is calculated by dividing the true positives by anything that was predicted as a positive:

$\mathrm{Precision} = \frac{TP}{TP + FP}$   (1)

Recall (or True Positive Rate) is calculated by dividing the true positives by anything that should have been predicted as positive:

$\mathrm{Recall} = \frac{TP}{TP + FN}$   (2)

where TP denotes True Positives, FP denotes False Positives, and FN denotes False Negatives.
IoU (Intersection over Union) [130]: This metric measures the overlap between the model prediction box and the annotation box. IoU can be defined as follows:

$\mathrm{IoU} = \frac{|B_p \cap B_{gt}|}{|B_p \cup B_{gt}|}$   (3)

where $B_p$ is the prediction box of an object, and $B_{gt}$ is the annotation box of an object.
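A minimal implementation of box IoU for axis-aligned boxes in (x1, y1, x2, y2) format is shown below for illustration.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle.
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 ≈ 0.143
```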
AP (Average Precision) [33]: AP is a commonly used evaluation metric for OD. Using different score thresholds, one obtains multiple Precision-Recall value pairs that form a PR curve. The AP value is the area enclosed by the PR curve and the coordinate axes:

$AP = \int_0^1 P(R)\, dR$   (4)

where $P$ denotes Precision, $R$ denotes Recall, and $P(R)$ denotes Precision expressed as a function of Recall. The PR curve is usually smoothed, i.e., each point on the PR curve is replaced by the highest Precision value to its right. It is expressed as follows:

$P_{\mathrm{interp}}(R) = \max_{\tilde{R} \ge R} P(\tilde{R})$   (5)

The commonly used AP value is the Interpolated AP, which samples the smoothed Precision at evenly spaced Recall levels on the horizontal axis and averages them [33]. With 10 equally spaced Recall levels $R \in \{0.1, 0.2, \dots, 1.0\}$, the AP is expressed as:

$AP = \frac{1}{10} \sum_{R \in \{0.1, \dots, 1.0\}} P_{\mathrm{interp}}(R)$   (6)
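The following NumPy sketch illustrates this interpolation using the 10 evenly spaced Recall levels described above; note that PASCAL VOC and COCO use slightly different sampling schemes in practice.

```python
import numpy as np

def interpolated_ap(recall, precision, num_points=10):
    """Interpolated AP from (unsmoothed) precision/recall arrays.

    recall, precision: 1-D arrays obtained by sweeping the score threshold.
    Precision is first smoothed so that each point takes the maximum
    precision to its right, then sampled at evenly spaced recall levels.
    """
    recall = np.asarray(recall, dtype=np.float64)
    precision = np.asarray(precision, dtype=np.float64)

    ap = 0.0
    for r in np.linspace(0.1, 1.0, num_points):     # R in {0.1, ..., 1.0}
        mask = recall >= r
        p_interp = precision[mask].max() if mask.any() else 0.0
        ap += p_interp / num_points
    return ap

# Toy example: three detections, two of which are correct.
print(interpolated_ap([0.5, 0.5, 1.0], [1.0, 0.5, 0.66]))   # ≈ 0.83
```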
mAP (Mean Average Precision) [33]: The commonly used COCO metrics [24] for accuracy evaluation are as follows; they can be computed with pycocotools, as sketched after the list.
• AP is the mAP averaged over IoU thresholds from 0.50 to 0.95 with a step of 0.05;
• AP50 is the mAP at IoU = 0.50;
• AP75 is the mAP at IoU = 0.75;
• AR@k represents the Average Recall (AR) when considering up to k detections in each image;
• APS is calculated for objects of compact dimensions, with an area of less than 32² pixels;
• APM is calculated for objects with intermediate dimensions, with an area between 32² and 96² pixels;
• APL is calculated for objects of large dimensions, with an area bigger than 96² pixels.
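Since UOMT provides COCO-format labels, these metrics can be computed directly with pycocotools; the sketch below assumes hypothetical file names uomt_val.json (ground truth) and detections.json (detections in the COCO results format).

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Hypothetical paths: COCO-format ground truth and detection results.
coco_gt = COCO("uomt_val.json")
coco_dt = coco_gt.loadRes("detections.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()   # prints AP, AP50, AP75, APS, APM, APL, and the AR metrics
```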
5.2. SOD Metrics
The following metrics are typically used to evaluate salient object detection models [8].
Mean Absolute Error (MAE) [131]: This metric assesses the average pixel-level discrepancy between the model-generated prediction map and the GT map, and is defined by:

$MAE = \frac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} \left| \mathrm{Pred}(x, y) - GT(x, y) \right|$   (7)

where $GT$ represents the binary ground-truth (GT) map, $\mathrm{Pred}$ is the predicted map after normalization, and $W$ and $H$ are the input image width and height.
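A direct NumPy implementation of MAE, assuming the prediction and GT maps are arrays with values normalized to [0, 1], is given below for illustration.

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a predicted saliency map and the GT mask.
    Both inputs are HxW arrays with values in [0, 1]."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return np.abs(pred - gt).mean()
```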
Structure-measure ($S_m$) [132]: This metric measures how structurally similar the predicted map is to the ground truth and is defined by

$S_m = \alpha \cdot S_o + (1 - \alpha) \cdot S_r$   (8)

where $S_r$ and $S_o$ represent region- and object-oriented structural similarity, respectively, and $\alpha$ is usually set to 0.5.
Enhanced Alignment Measure ($E_\xi$) [133]: This metric captures local pixel-level matching and image-level statistics of a binary map using an enhanced alignment matrix $\phi_\xi$, and is defined by

$E_\xi = \frac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} \phi_\xi(x, y)$   (9)

In this work, we use the Adapted E-measure ($E_\xi^{adp}$) and the Mean E-measure ($E_\xi^{mean}$) via the PySODEvalToolkit [134] in our experiments. The difference is that when calculating $E_\xi^{adp}$, a threshold is applied to filter the matching pixels in Pred and GT.

$E_\xi^{adp}$ adopts an adaptive threshold (denoted by $AT$), which is the minimum of two times the mean value of the predicted map and 1, as follows:

$AT = \min\left(2 \times \mathrm{mean}(\mathrm{Pred}),\ 1\right)$   (10)

Let $\mathrm{Pred}_{AT}$ denote the set of pixels in the prediction map whose grayscale values are greater than or equal to the adaptive threshold $AT$; then we have

$E_\xi^{adp} = \frac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} \phi_\xi^{AT}(x, y)$   (11)

where $\phi_\xi^{AT}$ denotes the alignment matrix computed between $\mathrm{Pred}_{AT}$ and $GT$, and a selected convex function is used to build the enhanced alignment matrix [133].

For $E_\xi^{mean}$, each grayscale value in the histogram of Pred is set to be a threshold $T_i$ ($i = 1, \dots, N$), where $N$ represents the total number of grayscale values in the histogram. As a result, the set of pixels filtered by each threshold yields a corresponding $E_\xi$ value; by averaging these $N$ values, $E_\xi^{mean}$ is determined. For more details, please refer to the PySODEvalToolkit [134].
F-measure ($F_\beta$) [135]: This metric represents the weighted harmonic mean of Precision and Recall, and can be mathematically represented by the following formula:

$F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$   (12)

where $\beta^2$ is usually set to 0.3 to increase the weight of Precision and weaken the proportion of Recall. Here we also use the Adapted F-measure ($F_\beta^{adp}$) and the Mean F-measure ($F_\beta^{mean}$) [136] in our experiments.

Similar to $E_\xi^{adp}$ and $E_\xi^{mean}$, $F_\beta^{adp}$ and $F_\beta^{mean}$ employ the adaptive threshold defined in formula (10) and the histogram thresholds $T_i$ (where $i$ ranges from 1 to $N$) for pixel filtering, respectively. Precision and Recall are computed using these two types of thresholds and substituted into formula (12) to obtain $F_\beta^{adp}$ and $F_\beta^{mean}$, respectively. For more details, refer to the PySODEvalToolkit [134].
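The sketch below computes an adaptive F-measure in the spirit described above, binarizing the prediction with the min(2×mean, 1) threshold before computing Precision and Recall; the benchmark numbers in this paper are produced by PySODEvalToolkit, so this is only an illustration.

```python
import numpy as np

def adaptive_f_measure(pred, gt, beta2=0.3):
    """Adaptive F-measure: binarize pred with min(2*mean(pred), 1), then
    compute the weighted harmonic mean of Precision and Recall (beta^2 = 0.3)."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt) > 0.5                       # binary GT mask

    threshold = min(2.0 * pred.mean(), 1.0)         # adaptive threshold (AT)
    binary = pred >= threshold

    tp = np.logical_and(binary, gt).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / (gt.sum() + 1e-8)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)
```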
Weighted F-measure ($F_\beta^{w}$) [137]: As an improvement of $F_\beta$, $F_\beta^{w}$ addresses the dependency and equal-importance flaws of the standard measure, and can be expressed as:

$F_\beta^{w} = \frac{(1 + \beta^2) \cdot \mathrm{Precision}^{w} \cdot \mathrm{Recall}^{w}}{\beta^2 \cdot \mathrm{Precision}^{w} + \mathrm{Recall}^{w}}$   (13)

where $\mathrm{Precision}^{w}$ is the weighted Precision and $\mathrm{Recall}^{w}$ is the weighted Recall. The weights of the four fundamental quantities (TP, TN, FP, and FN) are calculated from the spatial relationship between foreground and background pixel positions with respect to the foreground.
6. Underwater Object Multitask Dataset
In this section, we present a new underwater object multitask dataset (UOMT), to facilitate underwater research.
6.1. Data Deduplication
The RUIE [118] and UODD [105] datasets are the sources of the proposed UOMT dataset. The RUIE dataset contains more than 4K images of underwater objects and environments under different degrees of light scattering. Over 3K images of underwater cultured products can be found in the UODD dataset, which uses MS COCO-format [24] labels for object detection. Both datasets originate from video streams, so many images are duplicated. Additionally, the two datasets contain numerous tiny objects and extremely blurry environments, which make some images unsuitable for the SOD segmentation task.
To support a comprehensive evaluation of multiple underwater tasks, we collected images (over 7K instances of three subjects, including training and test sets) from the RUIE and UODD datasets to build the UOMT dataset. The UOMT dataset goes beyond a simple combination of existing datasets; it represents a series of enhancements and optimizations built on them. Specifically, we ensure that the selected images encompass a broad range of underwater scenes, incorporating various lighting and scattering conditions, as depicted in Figure 1. Meanwhile, a selection of representative images is meticulously curated to ensure uniqueness and variety. Furthermore, to avoid data bias, images of various scenes and objects are collected as much as possible, and duplicate images are excluded. In addition, we ensure that each target object appears in at least two distinct images. These strategies enhance the dataset's comprehensiveness and robustness and guarantee broader coverage of object appearances.
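For readers who wish to apply a similar near-duplicate filter to their own video-derived frames, a simple perceptual-hash comparison such as the sketch below (using the Pillow and imagehash packages; the threshold is illustrative) is often sufficient. This is not the exact procedure used to curate UOMT.

```python
from pathlib import Path
from PIL import Image
import imagehash

def deduplicate(image_dir, max_distance=5):
    """Keep one representative per group of visually near-identical frames.
    max_distance is the Hamming-distance threshold on 64-bit average hashes."""
    kept, kept_hashes = [], []
    for path in sorted(Path(image_dir).glob("*.jpg")):
        h = imagehash.average_hash(Image.open(path))
        # Keep the frame only if it is sufficiently different from all kept frames.
        if all(h - other > max_distance for other in kept_hashes):
            kept.append(path)
            kept_hashes.append(h)
    return kept
```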
6.2. Data Annotation
Due to the difficulty of observing underwater objects, we invited 20 observers to identify the objects in each image. We use the Computer Vision Annotation Tool (CVAT) [138] to label the targets in each image and obtain pixel-level ground-truth masks. CVAT supports many annotation types, such as rectangles, polygons, dots, labels, text, etc. We use the CVAT Segmentation Mask 1.1 format for our pixel-level segmentation labels.
To ensure the quality and diversity of the UOMT dataset, we follow several principles in the annotation process.
1) For overlapping objects, they are labeled according to whether they belong to the same category. Objects of the same category are labeled as a whole; objects of different categories are labeled as different categories, and overlapping parts are treated as edges.
2) For objects with very complex boundaries, such as sea urchins with many sharp spines, we label the outline of each spine as meticulously as possible rather than simply drawing an approximate shape.
3) For some fuzzy or difficult-to-distinguish objects, we correct the annotation results through mutual review among 20 annotators to ensure accuracy and consistency.
The UOMT dataset spans a wide spectrum of underwater objects and scenarios and features an array of demanding situations, including category imbalance and small objects. This diverse composition enables the assessment of model generalization and adaptability in real-world underwater environments.
6.3. Data Statistics
The UOMT dataset records different environments, illuminations, categories, dimensions, locations, and quantities of objects, along with backgrounds, etc. Specifically, the following key factors are considered when constructing the UOMT dataset.
Object Size: All objects in each image are manually filtered, including multiple objects, small objects, and camouflaged objects, as shown in Figure 1. This aligns with both the OD and the SOD tasks.
Illumination Conditions: RUIE includes images under various underwater lighting conditions (such as blue and green environments), as well as high-definition and low-illumination environments. The low and varying illumination in different environments poses challenges for OD and SOD tasks. We keep many sample images of different lighting conditions.
Number of Objects: The UOMT dataset accommodates images that contain multiple objects. The distribution of images containing multiple objects is shown in Figure 7, and the distribution of images with varying subject types is presented in Figure 8. This inclusion addresses the challenges associated with multi-object OD and SOD tasks.
Background Diversity: The UOMT dataset intentionally encompasses a range of challenging backgrounds. This includes scenes with obscured views due to aquatic plants, underwater rocks, and corals, as well as cluttered backgrounds such as turbid underwater scenes. In addition, scenes are designed to feature extraneous objects, such as rocks and debris. This diversity of backgrounds has been carefully included to raise detection and segmentation difficulty.
7. Benchmarks
In this section, we evaluate state-of-the-art object detection and salient object detection methods on UOMT. We provide a unified split by randomly selecting a subset of images for the training set and using the remaining images as the test set. In addition, to ensure a fair comparison, the official codes are used to generate the results of the compared methods.
7.1. Object Detection Experiments
1) Experimental Settings: Our experiments are based on an open-source toolbox MMDetection (V3.0.0) [143]. During the experiments, we set up the following configurations:
• ImageNet pre-trained weights are used to initialize all backbone models. During training, each image is flipped horizontally with a probability of 0.5. The SGD [144] optimizer is adopted for all models, and WarmUp [145] is used in each method. All experiments were performed on GTX 1080TI-11G and Tesla K80-11G GPUs.
• For Faster R-CNN [27], Cascade R-CNN [139], Mask R-CNN [28], Grid R-CNN [140], ATSS [142], FCOS [141], YOLOF [42], and DINO [52], we resize each image to a fixed resolution for both training and testing. With a batch size of 2, the learning rate is initially set to 0.001 and decreases by a factor of 0.1 at the 16th and 22nd epochs (an illustrative config fragment is given after this list).
• For CornerNet511 [46], each image is resized to a fixed resolution for training and testing. With a batch size of 9, the learning rate decreases by a factor of 0.1 at the 180th epoch.
• For YOLOv3 [39], we resize each image to a fixed resolution in both training and testing. The nine anchor clusters are uniformly distributed across three distinct scales, deviating from the distribution observed in the MS COCO dataset. The learning rate is initially set to 0.1 with a batch size of 5 and is decreased by a factor of 0.1 at the 218th and 246th epochs.
• For YOLOX [48], we resize each image to a fixed resolution in both training and testing. The initial learning rate, used with a batch size of 2, is adaptively adjusted during the training process.
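As an example of how the schedule in the first two bullets could be written down, the fragment below follows the MMDetection 3.x config style (SGD with linear warm-up and a 0.1 learning-rate decay at epochs 16 and 22). The momentum and weight-decay values shown are MMDetection defaults, not settings stated above, and the exact configs for each model follow those shipped with MMDetection.

```python
# Fragment of an MMDetection 3.x-style schedule config (illustrative only).
optim_wrapper = dict(
    type="OptimWrapper",
    # lr matches the 0.001 setting above; momentum/weight_decay are library defaults.
    optimizer=dict(type="SGD", lr=0.001, momentum=0.9, weight_decay=0.0001),
)
param_scheduler = [
    # Linear warm-up over the first 500 iterations.
    dict(type="LinearLR", start_factor=0.001, by_epoch=False, begin=0, end=500),
    # Decay the learning rate by a factor of 0.1 at epochs 16 and 22.
    dict(type="MultiStepLR", by_epoch=True, milestones=[16, 22], gamma=0.1),
]
train_cfg = dict(type="EpochBasedTrainLoop", max_epochs=24, val_interval=1)
```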
2) Quantitative comparison: Table 4 reports the quantitative experimental results. One-stage detectors are generally lower in accuracy but higher in efficiency, whereas multistage detectors are generally more accurate but less efficient. In terms of precision, there is no obvious AP difference between multistage methods (e.g., Cascade R-CNN) and one-stage methods (e.g., FCOS [141]), and the average APS is consistently lower than APM and APL. For AR, one-stage methods outperform multistage methods overall, although multistage methods are better on some of the AR sub-metrics. Furthermore, considerable potential remains for improving both AP and AR.
Method | Year | Backbone | AP | AP50 | AP75 | APS | APM | APL | AR | ARS | ARM | ARL |
Two-stage detectors | ||||||||||||
Faster R-CNN w FPN [27] | 2015 | ResNet-50 | 39.70 | 72.00 | 39.80 | 31.90 | 41.50 | 46.30 | 50.10 | 47.20 | 50.30 | 50.00 |
Faster R-CNN w FPN [27] | 2015 | ResNet-101 | 41.80 | 76.00 | 42.00 | 34.50 | 42.90 | 51.90 | 51.40 | 46.90 | 51.90 | 55.60 |
Mask R-CNN w FPN [28] | 2017 | ResNeXt-101-64x4d | 41.80 | 75.10 | 42.10 | 37.30 | 42.50 | 49.60 | 49.00 | 46.80 | 49.30 | 53.30 |
Cascade R-CNN [139] | 2018 | ResNet-101 | 43.20 | 77.70 | 39.20 | 36.10 | 43.00 | 51.80 | 52.10 | 48.50 | 50.50 | 56.20 |
Grid R-CNN w FPN [140] | 2019 | ResNeXt-101 | 42.30 | 79.30 | 44.20 | 36.60 | 41.70 | 50.60 | 52.10 | 52.70 | 49.40 | 55.70 |
One-stage detectors | ||||||||||||
CornerNet511 [46] | 2018 | Hourglass-104 | 31.90 | 59.20 | 30.00 | 24.20 | 35.60 | 36.80 | 51.70 | 46.50 | 53.80 | 49.80 |
YOLOv3 [39] | 2018 | DarkNet-53 | 38.80 | 75.10 | 36.90 | 30.10 | 41.40 | 49.00 | 47.90 | 41.20 | 50.30 | 53.60 |
FCOS [141] | 2019 | ResNeXt-101-64x4d-FPN | 43.50 | 76.70 | 46.40 | 38.80 | 45.50 | 50.10 | 57.80 | 57.80 | 57.40 | 54.50 |
ATSS [142] | 2020 | ResNet-101 | 42.60 | 76.40 | 41.60 | 35.40 | 43.60 | 49.00 | 57.10 | 52.70 | 58.80 | 52.80 |
YOLOv5 [41] | 2021 | CSPDarkNet-53 | 44.40 | 84.00 | 84.00 | 28.90 | 45.10 | 52.20 | - | - | - | - |
YOLOF [42] | 2021 | ResNet-50 | 36.60 | 73.10 | 30.30 | 30.10 | 39.50 | 42.00 | 54.40 | 50.60 | 55.40 | 49.60 |
YOLOX [48] | 2021 | YOLOX-I | 44.10 | 78.50 | 43.00 | 32.40 | 47.00 | 51.10 | 56.70 | 53.90 | 57.10 | 57.60 |
DINO [52] | 2022 | ResNet-50 | 42.10 | 75.90 | 43.30 | 35.00 | 44.40 | 43.40 | 60.30 | 58.30 | 61.50 | 60.70 |
3) Qualitative comparison: We select a few challenging scenes, including large objects, small objects, complex multiple objects, and complex, low-contrast backgrounds, for comparison, as shown in Figure 9 and Figure 10. Except for CornerNet [46], all methods detect multiple objects well. FCOS [141] achieves the best detection performance and handles both large and small objects.
7.2. Salient Object Detection Experiments
1) Experimental Settings: In this section, we train eleven SOD models to demonstrate their performance on the UOMT dataset, including SCRN [146], PoolNet [68], EGNet [70], CPD [67], BASNet [69], LDF [71], Joint-SOD-COD [147], TRACER [80], PGNet [85], EDN [78] and MENet [81]. The relevant configuration of the experiments is as follows.
• The parameters (e.g., learning rate, weight decay, momentum, etc.) of each model in experiments are initialized according to the settings described in their papers.
• To ensure a fair comparison, we use the PySODEvalToolkit [134] as the evaluation tool.
• All models are trained on GTX 1080TI-11G and Tesla K80-11G GPUs.
2) Quantitative Comparisons: Table 5 shows the benchmark of the different SOD methods on the UOMT dataset. The results indicate that refining multilevel features (e.g., SCRN [146]), refining high-level semantic features (e.g., PoolNet [68]), and incorporating deeper-layer features (e.g., CPD [67]) produce relatively precise, structurally similar saliency maps. Integrating fine-grained edge details and comprehensive spatial contexts to obtain salient edge features (i.e., CPD [67] and EGNet [70]) can further increase the scores. On the remaining metrics, a densely guided encoder-decoder architecture with a residual fine-tuning module and edge information as auxiliary supervision in feature interaction (i.e., LDF [71] and BASNet [69]) has more advantages. Owing to the use of both ResNet and Swin-Transformer backbones and a reasonable fusion of their features, PGNet [85] extracts better features and performs the best among all models.
Method | Year | Backbone | MAE | ||||||
SCRN [146] | 2019 | ResNet50 | 0.019 | 0.826 | 0.916 | 0.766 | 0.725 | 0.928 | 0.711 |
PoolNet [68] | 2019 | ResNet50 | 0.019 | 0.826 | 0.917 | 0.760 | 0.721 | 0.924 | 0.702 |
PoolNet [68] | 2019 | VGG16 | 0.053 | 0.670 | 0.710 | 0.485 | 0.342 | 0.657 | 0.367 |
EGNet [70] | 2019 | ResNet50 | 0.019 | 0.815 | 0.927 | 0.755 | 0.726 | 0.932 | 0.724 |
EGNet [70] | 2019 | VGG16 | 0.024 | 0.794 | 0.867 | 0.708 | 0.647 | 0.834 | 0.586 |
CPD-ResNet50 [67] | 2019 | ResNet50 | 0.019 | 0.827 | 0.923 | 0.765 | 0.728 | 0.926 | 0.712 |
CPD-VGG16 [67] | 2019 | VGG16 | 0.018 | 0.825 | 0.929 | 0.776 | 0.744 | 0.940 | 0.752 |
BASNet [69] | 2019 | ResNet50 | 0.018 | 0.808 | 0.920 | 0.779 | 0.738 | 0.945 | 0.744 |
LDF [71] | 2020 | ResNet50 | 0.018 | 0.808 | 0.920 | 0.779 | 0.738 | 0.945 | 0.744 |
Joint-SOD [147] | 2021 | ResNet50 | 0.017 | 0.822 | 0.934 | 0.788 | 0.759 | 0.944 | 0.764 |
TRACER [80] | 2021 | ResNet50 | 0.020 | 0.785 | 0.904 | 0.737 | 0.697 | 0.930 | 0.710 |
EDN [78] | 2022 | VGG16 | 0.029 | 0.770 | 0.917 | 0.745 | 0.571 | 0.929 | 0.709 |
EDN [78] | 2022 | MobileNetV2 | 0.066 | 0.676 | 0.849 | 0.631 | 0.344 | 0.874 | 0.602 |
EDN [78] | 2022 | ResNet50 | 0.042 | 0.734 | 0.900 | 0.726 | 0.456 | 0.911 | 0.683 |
PGNet [85] | 2022 | ResNet-Swin | 0.016 | 0.831 | 0.939 | 0.795 | 0.769 | 0.947 | 0.776 |
MENet [81] | 2023 | ResNet50 | 0.023 | 0.721 | 0.856 | 0.743 | 0.613 | 0.856 | 0.701 |
3) Qualitative comparisons: As we can see from Figure 11 and Figure 12, PGNet [85] detects edges better for large objects. Joint-SOD [147] and LDF [71] are sensitive to blurred edges. The remaining methods are not sensitive to edges. With multiple small objects, CPD [67] and EDN [78] perform better than the other methods.
In summary, introducing edge information as auxiliary supervision in densely supervised networks, refining multilevel features, and integrating local and global location information can all improve the accuracy of underwater salient object detection, and Transformers increase detection accuracy and efficiency. However, the predictions of existing SOD methods still fall far short of the GT in underwater scenes, and more research is needed to achieve more accurate and efficient underwater salient object detection.
8. Future Development Discussion
Given that we are still in the nascent stages of underwater object detection (OD) and salient object detection (SOD), substantial scope remains for scholarly exploration. We propose the following considerations to provide a guiding framework for future development.
Enhanced Data Diversity: A concerted effort should be directed towards expanding the diversity of underwater datasets, encompassing an even broader range of environmental conditions, object categories, and challenges. This will encourage the development of more robust and adaptable algorithms.
Integration of Multi-Modal Data: Underwater environments are intricate, encompassing many data types that can be gathered, such as optical, acoustic, and magnetic data. Integrating these different data modalities can help enhance underwater object detection accuracy and robustness.
Active Learning and Semi-Supervised Learning: Collecting labeled data for underwater object detection and salient object detection can be challenging and time-consuming. Active learning and semi-supervised learning techniques reduce labeled data requirements and improve learning efficiency.
Incorporation of Domain-Specific Knowledge: The underwater environment has specific properties that differ from the land environments. Incorporating domain-specific knowledge and expertise about these properties into object detection and salient object detection algorithms could help to improve their performance.
Real-Time Processing: Real-time processing is essential for many underwater tasks, including underwater robotics and monitoring. Developing algorithms that can detect objects and detect salient objects in real time could greatly improve the usefulness of these techniques in underwater environments.
Pre-trained Model Transfer: For underwater small object detection, models pre-trained on other domains or on larger datasets can be reused through transfer learning. Fine-tuning such a pre-trained model on underwater data can speed up convergence and improve small object detection performance.
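A minimal sketch of this strategy is given below, using torchvision's COCO-pre-trained Faster R-CNN as a stand-in for the source model; the number of underwater categories and the optimizer settings are illustrative assumptions, not values used in our benchmark.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a detector pre-trained on a large generic dataset (COCO) and replace
# its classification head with one sized for the underwater categories.
num_classes = 5  # assumed: 4 underwater categories + background (dataset-dependent)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Fine-tune with a small learning rate so the pre-trained weights are only
# gently adapted to the underwater domain, which usually speeds up convergence.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
```

In practice, the backbone can also be frozen for the first few epochs and unfrozen later, so that the randomly initialized head adapts before the pre-trained features are perturbed.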
Combination of OD and SOD: Combining OD and SOD is an exciting direction for underwater detection research because it enables a more comprehensive understanding of scenes, and it offers several advantages. Integrating SOD with OD can help refine object localization: SOD provides additional information about the most visually prominent parts of an object, which aids accurate localization and can save computational resources and increase processing speed, especially on large-scale datasets or in real-time applications. Conversely, the output of OD can be used to guide the SOD process, ensuring that salient regions are associated with the correct objects; this is particularly useful in crowded scenes where instance separation is challenging. In addition, combining OD- and SOD-generated masks can create diverse training data for both tasks, improving the models' generalization capabilities. However, when designing a model that fuses or serializes SOD and OD, potential conflicts or inconsistencies between the two tasks must be addressed, and the integration must be designed carefully to avoid redundancy and to ensure meaningful gains. A simple form of this coupling is sketched below.
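The sketch below illustrates one simple instantiation of this coupling: detector boxes are re-scored and filtered using the mean saliency inside each box. The blending weight and saliency threshold are arbitrary placeholders, and the boxes and saliency map are assumed to come from any OD and SOD model, respectively.

```python
import numpy as np

def fuse_od_with_sod(boxes, scores, saliency, alpha=0.5, min_saliency=0.1):
    """Blend detector confidences with saliency evidence.

    boxes:    (N, 4) array of [x1, y1, x2, y2] pixel coordinates.
    scores:   (N,) detector confidences in [0, 1].
    saliency: (H, W) saliency map in [0, 1] from any SOD model.
    Returns fused scores and a boolean mask of boxes that overlap a salient
    region; alpha and min_saliency are illustrative, untuned values.
    """
    fused, keep = [], []
    for (x1, y1, x2, y2), score in zip(boxes.astype(int), scores):
        patch = saliency[max(y1, 0):max(y2, 0), max(x1, 0):max(x2, 0)]
        mean_sal = float(patch.mean()) if patch.size else 0.0
        fused.append(alpha * score + (1.0 - alpha) * mean_sal)  # blend the two cues
        keep.append(mean_sal >= min_saliency)  # drop boxes on non-salient background
    return np.asarray(fused), np.asarray(keep)
```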
9. Conclusion
In this paper, we present a comprehensive overview of object detection (OD) and salient object detection (SOD) techniques specifically tailored to challenging underwater environments. Although there has been notable progress in these domains, it is crucial to acknowledge that research on OD and SOD for underwater scenarios is still relatively young. Significant challenges remain, including the lack of reliable and diverse underwater datasets.
We perform a thorough analysis to address these ongoing challenges and provide insightful recommendations. In addition, we contribute to the research community by introducing a novel underwater object multitask dataset (UOMT). The UOMT dataset is carefully curated and includes various underwater scenes. It offers two essential types of annotation: object detection annotations in the COCO format and salient object detection masks with pixel-level labels.
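For illustration, the snippet below sketches how the two annotation types might be loaded together with pycocotools and PIL; the file names and directory layout are placeholders, so readers should follow the structure documented in the UOMT repository.

```python
import numpy as np
from PIL import Image
from pycocotools.coco import COCO

# Placeholder paths; see the UOMT repository for the actual layout.
coco = COCO("annotations/uomt_train.json")          # COCO-format OD annotations
img_info = coco.loadImgs(coco.getImgIds()[0])[0]    # first image record
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_info["id"]))
boxes = [ann["bbox"] for ann in anns]               # [x, y, width, height] per object

# The corresponding SOD ground truth is a pixel-level mask (PNG naming assumed).
mask_name = img_info["file_name"].rsplit(".", 1)[0] + ".png"
mask = np.array(Image.open("masks/" + mask_name).convert("L")) > 127
print(len(boxes), "objects;", int(mask.sum()), "salient pixels")
```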
In addition to providing the dataset, we establish a comprehensive benchmark for OD and SOD tasks. This benchmark encompasses a diverse set of accuracy metrics, making it a valuable resource for academic research and practical industrial implementations. These evaluations enable researchers and practitioners to assess the performance of OD and SOD algorithms under challenging underwater conditions, further advancing the field.
Author Contributions: Yu Wei: Conceptualization, data management, evaluation of salient object detection models, and draft writing. Yi Wang: Conceptualization, funding acquisition, and manuscript revision. Baofeng Zhu: Data management, validation, evaluation of object detection models, and draft writing. Chi Lin: Data management, validation, and manuscript revision. Dan Wu: Data management, validation, and evaluation of object detection models. Xinwei Xue: Evaluation of object detection models, formal analysis, and data management. Ruili Wang: Overall supervision and manuscript revision.
Funding: This work is supported in part by the National Natural Science Foundation of China under contract Nos. 62476037, 62172069, and 61976037. This work is also partially supported by 2020 Catalyst: Strategic NZ-Singapore Data Science Research Programme Fund, MBIE, New Zealand.
Data Availability Statement: The data is available at: https://github.com/yiwangtz/UOMT.
Conflicts of Interest: The authors declare no conflicts of interest.
References
- B. J. Boom, J. He, S. Palazzo, P. X. Huang, C. Beyan, H.-M. Chou, F.-P. Lin, C. Spampinato, and R. B. Fisher. A research tool for long-term and continuous analysis of fish assemblage in coral-reefs using underwater camera footage. Ecological Informatics, 2014, 23: 83−97. doi: 10.1016/j.ecoinf.2013.10.006
- O. A. Aguirre-Castro, E. Inzunza-González, E. E. García-Guerrero, E. Tlelo-Cuautle, O. R. López-Bonilla, J. E. Olguín-Tiznado, and J. R. Cárdenas-Valdez. Design and construction of an rov for underwater exploration. Sensors, 2019, 19(24): 5387. doi: 10.3390/s19245387
- Z. Chen, R. Wang, W. Ji, M. Zong, T. Fan, and H. Wang. A novel monocular calibration method for underwater vision measurement. Multimedia Tools and Applications, 2019, 78: 19437−19455. doi: 10.1007/s11042-018-7105-z
- S. Fayaz, S. A. Parah, and G. Qureshi. Underwater object detection: architectures and algorithms–a comprehensive review. Multimedia Tools and Applications, 2022, 81(15): 20871−20916. doi: 10.1007/s11042-022-12502-1
- L. Jiao, F. Zhang, F. Liu, S. Yang, L. Li, Z. Feng, and R. Qu. A survey of deep learning-based object detection. IEEE access, 2019, 7: 128837−128868. doi: 10.1109/ACCESS.2019.2939201
- X. Wu, D. Sahoo, and S. C. Hoi. Recent advances in deep learning for object detection. Neurocomputing, 2020, 396: 39−64. doi: 10.1016/j.neucom.2020.01.085
- K. Li, G. Wan, G. Cheng, L. Meng, and J. Han. Object detection in optical remote sensing images: A survey and a new benchmark. ISPRS journal of photogrammetry and remote sensing, 2020, 159: 296−307. doi: 10.1016/j.isprsjprs.2019.11.023
- A. Borji, M.-M. Cheng, Q. Hou, H. Jiang, and J. Li. Salient object detection: A survey. Computational Visual Media, 2019, 5(2): 117−150. doi: 10.1007/s41095-019-0149-9
- A. K. Gupta, A. Seal, M. Prasad, and P. Khanna. Salient object detection techniques in computer vision—a survey. Entropy, 2020, 22(1174): 1−49. doi: 10.3390/e22101174
- W. Wang, Q. Lai, H. Fu, J. Shen, H. Ling, and R. Yang. Salient object detection in the deep learning era: An in-depth survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(6): 3239−3259. doi: 10.1109/TPAMI.2021.3051099
- M. J. Nadenau, S. Winkler, D. Alleysson, and M. Kunt. Human vision models for perceptually optimized image processing—a review. Proc. IEEE, 2000, 32: 1−16
- R. Padilla, S. L. Netto, and E. A. Da Silva, “A survey on performance metrics for object-detection algorithms,” in 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), pp. 237–242, IEEE, 2020. doi: 10.1109/IWSSIP48289.2020.9145130
- G. Cheng, X. Yuan, X. Yao, K. Yan, Q. Zeng, X. Xie, and J. Han. Towards large-scale small object detection: Survey and benchmarks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(11): 13467−13488. doi: 10.1109/TPAMI.2023.3290594
- Z. Zou, K. Chen, Z. Shi, Y. Guo, and J. Ye. Object detection in 20 years: A survey. Proceedings of the IEEE, 2023, 111(3): 257−276. doi: 10.1109/JPROC.2023.3238524
- M. Jian, X. Liu, H. Luo, X. Lu, H. Yu, and J. Dong. Underwater image processing and analysis: A review. Signal Processing: Image Communication, 2021, 91: 116088. doi: 10.1016/j.image.2020.116088
- T. Xu, W. Zhao, L. Cai, H. Chai, and J. Zhou, “An underwater saliency detection method based on grayscale image information fusion,” in 2022 International Conference on Advanced Robotics and Mechatronics (ICARM), pp. 255–260, IEEE, 2022. doi: 10.1109/ICARM54641.2022.9959299
- M. Reggiannini and D. Moroni. The use of saliency in underwater computer vision: A review. Remote Sensing, 2021, 13(1): 22. doi: 10.3390/rs13010022
- M. Zong, R. Wang, X. Chen, Z. Chen, and Y. Gong. Motion saliency based multi-stream multiplier resnets for action recognition. Image and Vision Computing, 2021, 107: 104108. doi: 10.1016/j.imavis.2021.104108
- C. Jing, J. Potgieter, F. Noble, and R. Wang, “A comparison and analysis of rgb-d cameras’ depth performance for robotics application,” in 2017 24th International Conference on Mechatronics and Machine Vision in Practice (M2VIP), pp. 1–6, IEEE, 2017. doi: 10.1109/M2VIP.2017.8211432
- K. Fu, Y. Jiang, G.-P. Ji, T. Zhou, Q. Zhao, and D.-P. Fan. Light field salient object detection: A review and benchmark. Computational Visual Media, 2022, 8(4): 509−534. doi: 10.1007/s41095-021-0256-2
- H. Zhou, Y. Lin, L. Yang, J. Lai, and X. Xie, “Benchmarking deep models for salient object detection,” arXiv preprint arXiv: 2202.02925, 2022. doi: 10.2139/ssrn.4425220
- Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1989, 1(4): 541−551. doi: 10.1162/neco.1989.1.4.541
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 2017, 30.
- T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, and L. Zitnick, “Microsoft coco: Common objects in context,” in European Conference on Computer Vision (ECCV), September 2014. doi: 10.1007/978-3-319-10602-1_48
- S. S. A. Zaidi, M. S. Ansari, A. Aslam, N. Kanwal, M. Asghar, and B. Lee. A survey of modern deep learning-based object detection models. Digital Signal Processing, 2022: 103514.
- R. Girshick, “Fast r-cnn,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448, 2015. doi: 10.1109/ICCV.2015.169
- S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137−1149. doi: 10.1109/TPAMI.2016.2577031
- K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask r-cnn,” in 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988, 2017. doi: 10.1109/ICCV.2017.322
- Ultralytics, “Yolov8,” 2023. https://github.com/ultralytics/ultralytics, last accessed on 2023-06-24.
- J. Wang, L. Song, Z. Li, H. Sun, J. Sun, and N. Zheng, “End-to-end object detection with fully convolutional network,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15849–15858, 2021. doi: 10.1109/CVPR46437.2021.01559
- M. Tan, R. Pang, and Q. V. Le, “Efficientdet: Scalable and efficient object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10781–10790, 2020. doi: 10.1109/CVPR42600.2020.01079
- W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “Ssd: Single shot multibox detector,” in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I, pp. 21–37, Springer, 2016. doi: 10.1007/978-3-319-46448-0_2
- R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587, 2014. doi: 10.1109/CVPR.2014.81
- K. He, X. Zhang, S. Ren, and J. Sun. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904−1916. doi: 10.1109/TPAMI.2015.2389824
- J. Dai, Y. Li, K. He, and J. Sun. R-fcn: Object detection via region-based fully convolutional networks. Advances in Neural Information Processing Systems, 2016, 29.
- J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788, IEEE Computer Society, 2016. doi: 10.1109/CVPR.2016.91
- J. Redmon and A. Farhadi, “Yolo9000: Better, faster, stronger,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271, 2017. doi: 10.1109/CVPR.2017.690
- S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Z. Li, “Single-shot refinement neural network for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4203–4212, 2018. doi: 10.1109/CVPR.2018.00442
- J. Redmon and A. Farhadi, “Yolov3: An incremental improvement,” arXiv preprint arXiv: 1804.02767, 2018.
- A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “Yolov4: Optimal speed and accuracy of object detection,” arXiv preprint arXiv: 2004.10934, 2020.
- G. Jocher, A. Stoken, J. Borovec, A. Chaurasia, L. Changyu, A. Hogan, J. Hajek, L. Diaconu, Y. Kwon, Y. Defretin, et al., “ultralytics/yolov5: v5.0 - YOLOv5-P6 1280 models, AWS, Supervise.ly and YouTube integrations,” Zenodo, 2021.
- Q. Chen, Y. Wang, T. Yang, X. Zhang, J. Cheng, and J. Sun, “You only look one-level feature,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13039–13048, 2021. doi: 10.1109/CVPR46437.2021.01284
- X. Huang, X. Wang, W. Lv, X. Bai, X. Long, K. Deng, Q. Dang, S. Han, Q. Liu, X. Hu, et al., “Pp-yolov2: A practical object detector,” arXiv preprint arXiv: 2104.10419, 2021.
- X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, “Deformable detr: Deformable transformers for end-to-end object detection,” arXiv preprint arXiv: 2010.04159, 2020.
- C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7464–7475, 2023. doi: 10.1109/CVPR52729.2023.00721
- H. Law and J. Deng. Cornernet: Detecting objects as paired keypoints. International Journal of Computer Vision, 2020, 128(3): 642−656. doi: 10.1007/s11263-019-01204-1
- E. H. Nguyen, H. Yang, R. Deng, Y. Lu, Z. Zhu, J. T. Roland, L. Lu, B. A. Landman, A. B. Fogo, and Y. Huo. Circle representation for medical object detection. IEEE Transactions on Medical Imaging, 2022, 41(3): 746−754. doi: 10.1109/TMI.2021.3122835
- Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, “Yolox: Exceeding yolo series in 2021,” arXiv preprint arXiv: 2107.08430, 2021.
- C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, et al., “Yolov6: A single-stage object detection framework for industrial applications,” arXiv preprint arXiv: 2209.02976, 2022.
- S. Xu, X. Wang, W. Lv, Q. Chang, C. Cui, K. Deng, G. Wang, Q. Dang, S. Wei, Y. Du, and B. Lai, “Pp-yoloe: An evolved version of yolo,” 2022.
- S. Liu, F. Li, H. Zhang, X. Yang, X. Qi, H. Su, J. Zhu, and L. Zhang, “DAB-DETR: Dynamic anchor boxes are better queries for DETR,” arXiv preprint arXiv: 2201.12329, 2022.
- H. Zhang, F. Li, S. Liu, L. Zhang, H. Su, J. Zhu, L. M. Ni, and H.-Y. Shum, “Dino: Detr with improved denoising anchor boxes for end-to-end object detection,” arXiv preprint arXiv: 2203.03605, 2022.
- F. Li, H. Zhang, H. Xu, S. Liu, L. Zhang, L. M. Ni, and H.-Y. Shum, “Mask dino: Towards a unified transformer-based framework for object detection and segmentation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3041–3050, 2023. doi: 10.1109/CVPR52729.2023.00297
- S. Liu, Z. Zeng, T. Ren, F. Li, H. Zhang, J. Yang, C. Li, J. Yang, H. Su, J. Zhu, and L. Zhang, “Grounding dino: Marrying dino with grounded pre-training for open-set object detection,” 2024. doi: 10.1007/978-3-031-72970-6_3
- N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-end object detection with transformers,” in European Conference on Computer Vision (ECCV), pp. 213–229, Springer, 2020. doi: 10.1007/978-3-030-58452-8_13
- Y. Fang, B. Liao, X. Wang, J. Fang, J. Qi, R. Wu, J. Niu, and W. Liu. You only look at one sequence: Rethinking transformer in vision through object detection. Advances in Neural Information Processing Systems, 2021, 34: 26183−26197.
- C.-Y. Wang, I.-H. Yeh, and H.-Y. M. Liao, “You only learn one representation: Unified network for multiple tasks,” arXiv preprint arXiv: 2105.04206, 2021.
- X. Zhou, R. Girdhar, A. Joulin, P. Krähenbühl, and I. Misra, “Detecting twenty-thousand classes using image-level supervision,” in Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part IX, pp. 350–368, Springer, 2022. doi: 10.1007/978-3-031-20077-9_21
- F. Li, H. Zhang, S. Liu, J. Guo, L. M. Ni, and L. Zhang, “Dn-detr: Accelerate detr training by introducing query denoising,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13619–13627, 2022. doi: 10.1109/CVPR52688.2022.01325
- M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results.” http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html
- G. Ghiasi, T.-Y. Lin, and Q. V. Le, “Nas-fpn: Learning scalable feature pyramid architecture for object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7036–7045, 2019. doi: 10.1109/CVPR.2019.00720
- K. Duan, S. Bai, L. Xie, H. Qi, Q. Huang, and Q. Tian, “Centernet: Keypoint triplets for object detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6569–6578, 2019. doi: 10.1109/ICCV.2019.00667
- S. He, R. W. Lau, and Q. Yang, “Exemplar-driven top-down saliency detection via deep association,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5723–5732, 2016. doi: 10.1109/CVPR.2016.617
- G. Lee, Y.-W. Tai, and J. Kim, “Deep saliency with encoded low level distance map and high level features,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 660–668, 2016. doi: 10.1109/CVPR.2016.78
- L. Wang, H. Lu, Y. Wang, M. Feng, D. Wang, B. Yin, and X. Ruan, “Learning to detect salient objects with image-level supervision,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3796–3805, 2017. doi: 10.1109/CVPR.2017.404
- X. Zhang, T. Wang, J. Qi, H. Lu, and G. Wang, “Progressive attention guided recurrent network for salient object detection,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 714–722, 2018. doi: 10.1109/CVPR.2018.00081
- Z. Wu, L. Su, and Q. Huang, “Cascaded partial decoder for fast and accurate salient object detection,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3902–3911, 2019. doi: 10.1109/CVPR.2019.00403
- J.-J. Liu, Q. Hou, M.-M. Cheng, J. Feng, and J. Jiang, “A simple pooling-based design for real-time salient object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3917–3926, 2019. doi: 10.1109/CVPR.2019.00404
- X. Qin, Z. Zhang, C. Huang, C. Gao, M. Dehghan, and M. Jagersand, “Basnet: Boundary-aware salient object detection,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7471–7481, 2019. doi: 10.1109/CVPR.2019.00766
- J. Zhao, J.-J. Liu, D.-P. Fan, Y. Cao, J. Yang, and M.-M. Cheng, “Egnet: Edge guidance network for salient object detection,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 8778–8787, 2019. doi: 10.1109/ICCV.2019.00887
- J. Wei, S. Wang, Z. Wu, C. Su, Q. Huang, and Q. Tian, “Label decoupling framework for salient object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13025–13034, 2020. doi: 10.1109/CVPR42600.2020.01304
- Y. Pang, X. Zhao, L. Zhang, and H. Lu, “Multi-scale interactive network for salient object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9413–9422, 2020. doi: 10.1109/CVPR42600.2020.00943
- J. Zhang, D.-P. Fan, Y. Dai, S. Anwar, F. S. Saleh, T. Zhang, and N. Barnes, “Uc-net: Uncertainty inspired rgb-d saliency detection via conditional variational autoencoders,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8582–8591, 2020. doi: 10.1109/CVPR42600.2020.00861
- X. Hu, C.-W. Fu, L. Zhu, T. Wang, and P.-A. Heng. Sac-net: Spatial attenuation context for salient object detection. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 31(3): 1079−1090. doi: 10.1109/TCSVT.2020.2995220
- B. Xu, H. Liang, R. Liang, and P. Chen, “Locate globally, segment locally: A progressive architecture with knowledge review network for salient object detection,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 3004–3012, 2021. doi: 10.1609/aaai.v35i4.16408
- L. Tang, B. Li, Y. Zhong, S. Ding, and M. Song, “Disentangled high quality salient object detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3580–3590, 2021. doi: 10.1109/ICCV48922.2021.00356
- Z. Wu, L. Su, and Q. Huang. Decomposition and completion network for salient object detection. IEEE Transactions on Image Processing, 2021, 30: 6226−6239. doi: 10.1109/TIP.2021.3093380
- Y.-H. Wu, Y. Liu, L. Zhang, M.-M. Cheng, and B. Ren. Edn: Salient object detection via extremely-downsampled network. IEEE Transactions on Image Processing, 2022, 31: 3125−3136. doi: 10.1109/TIP.2022.3164550
- R. Cong, K. Zhang, C. Zhang, F. Zheng, Y. Zhao, Q. Huang, and S. Kwong. Does thermal really always matter for rgb-t salient object detection?. IEEE Transactions on Multimedia, 2022, 25: 6971−6982 doi: 10.1109/TMM.2022.3216476
- M. S. Lee, W. Shin, and S. W. Han. Tracer: Extreme attention guided salient object tracing network (student abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36: 12993−12994. doi: 10.1609/aaai.v36i11.21633
- Y. Wang, R. Wang, X. Fan, T. Wang, and X. He, “Pixels, regions, and objects: Multiple enhancement for salient object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10031–10040, 2023. doi: 10.1109/CVPR52729.2023.00967
- Z. Liu, Y. Tan, Q. He, and Y. Xiao. Swinnet: Swin transformer drives edge-aware rgb-d and rgb-t salient object detection. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(7): 4486−4497. doi: 10.1109/TCSVT.2021.3127149
- J. Zhang, J. Xie, N. Barnes, and P. Li. Learning generative vision transformer with energy-based latent space for saliency prediction. Advances in Neural Information Processing Systems, 2021, 34: 15448−15463
- Y. K. Yun and W. Lin, “Selfreformer: Self-refined network with transformer for salient object detection,” arXiv preprint arXiv: 2205.11283, 2022.
- C. Xie, C. Xia, M. Ma, Z. Zhao, X. Chen, and J. Li, “Pyramid grafting network for one-stage high resolution saliency detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11717–11726, 2022. doi: 10.1109/CVPR52688.2022.01142
- M. Zhuge, D.-P. Fan, N. Liu, D. Zhang, D. Xu, and L. Shao. Salient object detection via integrity learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(3): 3738−3772.
- L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254−1259. doi: 10.1109/34.730558
- T. Zhou, D.-P. Fan, M.-M. Cheng, J. Shen, and L. Shao. Rgb-d salient object detection: A survey. Comput. Vis. Media, 2021, 7(1): 37−69. doi: 10.1007/s41095-020-0199-z
- R. Zhao, W. Ouyang, H. Li, and X. Wang, “Saliency detection by multi-context deep learning,” in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1265–1274, 2015. doi: 10.1109/CVPR.2015.7298731
- J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, 2015. doi: 10.1109/CVPR.2015.7298965
- N. Liu and J. Han, “Dhsnet: Deep hierarchical saliency network for salient object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 678–686, 2016. doi: 10.1109/CVPR.2016.80
- K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” pp. 1–14, Computational and Biological Learning Society, 2015.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016. doi: 10.1109/CVPR.2016.90
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv: 2010.11929, 2020.
- N. Liu, N. Zhang, K. Wan, L. Shao, and J. Han, “Visual saliency transformer,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 4722–4732, 2021. doi: 10.1109/ICCV48922.2021.00468
- W. Wang, E. Xie, X. Li, D.-P. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao. Pvt v2: Improved baselines with pyramid vision transformer. Computational Visual Media, 2022, 8(3): 415−424. doi: 10.1007/s41095-022-0274-8
- Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022, 2021. doi: 10.1109/ICCV48922.2021.00986
- A. Saini and M. Biswas, “Object detection in underwater image by detecting edges using adaptive thresholding,” in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), pp. 628–632, IEEE, 2019. doi: 10.1109/ICOEI.2019.8862794
- F. Han, J. Yao, H. Zhu, and C. Wang. Underwater image processing and object detection based on deep cnn method. Journal of Sensors, 2020, 2020(1): 6707328. doi: 10.1155/2020/6707328
- Z. Liu, Y. Zhuang, P. Jia, C. Wu, H. Xu, and Z. Liu. A novel underwater image enhancement algorithm and an improved underwater biological detection pipeline. Journal of Marine Science and Engineering, 2022, 10(9): 1204. doi: 10.3390/jmse10091204
- P. Athira, T. Mithun Haridas, and M. Supriya, “Underwater object detection model based on yolov3 architecture using deep neural networks,” in 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS), vol. 1, pp. 40–45, 2021. doi: 10.1109/ICACCS51430.2021.9441905
- C. Li, C. Guo, W. Ren, R. Cong, J. Hou, S. Kwong, and D. Tao. An underwater image enhancement benchmark dataset and beyond. IEEE Transactions on Image Processing, 2020, 29: 4376−4389. doi: 10.1109/TIP.2019.2955241
- X. Li, F. Li, J. Yu, and G. An, “A high-precision underwater object detection based on joint self-supervised deblurring and improved spatial transformer network,” arXiv preprint arXiv: 2203.04822, 2022.
- L. Chen, Z. Jiang, L. Tong, Z. Liu, A. Zhao, Q. Zhang, J. Dong, and H. Zhou. Perceptual underwater image enhancement with deep learning and physical priors. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 31(8): 3078−3092. doi: 10.1109/TCSVT.2020.3035108
- L. Jiang, Y. Wang, Q. Jia, S. Xu, Y. Liu, X. Fan, H. Li, R. Liu, X. Xue, and R. Wang. Underwater species detection using channel sharpening attention. Proceedings of the 29th ACM International Conference on Multimedia, 2021: 4259−4267.
- C. Yeh, C. Lin, L. Kang, C. Huang, M. Lin, C. Chang, and C. Wang. Lightweight deep neural network for joint learning of underwater object detection and color conversion. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33: 6129−6143 doi: 10.1109/TNNLS.2021.3072414
- T.-S. Pan, H.-C. Huang, J.-C. Lee, and C.-H. Chen. Multi-scale resnet for real-time underwater object detection. Signal, Image and Video Processing, 2021, 15: 941−949. doi: 10.1007/s11760-020-01818-w
- K. Hu, F. Lu, M. Lu, Z. Deng, and Y. Liu. A marine object detection algorithm based on ssd and feature enhancement. Complexity, 2020, 2020: 1−14 doi: 10.1155/2020/5476142
- M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mobilenetv2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520, 2018. doi: 10.1109/CVPR.2018.00474
- A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” 2017.
- W. Hao and N. Xiao, “Research on underwater object detection based on improved yolov4,” in 2021 8th International Conference on Information, Cybernetics, and Computational Social Systems (ICCSS), pp. 166–171, IEEE, 2021. doi: 10.1109/ICCSS53909.2021.9722013
- Y. Yu, J. Zhao, Q. Gong, C. Huang, G. Zheng, and J. Ma. Real-time underwater maritime object detection in side-scan sonar images based on transformer-yolov5. Remote Sensing, 2021, 13(18): 3555. doi: 10.3390/rs13183555
- R. B. Fisher, Y.-H. Chen-Burger, D. Giordano, L. Hardman, F.-P. Lin, et al., Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data, vol. 104. Springer, 2016. doi: 10.1007/978-3-319-30208-9
- L. Chen, Z. Liu, L. Tong, Z. Jiang, S. Wang, J. Dong, and H. Zhou, “Underwater object detection using invert multi-class adaboost with deep learning,” in 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, IEEE, 2020. doi: 10.1109/IJCNN48605.2020.9207506
- B. Fan, W. Chen, Y. Cong, and J. Tian, “Dual refinement underwater object detection network,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX, pp. 275–291, Springer, 2020. doi: 10.1007/978-3-030-58565-5_17
- C. Liu, Z. Wang, S. Wang, T. Tang, Y. Tao, C. Yang, H. Li, X. Liu, and X. Fan. A new dataset, poisson gan and aquanet for underwater object grabbing. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 32(5): 2831−2844. doi: 10.1109/TCSVT.2021.3100059
- C. Liu, H. Li, S. Wang, M. Zhu, D. Wang, X. Fan, and Z. Wang, “A dataset and benchmark of underwater object detection for robot picking,” in 2021 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1–6, IEEE, 2021. doi: 10.1109/ICMEW53276.2021.9455997
- R. Liu, X. Fan, M. Zhu, M. Hou, and Z. Luo. Real-world underwater enhancement: Challenges, benchmarks, and solutions under natural light. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(12): 4861−4875. doi: 10.1109/TCSVT.2019.2963772
- L. Hong, X. Wang, G. Zhang, and M. Zhao. Usod10k: a new benchmark dataset for underwater salient object detection. IEEE transactions on image processing, 2023, 1−1. doi: 10.1109/TIP.2023.3266163
- M. Pedersen, J. Bruslund Haurum, R. Gade, and T. B. Moeslund, “Detection of marine animals in a new underwater dataset with varying visibility,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 18–26, 2019.
- M. Fulton, J. Hong, M. J. Islam, and J. Sattar, “Robotic detection of marine litter using deep visual detection models,” in 2019 International Conference on Robotics and Automation (ICRA), pp. 5752–5758, IEEE, 2019. doi: 10.1109/ICRA.2019.8793975
- M. Jian, Q. Qi, H. Yu, J. Dong, C. Cui, X. Nie, H. Zhang, Y. Yin, and K.-M. Lam. The extended marine underwater environment database and baseline evaluations. Applied Soft Computing, 2019, 80: 425−437. doi: 10.1016/j.asoc.2019.04.025
- J. Hong, M. Fulton, and J. Sattar, “Trashcan: A semantically-segmented dataset towards visual detection of marine debris,” arXiv preprint arXiv: 2007.08097, 2020.
- M. J. Islam, C. Edge, Y. Xiao, P. Luo, M. Mehtaz, C. Morse, S. S. Enan, and J. Sattar, “Semantic segmentation of underwater imagery: Dataset and benchmark,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1769–1776, IEEE, 2020. doi: 10.1109/IROS45743.2020.9340821
- M. Jian, Q. Qi, J. Dong, Y. Yin, W. Zhang, and K.-M. Lam, “The ouc-vision large-scale underwater image database,” in 2017 IEEE International Conference on Multimedia and Expo (ICME), pp. 1297–1302, IEEE, 2017. doi: 10.1109/ICME.2017.8019324
- M. Islam, P. Luo, and J. Sattar, “Simultaneous enhancement and super-resolution of underwater imagery for improved visual perception,” in Robotics: Science and Systems (M. Toussaint, A. Bicchi, and T. Hermans, eds.), MIT Press Journals, 2020.
- M. J. Islam, R. Wang, and J. Sattar, “Svam: Saliency-guided visual attention modeling by autonomous underwater robots,” in Robotics: Science and Systems (RSS), NY, USA, 2022. doi: 10.15607/RSS.2022.XVIII.048
- D. M. Powers, “Evaluation: From precision, recall and f-measure to roc, informedness, markedness and correlation,” arXiv preprint arXiv: 2010.16061, 2020.
- D. L. Olson and D. Delen, Advanced Data Mining Techniques. Springer Science & Business Media, 2008.
- M. A. Rahman and Y. Wang, “Optimizing intersection-over-union in deep neural networks for image segmentation,” in Proc. Int. Symp. Vis. Comput., pp. 234–244, Springer, 2016. doi: 10.1007/978-3-319-50835-1_22
- F. Perazzi, P. Krähenbühl, Y. Pritch, and A. Hornung, “Saliency filters: Contrast based filtering for salient region detection,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 733–740, IEEE, 2012. doi: 10.1109/CVPR.2012.6247743
- D.-P. Fan, M.-M. Cheng, Y. Liu, T. Li, and A. Borji, “Structure-measure: A new way to evaluate foreground maps,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 4548–4557, 2017. doi: 10.1109/ICCV.2017.487
- D.-P. Fan, C. Gong, Y. Cao, B. Ren, M.-M. Cheng, and A. Borji, “Enhanced-alignment measure for binary foreground map evaluation,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), pp. 698–704, International Joint Conferences on Artificial Intelligence Organization, July 2018. doi: 10.24963/ijcai.2018/97
- lartpang Pang, “Pysodevaltoolkit.” https://github.com/lartpang/PySODEvalToolkit, 2022.
- P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. IEEE transactions on pattern analysis and machine intelligence, 2010, 33(5): 898−916 doi: 10.1109/TPAMI.2010.161
- R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, “Frequency-tuned salient region detection,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1597–1604, IEEE, 2009. doi: 10.1109/CVPR.2009.5206596
- R. Margolin, L. Zelnik-Manor, and A. Tal, “How to evaluate foreground maps?,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2014. doi: 10.1109/CVPR.2014.39
- B. Sekachev, A. Zhavoronkov, and N. Manovich, “Computer vision annotation tool.” Website, 2019. https://github.com/opencv/cvat
- Z. Cai and N. Vasconcelos, “Cascade r-cnn: Delving into high quality object detection,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6154–6162, 2018. doi: 10.1109/CVPR.2018.00644
- X. Lu, B. Li, Y. Yue, Q. Li, and J. Yan, “Grid r-cnn,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7355–7364, 2019. doi: 10.1109/CVPR.2019.00754
- Z. Tian, C. Shen, H. Chen, and T. He, “Fcos: Fully convolutional one-stage object detection,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9626–9635, IEEE, 2019. doi: 10.1109/ICCV.2019.00972
- S. Zhang, C. Chi, Y. Yao, Z. Lei, and S. Z. Li, “Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9756–9765, 2020. doi: 10.1109/CVPR42600.2020.00978
- K. Chen, J. Wang, J. Pang, Y. Cao, Y. Xiong, X. Li, S. Sun, W. Feng, Z. Liu, J. Xu, et al., “Mmdetection: Open mmlab detection toolbox and benchmark,” arXiv preprint arXiv: 1906.07155, 2019.
- S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv: 1609.04747, 2016.
- H. Luo, Y. Gu, X. Liao, S. Lai, and W. Jiang, “Bag of tricks and a strong baseline for deep person re-identification,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1487–1495, 2019. doi: 10.1109/CVPRW.2019.00190
- Z. Wu, L. Su, and Q. Huang, “Stacked cross refinement network for edge-aware salient object detection,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 7263–7272, 2019. doi: 10.1109/ICCV.2019.00736
- A. Li, J. Zhang, Y. Lyu, B. Liu, T. Zhang, and Y. Dai, “Uncertainty-aware joint salient object and camouflaged object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10071–10081, 2021. doi: 10.1109/CVPR46437.2021.00994