  • Open Access
  • Article
Basic theories and methods of target's height and distance measurement based on monocular vision
  • Jiafa Mao 1,*
  • Lu Zhang 2

Received: 23 Dec 2023 | Accepted: 28 Jun 2024 | Published: 25 Mar 2025

Abstract

Existing object tracking, localization, and measurement technologies mostly rely on dual cameras, or on a single camera combined with non-visual sensors. These approaches achieve precise machine-vision localization by increasing the amount of data processed, at the expense of processing speed. If machine-vision localization is to be achieved without increasing the data-processing load, only a monocular ranging method can be used; monocular ranging is therefore clearly the more challenging problem in practical research. Motivated by this, this paper proposes a novel target height and distance measurement method based on monocular vision. From the geometric model of camera imaging and the basic principle of analog-to-digital signal conversion, we derive a model relating object distance, object height, camera height, image resolution, image target size, and camera parameters. We theoretically prove the infinite solvability of the “self-invariance” case and the solvability of the “self-change” case, which provides a theoretical basis for object tracking, localization, and measurement based on monocular vision. The experimental results confirm the correctness of our theory.
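The distance–height relationship described in the abstract rests on the classic pinhole similar-triangles model: an object of real height H, imaged with pixel height h by a camera whose focal length is f (in pixel units), lies at distance D = f·H/h, and conversely H = D·h/f. A minimal sketch of these relations follows; the function names and all numeric values are illustrative assumptions, not taken from the paper:

```python
def focal_length_pixels(focal_mm: float, sensor_width_mm: float,
                        image_width_px: int) -> float:
    """Convert a physical focal length to pixel units via the pixel pitch,
    which is how image resolution enters the distance model."""
    return focal_mm * image_width_px / sensor_width_mm


def distance_from_height(object_height_m: float, pixel_height: float,
                         f_px: float) -> float:
    """Distance to an object of known real height from its imaged pixel height."""
    return f_px * object_height_m / pixel_height


def height_from_distance(distance_m: float, pixel_height: float,
                         f_px: float) -> float:
    """Object height recovered from a known distance and imaged pixel height."""
    return distance_m * pixel_height / f_px


# Illustrative numbers: an 8 mm lens on a 6.4 mm-wide sensor at 1280 px width
# gives f = 1600 px; a 1.6 m object spanning 160 px then sits 16 m away.
f_px = focal_length_pixels(8.0, 6.4, 1280)
d = distance_from_height(1.6, 160.0, f_px)
h = height_from_distance(d, 160.0, f_px)
```

Note that distance and height cannot both be recovered from a single pixel measurement alone; the paper's contribution is to add constraints (camera height and imaging geometry) that make the system solvable.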

How to Cite
Mao, J.; Zhang, L. Basic theories and methods of target's height and distance measurement based on monocular vision. International Journal of Network Dynamics and Intelligence 2025, 4 (1), 100007. https://doi.org/10.53941/ijndi.2025.100007.
Copyright & License
Copyright (c) 2025 by the authors.