MoNet3D: Towards accurate monocular 3D object localization in real time

Xichuan Zhou; Yicong Peng; Chunqiao Long; Fengbo Ren; Cong Shi

Engineering, Ira A. Fulton Schools of (IAFSE)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Monocular multi-object detection and localization in 3D space has been proven to be a challenging task. The MoNet3D algorithm is a novel and effective framework that can predict the 3D position of each object in a monocular image and draw a 3D bounding box for each object. The MoNet3D method incorporates prior knowledge of the spatial geometric correlation of neighbouring objects into the deep neural network training process to improve the accuracy of 3D object localization. Experiments on the KITTI dataset show that the accuracy for predicting the depth and horizontal coordinates of objects in 3D space can reach 96.25% and 94.74%, respectively. Moreover, the method can realize real-time image processing at 27.85 FPS, showing promising potential for embedded advanced driving-assistance system applications. Our code is publicly available at https://github.com/CQUlearningsystemgroup/YicongPeng.

Original language	English (US)
Title of host publication	37th International Conference on Machine Learning, ICML 2020
Editors	Hal Daume, Aarti Singh
Publisher	International Machine Learning Society (IMLS)
Pages	11440-11449
Number of pages	10
ISBN (Electronic)	9781713821120
State	Published - 2020
Event	37th International Conference on Machine Learning, ICML 2020 - Virtual, Online. Duration: Jul 13 2020 → Jul 18 2020

Publication series

Name	37th International Conference on Machine Learning, ICML 2020
Volume	PartF168147-15

Conference

Conference	37th International Conference on Machine Learning, ICML 2020
City	Virtual, Online
Period	7/13/20 → 7/18/20

ASJC Scopus subject areas

Computational Theory and Mathematics
Human-Computer Interaction
Software

Cite this

Zhou, X., Peng, Y., Long, C., Ren, F., & Shi, C. (2020). MoNet3D: Towards accurate monocular 3D object localization in real time. In H. Daume & A. Singh (Eds.), 37th International Conference on Machine Learning, ICML 2020 (pp. 11440-11449). (37th International Conference on Machine Learning, ICML 2020; Vol. PartF168147-15). International Machine Learning Society (IMLS).

MoNet3D: Towards accurate monocular 3D object localization in real time. / Zhou, Xichuan; Peng, Yicong; Long, Chunqiao et al.
37th International Conference on Machine Learning, ICML 2020. ed. / Hal Daume; Aarti Singh. International Machine Learning Society (IMLS), 2020. p. 11440-11449 (37th International Conference on Machine Learning, ICML 2020; Vol. PartF168147-15).

Zhou, X, Peng, Y, Long, C, Ren, F & Shi, C 2020, MoNet3D: Towards accurate monocular 3D object localization in real time. in H Daume & A Singh (eds), 37th International Conference on Machine Learning, ICML 2020. 37th International Conference on Machine Learning, ICML 2020, vol. PartF168147-15, International Machine Learning Society (IMLS), pp. 11440-11449, 37th International Conference on Machine Learning, ICML 2020, Virtual, Online, 7/13/20.

@inproceedings{641aee7e7a824e6e8b41864e01ff2010,
title = "{MoNet3D}: Towards accurate monocular {3D} object localization in real time",
abstract = "Monocular multi-object detection and localization in 3D space has been proven to be a challenging task. The MoNet3D algorithm is a novel and effective framework that can predict the 3D position of each object in a monocular image and draw a 3D bounding box for each object. The MoNet3D method incorporates prior knowledge of the spatial geometric correlation of neighbouring objects into the deep neural network training process to improve the accuracy of 3D object localization. Experiments on the KITTI dataset show that the accuracy for predicting the depth and horizontal coordinates of objects in 3D space can reach 96.25\% and 94.74\%, respectively. Moreover, the method can realize real-time image processing at 27.85 FPS, showing promising potential for embedded advanced driving-assistance system applications. Our code is publicly available at https://github.com/CQUlearningsystemgroup/YicongPeng.",
author = "Xichuan Zhou and Yicong Peng and Chunqiao Long and Fengbo Ren and Cong Shi",
note = "Funding Information: This work was supported by the National Natural Science Foundation of China under Contract 61971072. Publisher Copyright: {\textcopyright} 2020 by the Authors All rights reserved.; 37th International Conference on Machine Learning, ICML 2020 ; Conference date: 13-07-2020 Through 18-07-2020",
year = "2020",
language = "English (US)",
series = "37th International Conference on Machine Learning, ICML 2020",
publisher = "International Machine Learning Society (IMLS)",
pages = "11440--11449",
editor = "Hal Daume and Aarti Singh",
booktitle = "37th International Conference on Machine Learning, ICML 2020",
}

TY  - GEN
T1  - MoNet3D
T2  - 37th International Conference on Machine Learning, ICML 2020
AU  - Zhou, Xichuan
AU  - Peng, Yicong
AU  - Long, Chunqiao
AU  - Ren, Fengbo
AU  - Shi, Cong
PY  - 2020
Y1  - 2020
N2  - Monocular multi-object detection and localization in 3D space has been proven to be a challenging task. The MoNet3D algorithm is a novel and effective framework that can predict the 3D position of each object in a monocular image and draw a 3D bounding box for each object. The MoNet3D method incorporates prior knowledge of the spatial geometric correlation of neighbouring objects into the deep neural network training process to improve the accuracy of 3D object localization. Experiments on the KITTI dataset show that the accuracy for predicting the depth and horizontal coordinates of objects in 3D space can reach 96.25% and 94.74%, respectively. Moreover, the method can realize real-time image processing at 27.85 FPS, showing promising potential for embedded advanced driving-assistance system applications. Our code is publicly available at https://github.com/CQUlearningsystemgroup/YicongPeng.
AB  - Monocular multi-object detection and localization in 3D space has been proven to be a challenging task. The MoNet3D algorithm is a novel and effective framework that can predict the 3D position of each object in a monocular image and draw a 3D bounding box for each object. The MoNet3D method incorporates prior knowledge of the spatial geometric correlation of neighbouring objects into the deep neural network training process to improve the accuracy of 3D object localization. Experiments on the KITTI dataset show that the accuracy for predicting the depth and horizontal coordinates of objects in 3D space can reach 96.25% and 94.74%, respectively. Moreover, the method can realize real-time image processing at 27.85 FPS, showing promising potential for embedded advanced driving-assistance system applications. Our code is publicly available at https://github.com/CQUlearningsystemgroup/YicongPeng.
UR  - http://www.scopus.com/inward/record.url?scp=85105335531&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85105335531&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85105335531
T3  - 37th International Conference on Machine Learning, ICML 2020
SP  - 11440
EP  - 11449
BT  - 37th International Conference on Machine Learning, ICML 2020
A2  - Daume, Hal
A2  - Singh, Aarti
PB  - International Machine Learning Society (IMLS)
Y2  - 13 July 2020 through 18 July 2020
ER  - 
