TY - GEN
T1 - Active Object Perceiver: Recognition-Guided Policy Learning for Object Searching on Mobile Robots
T2 - 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018
AU - Ye, Xin
AU - Lin, Zhe
AU - Li, Haoxiang
AU - Zheng, Shibin
AU - Yang, Yezhou
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/27
Y1 - 2018/12/27
N2 - We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs. While scene-driven visual navigation has been widely studied, prior efforts on learning navigation policies for robots to find objects are limited. The problem is often more challenging than target scene finding as the target objects can be very small in the view and can be in an arbitrary pose. We approach the problem from an active perceiver perspective, and propose a novel framework that integrates a deep neural network based object recognition module and a deep reinforcement learning based action prediction mechanism. To validate our method, we conduct experiments on both a simulation dataset (AI2-THOR) and a real-world environment with a physical robot. We further propose a new decaying reward function to learn the control policy specific to the object searching task. Experimental results validate the efficacy of our method, which outperforms competing methods in both average trajectory length and success rate.
AB - We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs. While scene-driven visual navigation has been widely studied, prior efforts on learning navigation policies for robots to find objects are limited. The problem is often more challenging than target scene finding as the target objects can be very small in the view and can be in an arbitrary pose. We approach the problem from an active perceiver perspective, and propose a novel framework that integrates a deep neural network based object recognition module and a deep reinforcement learning based action prediction mechanism. To validate our method, we conduct experiments on both a simulation dataset (AI2-THOR) and a real-world environment with a physical robot. We further propose a new decaying reward function to learn the control policy specific to the object searching task. Experimental results validate the efficacy of our method, which outperforms competing methods in both average trajectory length and success rate.
UR - http://www.scopus.com/inward/record.url?scp=85062969692&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062969692&partnerID=8YFLogxK
U2 - 10.1109/IROS.2018.8593720
DO - 10.1109/IROS.2018.8593720
M3 - Conference contribution
AN - SCOPUS:85062969692
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 6857
EP - 6863
BT - 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 October 2018 through 5 October 2018
ER -