TY - GEN
T1 - Learning human search strategies from a crowdsourcing game
AU - Sexton, Thurston
AU - Ren, Max Yi
N1 - Funding Information:
This work has been supported by the National Science Foundation under Grant No. CMMI-1266184 and by start-up funding from Arizona State University. This support is gratefully acknowledged.
Publisher Copyright:
© Copyright 2016 by ASME.
PY - 2016
Y1 - 2016
N2 - There is evidence that humans can be more efficient than existing algorithms at searching for good solutions in high-dimensional and non-convex design or control spaces, potentially due to our prior knowledge and learning capability. This work attempts to quantify the search strategy of human beings in order to enhance a Bayesian optimization (BO) algorithm for an optimal design and control problem. We treat the sequence of human solutions as if it were generated by BO, and propose to recover the algorithmic parameters of BO by maximizing the likelihood of the observed solution path. The method differs from inverse reinforcement learning (where an optimal control solution is learned from human demonstrations) in that the latter requires near-optimal solutions from humans, whereas we only require the existence of a good search strategy. The method is first verified through simulation studies and then applied to human solutions crowdsourced through a gamification of the problem under study [1]. We learn BO parameters from a player with a demonstrated good search strategy and show that applying the BO algorithm with these parameters to the game noticeably improves the convergence of the search compared with a default BO setting.
AB - There is evidence that humans can be more efficient than existing algorithms at searching for good solutions in high-dimensional and non-convex design or control spaces, potentially due to our prior knowledge and learning capability. This work attempts to quantify the search strategy of human beings in order to enhance a Bayesian optimization (BO) algorithm for an optimal design and control problem. We treat the sequence of human solutions as if it were generated by BO, and propose to recover the algorithmic parameters of BO by maximizing the likelihood of the observed solution path. The method differs from inverse reinforcement learning (where an optimal control solution is learned from human demonstrations) in that the latter requires near-optimal solutions from humans, whereas we only require the existence of a good search strategy. The method is first verified through simulation studies and then applied to human solutions crowdsourced through a gamification of the problem under study [1]. We learn BO parameters from a player with a demonstrated good search strategy and show that applying the BO algorithm with these parameters to the game noticeably improves the convergence of the search compared with a default BO setting.
UR - http://www.scopus.com/inward/record.url?scp=85008234786&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85008234786&partnerID=8YFLogxK
U2 - 10.1115/DETC2016-59775
DO - 10.1115/DETC2016-59775
M3 - Conference contribution
AN - SCOPUS:85008234786
T3 - Proceedings of the ASME Design Engineering Technical Conference
BT - 42nd Design Automation Conference
PB - American Society of Mechanical Engineers (ASME)
T2 - ASME 2016 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC/CIE 2016
Y2 - 21 August 2016 through 24 August 2016
ER -