TY - GEN
T1 - Challenges of data collection on MTurk
T2 - IISE Annual Conference and Expo 2021
AU - Mancenido, Michelle
AU - Salehi, Pouria
AU - Chiou, Erin
AU - Mosallanezhad, Ahmadreza
AU - Shah, Aksheshkumar
AU - Cohen, Myke
N1 - Funding Information:
This material is based upon work supported by the U.S. Department of Homeland Security under Grant Award Number 17STQAC00001-04-00. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Department of Homeland Security.
Publisher Copyright:
© 2021 IISE Annual Conference and Expo 2021. All rights reserved.
PY - 2021
Y1 - 2021
AB - During the COVID-19 pandemic, many human-subject studies have stopped in-person data collection and shifted to virtual platforms like Amazon Mechanical Turk (MTurk). This shift involves important considerations for study design and data analysis, particularly for studies involving behavioral assessment and performance with technology. We report on lessons learned from a recent study that used MTurk for a face-matching task with an open-source AI. Participants received $5 compensation for completing a 45-minute session that included questionnaires. To help address data validity issues, Qualtrics fraud-detection features (i.e., reCAPTCHA, ID-Fraud), trap items (e.g., "Respond with 'Often'"), and a modified batch randomization process were employed. Participants' cumulative accuracy and response rates were also assessed. Out of 272 participants, 121 passed the data inclusion criteria. Questionnaire reliability was within the acceptable range (average 0.78) for the retained (valid) dataset. Cumulative accuracy in the face-matching task decreased approximately halfway through the task. Subsequent data inspection revealed that almost half of the participants spent longer than 20 seconds, and up to 12 minutes, on at least one image pair. It is possible that participants were interrupted during the study or elected to take unscheduled breaks. Environmental factors that were easier to control during in-person laboratory studies now require built-in controls in virtual study environments. We learned that: (1) it is imperative to monitor performance measures over time for each participant; (2) study duration may need to be kept shorter on virtual platforms than in in-person studies; and (3) an optional, planned break during the task might help prevent other unplanned breaks.
KW - Crowdsourcing
KW - Face verification
KW - Human-AI joint decision systems
UR - http://www.scopus.com/inward/record.url?scp=85120952485&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85120952485&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85120952485
T3 - IISE Annual Conference and Expo 2021
SP - 175
EP - 180
BT - IISE Annual Conference and Expo 2021
A2 - Ghate, A.
A2 - Krishnaiyer, K.
A2 - Paynabar, K.
PB - Institute of Industrial and Systems Engineers, IISE
Y2 - 22 May 2021 through 25 May 2021
ER -