TY - CPAPER
T1 - To Train or Not to Train? How Training Affects the Diversity of Crowdsourced Data
AU - Ogunseye, Shawn
AU - Parsons, Jeffrey
AU - Lukyanenko, Roman
PY - 2020
N2 - Organizations and individuals who use crowdsourcing to collect data prefer knowledgeable contributors. They train recruited contributors, expecting them to provide better quality data than untrained contributors. However, selective attention theory suggests that, as people learn the characteristics of a thing, they focus on only those characteristics needed to identify the thing, ignoring others. In observational crowdsourcing, selective attention might reduce data diversity, limiting opportunities to repurpose and make discoveries from the data. We examine how training affects the diversity of data in a citizen science experiment. Contributors, divided into explicitly and implicitly trained groups and an untrained (control) group, reported artificial insect sightings in a simulated crowdsourcing task. We found that trained contributors reported less diverse data than untrained contributors, and explicit (rule-based) training resulted in less diverse data than implicit (exemplar-based) training. We conclude by discussing implications for designing observational crowdsourcing systems to promote data repurposability.
M3 - Conference contribution
BT - International Conference on Information Systems
ER -