DialCrowd is part of the ongoing NSF-funded research project, DialPort, built by the Dialog Research Center (DialRC) at Carnegie Mellon's Language Technologies Institute. The primary aim of DialCrowd is to act as a dialog crowdsourcing tool that provides guidance for high-quality data collection.
It reduces the time requesters spend on the more routine aspects of task creation through a straightforward, guided interface that produces a cleanly organized task for the worker. Quality-control items such as duplicated tasks and golden data are added so requesters can easily compare workers' results against a standard. Other quality tools flag work based on completion time, response patterns, and inter-worker agreement. Overall, DialCrowd aims to provide a seamless, positive experience for both the requester and the worker, leading to higher-quality tasks and data. Currently, six sites are using our toolkit.
DialCrowd enables requesters to avoid directly interacting with HTML code by providing predefined slots that they can fill out in order. Each section on the interface corresponds directly to a section on the finished task.
Clarity is essential if workers are to provide the data that requesters expect to obtain. By giving detailed guidance on writing instructions and examples, DialCrowd helps requesters produce clearer tasks and, in turn, higher-quality data.
The statistics requesters can view on completed tasks include tools that flag work that may need to be checked again. These tools examine the time spent on each task, response patterns (e.g., always choosing A), inter-worker agreement, and the quality-control tasks (duplicated tasks and golden data).
DialCrowd tracks the time taken by a worker to complete a task. If the time is significantly shorter or longer than the average, the task could have been done by a bot.
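As an illustration of this kind of check, the following Python sketch flags completion times that fall far from the average; the function name, threshold, and data layout are assumptions for the example and are not taken from DialCrowd's implementation.

```python
# Illustrative sketch (not DialCrowd's actual code): flag submissions whose
# completion time is far from the mean, given a list of per-task durations.
from statistics import mean, stdev

def flag_by_time(durations_sec, k=2.0):
    """Return indices of submissions more than k standard deviations
    from the average completion time."""
    if len(durations_sec) < 2:
        return []
    mu, sigma = mean(durations_sec), stdev(durations_sec)
    if sigma == 0:
        return []
    return [i for i, t in enumerate(durations_sec)
            if abs(t - mu) > k * sigma]

# Flags the 600-second outlier (index 4) in this toy example.
print(flag_by_time([45, 52, 3, 48, 600, 50]))
```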
DialCrowd also detects abnormal patterns in the annotations. For example, a submission in which the first choice is selected for every question is flagged as abnormal.
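A minimal sketch of such a pattern check is shown below; it covers only the "same answer every time" case mentioned above, and the function name and input format are assumptions for illustration.

```python
# Illustrative sketch (not DialCrowd's actual implementation): flag a worker's
# submission when the same option is chosen for every question.
def has_constant_pattern(choices):
    """True if every answer in a submission is identical, e.g. always 'A'."""
    return len(choices) > 1 and len(set(choices)) == 1

print(has_constant_pattern(["A", "A", "A", "A"]))  # True: suspicious
print(has_constant_pattern(["A", "B", "A", "C"]))  # False
```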
DialCrowd measures inter-worker agreement for each task unit, which helps requesters detect annotations that differ from the majority; such annotations are likely to be of lower quality.
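One simple way to realize this idea is a majority-vote comparison, sketched below under an assumed data layout (task units mapped to per-worker labels); this is an illustration, not DialCrowd's actual agreement metric.

```python
# Illustrative sketch: flag annotations that disagree with the majority label
# for their task unit, assuming labels_by_task = {task_id: {worker_id: label}}.
from collections import Counter

def minority_annotations(labels_by_task):
    """Return (task_id, worker_id) pairs whose label disagrees with the
    majority label for that task unit."""
    flagged = []
    for task_id, worker_labels in labels_by_task.items():
        majority, _ = Counter(worker_labels.values()).most_common(1)[0]
        flagged.extend((task_id, w) for w, lab in worker_labels.items()
                       if lab != majority)
    return flagged

labels = {"t1": {"w1": "pos", "w2": "pos", "w3": "neg"},
          "t2": {"w1": "neg", "w2": "neg", "w3": "neg"}}
print(minority_annotations(labels))  # [('t1', 'w3')]
```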
DialCrowd allows requesters to include golden and duplicated tasks. Golden tasks are tasks that already have ground-truth answers. Duplicated tasks are tasks that are shown to a worker twice in the same session. By checking the annotations on these tasks, DialCrowd can detect work that was done less carefully.
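The checks for both kinds of quality-control task can be illustrated with the short sketch below; the answer dictionary, golden-answer mapping, and duplicate-pair list are hypothetical data structures introduced for the example, not DialCrowd's internal representation.

```python
# Illustrative sketch (assumed data layout, not DialCrowd's code): score a
# worker against golden tasks and against their own duplicated tasks.
def golden_accuracy(answers, golden):
    """Fraction of golden tasks the worker answered correctly."""
    hits = sum(1 for task_id, truth in golden.items()
               if answers.get(task_id) == truth)
    return hits / len(golden) if golden else 1.0

def duplicate_consistency(answers, duplicate_pairs):
    """Fraction of duplicated task pairs answered consistently."""
    same = sum(1 for a, b in duplicate_pairs
               if answers.get(a) == answers.get(b))
    return same / len(duplicate_pairs) if duplicate_pairs else 1.0

answers = {"q1": "A", "q2": "B", "q1_dup": "C", "g1": "A"}
print(golden_accuracy(answers, {"g1": "A"}))               # 1.0
print(duplicate_consistency(answers, [("q1", "q1_dup")]))  # 0.0: inconsistent
```

Low golden accuracy or low duplicate consistency for a worker would then be a signal that their annotations deserve a second look.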