DialCrowd is part of DialPort, an ongoing NSF-funded research project built by the Dialog Research Center (DialRC) at Carnegie Mellon's Language Technologies Institute. DialCrowd is a dialog crowdsourcing tool whose primary aim is to guide requesters toward high-quality data collection.

Its straightforward, guided interface reduces the time requesters spend on the more routine aspects of creating a task and produces a cleanly organized task for the worker. Quality control tasks, such as duplicated tasks and golden data, can be added so that requesters can easily compare workers’ results to a standard. Other quality tools flag work based on time, patterns, and agreement. Overall, DialCrowd aims to provide a seamless, positive experience for both the requester and the worker, leading to higher-quality tasks and data. Currently, 6 sites use our toolkit.

You can check out DialCrowd and our paper.

Mechanisms for Data Quality

The statistics that requesters can view for completed tasks include tools that flag work that may need to be rechecked. These tools check the time spent on the task, any patterns present (e.g., always choosing A), agreement among workers, and quality control tasks (duplicated tasks, golden data).

Outlier and Anomaly Detection

DialCrowd tracks the time a worker takes to complete a task. If the time is significantly shorter or longer than the average, the task may have been done by a bot.
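
As a rough illustration of time-based flagging, the Python sketch below marks submissions whose completion time lies far from the mean. The z-score rule, the default `z_threshold`, and the function name `flag_time_outliers` are assumptions made for this example; the description above does not specify how DialCrowd decides that a time is significantly different.

```python
from statistics import mean, pstdev


def flag_time_outliers(durations, z_threshold=2.0):
    """Return indices of durations far from the mean.

    Assumption: "significantly shorter or longer" is modeled as a
    z-score whose magnitude exceeds z_threshold. This is a hypothetical
    rule, not DialCrowd's documented one.
    """
    if len(durations) < 2:
        return []  # nothing to compare against
    mu = mean(durations)
    sigma = pstdev(durations)
    if sigma == 0:
        return []  # every submission took the same time
    return [
        i for i, seconds in enumerate(durations)
        if abs(seconds - mu) / sigma > z_threshold
    ]
```

With ten submissions of roughly one minute each and one of five seconds, only the five-second submission is flagged. A single extreme value inflates the mean and standard deviation, so a median-based rule may be more robust on small batches.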

DialCrowd also detects abnormal patterns in the annotations. For example, a submission where the first choice is selected for all of the questions is abnormal.
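
One simple way to catch such patterns is to test whether a worker's answers are just a short cycle repeated. The sketch below is a minimal example under that assumption; the function name `has_repeating_pattern` and the `max_period` parameter are hypothetical and not taken from DialCrowd.

```python
def has_repeating_pattern(answers, max_period=3):
    """Return True if answers repeat a cycle of length <= max_period.

    A period of 1 covers the "same choice for every question" case;
    longer periods catch sequences such as A, B, A, B, ...
    Assumption: the cycle must occur at least twice to count.
    """
    n = len(answers)
    for period in range(1, max_period + 1):
        if n < 2 * period:
            continue  # too short to show this cycle twice
        if all(answers[i] == answers[i % period] for i in range(n)):
            return True
    return False
```

A submission of five "A" answers and an alternating A, B, A, B, A, B submission would both be flagged, while a mixed sequence such as A, C, B, A, D would not. Short tasks can produce such sequences honestly, so a flag is best treated as a prompt for review rather than proof of careless work.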

Inter-worker Agreement

DialCrowd measures the inter-worker agreement for each task unit. This helps you detect annotations that differ from the majority, which are likely to be of lower quality.
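
The idea can be sketched with a majority vote per task unit. The example below assumes agreement is measured as the share of workers who chose the most common label; DialCrowd may use a different agreement statistic, and the function name `unit_agreement` is hypothetical.

```python
from collections import Counter


def unit_agreement(labels_by_worker):
    """Summarize agreement on one task unit.

    labels_by_worker maps a worker id to that worker's label.
    Returns (majority_label, agreement, dissenters), where agreement is
    the fraction of workers who chose the most common label. If the top
    labels are tied, there is no unique majority, so majority_label is
    None and no worker is listed as a dissenter.
    """
    if not labels_by_worker:
        raise ValueError("at least one annotation is required")
    ranked = Counter(labels_by_worker.values()).most_common()
    top_label, top_count = ranked[0]
    agreement = top_count / len(labels_by_worker)
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None, agreement, []
    dissenters = sorted(
        worker for worker, label in labels_by_worker.items()
        if label != top_label
    )
    return top_label, agreement, dissenters
```

For three workers who label a unit "positive", "positive", and "negative", the majority label is "positive", the agreement is two thirds, and the third worker is listed as a dissenter.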

Quality Control Tasks

DialCrowd allows requesters to include golden and duplicated tasks. Golden tasks are tasks that already have ground truth answers. Duplicated tasks are tasks that are shown to a worker twice in the same session. By checking a worker's annotations on these tasks, DialCrowd can detect annotations that were done less carefully.
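
Both checks reduce to comparing a worker's answers against a reference: the ground truth for golden tasks, and the worker's own earlier answer for duplicated tasks. The sketch below shows one possible scoring. The function names and the dictionary layout (task id mapped to answer) are assumptions for illustration, not DialCrowd's data format.

```python
def golden_accuracy(answers, golden):
    """Fraction of golden tasks a worker answered correctly.

    answers and golden both map a task id to a label. Returns None if
    the worker answered no golden task.
    """
    scored = [task for task in golden if task in answers]
    if not scored:
        return None
    correct = sum(answers[task] == golden[task] for task in scored)
    return correct / len(scored)


def duplicate_consistency(first_pass, second_pass):
    """Fraction of duplicated tasks answered the same way both times.

    first_pass and second_pass map a task id to the label the worker
    gave on the first and second showing. Returns None if no task
    appears in both.
    """
    shared = [task for task in first_pass if task in second_pass]
    if not shared:
        return None
    same = sum(first_pass[task] == second_pass[task] for task in shared)
    return same / len(shared)
```

A requester could then review workers whose scores fall below a cutoff of their choosing; the description above does not state what cutoff, if any, DialCrowd applies.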


Here are some examples. They show that DialCrowd provides a clear, working interface for the worker, improving the worker’s experience and giving the requester the data they need.