Trust management in crowdsourcing environments
Thesis posted on 28.03.2022, 10:28, authored by Bin Ye
As a cost-effective model for solving problems, crowdsourcing has been widely applied to various human intelligence tasks, such as data labeling, data translation, and prediction. However, without adequate trust management, a large number of untrustworthy workers submit low-quality or even junk answers to benefit themselves or to sabotage their competitors' crowdsourcing processes. Such disturbances and attacks not only significantly increase the cost of solving a task but also drastically reduce the effectiveness of crowdsourcing processes. Therefore, selecting trustworthy workers to participate in tasks has become a top priority in crowdsourcing environments. To select trustworthy workers effectively, three challenging sub-problems have to be tackled: context-aware trust evaluation, spam worker defense, and trustworthy worker recommendation. In this thesis, we systematically propose solutions to these three sub-problems. The main contributions are summarized as follows.

In a crowdsourcing platform, a worker's trustworthiness varies across contexts, which complicates the trust evaluation of a crowdsourcing worker. Thus, we propose a new context-aware trust model that evaluates a worker's trust in two primary crowdsourcing contexts: task type and reward amount. In particular, we first propose a task type taxonomy and a task reward amount taxonomy. Based on them, we devise two novel context-aware trust metrics: Task Type-aware Trust (TaTrust) and Reward Amount-aware Trust (RaTrust). Finally, we devise a multi-objective combinatorial optimization algorithm to effectively select trustworthy workers.

To defend against spam workers who masquerade as "trustworthy" workers with "good" reputations by colluding with their accomplices, we propose a new spam worker defense model based on our proposed Worker Trust Vector (WTV). A WTV consists of the trust opinions of different requesters and can indicate a worker's global trust level. Based on the workers' WTVs, we then propose an algorithm to effectively defend against spam workers.

Moreover, to effectively and proactively identify spam workers, we propose a novel spam worker identification model. In this model, we first devise a novel worker trust representation called Worker Trust Matrix (WTM). A worker's WTM is essentially a global trust feature set in which each element is a local trust indicator called a trust trace. A trust trace measures the extent to which a requester trusts a worker in a trust subnetwork centered on the requester. Taking the WTMs as input, we then devise a learning-based algorithm to predict each worker's identity. With our proposed WTM-based model, spam workers are precisely identified and then prohibited from participating in tasks.

Furthermore, we propose a novel trust-aware model to recommend trustworthy workers to participate in tasks. In this model, we tackle the homogeneous worker, dishonest behaviour, data sparsity, and cold start problems in generating worker recommendations. In particular, we first propose two similarity metrics that measure two requesters' similarities in transacting with the workers they commonly trust and with the workers they commonly distrust, respectively. Targeting the data sparsity problem, we propose a new trust sub-network extraction algorithm (TSE) to effectively discover requesters who can provide trustworthy recommendation suggestions. Finally, we suggest two strategies for solving the cold start problem.

All the models proposed in this thesis have been validated and evaluated through extensive experiments on real datasets or in real scenarios. The results demonstrate that the proposed models significantly outperform comparable models from existing studies in effectively selecting trustworthy workers.