DeepTrust: An Automatic Framework to Detect Trustworthy Users in Opinion-Based Systems

Document Type

Conference Proceeding

Publication Date



Opinion spamming has recently gained attention as more and more online platforms rely on users’ opinions to help potential customers make informed decisions on products and services. Yet, while work on opinion spamming abounds, most efforts have focused on detecting an individual reviewer as spammer or fraudulent. We argue that this is no longer sufficient, as reviewers may contribute to an opinion-based system in various ways, and their input could range from highly informative to noisy or even malicious.

In an effort to improve the detection of trustworthy individuals within opinion-based systems, in this paper, we develop a supervised approach to differentiate among different types of reviewers. Particularly, we model the problem of detecting trustworthy reviewers as a multi-class classification problem, wherein users may be fraudulent, unreliable or uninformative, or trustworthy. We note that expanding from the classic binary classification of trustworthy/untrustworthy (or malicious) reviewers is an interesting and challenging problem. Some untrustworthy reviewers may behave similarly to reliable reviewers, and yet be rooted by dark motives. On the contrary, other untrustworthy reviewers may not be malicious but rather lazy or unable to contribute to the common knowledge of the reviewed item.

Our proposed method, DeepTrust, relies on a deep recurrent neural network that provides embeddings aggregating temporal information: we consider users’ behavior over time, as they review multiple products. We model the interactions of reviewers and the products they review using a temporal bipartite graph and consider the context of each rating by including other reviewers’ ratings of the same items. We carry out extensive experiments on a realworld dataset of Amazon reviewers, with known ground truth about spammers and fraudulent reviews. Our results show that DeepTrust can detect trustworthy, uninformative, and fraudulent users with an F1-measure of 0.93. Also, we drastically improve on detecting fraudulent reviewers (AUROC of 0.97 and average precision of 0.99 when combining DeepTrust with the F&G algorithm) as compared to REV2 state-of-the-art methods (AUROC of 0.79 and average precision of 0.48). Further, DeepTrust is robust to cold start users and overperforms all existing baselines.