This paper addresses the problem of automatically detecting fake news spreaders in social networks such as Twitter. We model the problem as a binary classification task and consider several groups of features, including writing style, word and character n-grams, BERT semantic embeddings, and sentiment analysis, all computed from the set of tweets each user authored. We evaluate our approach on the dataset released for the PAN at CLEF 2020 shared task on profiling fake news spreaders, which provides labeled data in both English and Spanish. Experimental results show that we can detect fake news spreaders with an accuracy of 0.73 in English and 0.77 in Spanish under 10-fold cross-validation on the provided training set, and with an accuracy of 0.71 in English and 0.76 in Spanish when the model is trained on the whole training set and tested on the provided test set. We also investigate the role of psycho-linguistic (LIWC) and personality features in detecting fake news spreaders and find that personality features have a significant impact on user sharing behavior, achieving an accuracy of 0.72 in English and 0.80 in Spanish under 10-fold cross-validation on the provided training set.
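The evaluation protocol described in the abstract (per-user features computed from a set of tweets, binary labels, 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only character n-gram counts, and a nearest-centroid classifier with cosine similarity stands in for whatever model the paper trains. All function names (`char_ngrams`, `cosine`, `k_fold_indices`, `cross_val_accuracy`) and the label convention (1 for spreader, 0 for non-spreader) are assumptions made for illustration.

```python
from collections import Counter
import math
import random


def char_ngrams(tweets, n=3):
    """Count character n-grams over all tweets written by one user."""
    text = " ".join(tweets).lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and split them into k disjoint folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(users, labels, k=10):
    """Mean held-out accuracy of a nearest-centroid stand-in classifier.

    `users` is a list of tweet lists (one per user); `labels` holds 0 or 1.
    Assumes at least k users, so that no fold is empty.
    """
    feats = [char_ngrams(tweets) for tweets in users]
    scores = []
    for fold in k_fold_indices(len(users), k):
        held_out = set(fold)
        # Build one summed n-gram profile per class from the training users.
        centroids = {0: Counter(), 1: Counter()}
        for i, f in enumerate(feats):
            if i not in held_out:
                centroids[labels[i]].update(f)
        # Predict the class whose profile is most similar to the user.
        correct = sum(
            max(centroids, key=lambda c: cosine(feats[i], centroids[c])) == labels[i]
            for i in fold
        )
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)
```

Calling `cross_val_accuracy(users, labels)` returns the mean accuracy over the ten held-out folds, which is the kind of figure the abstract reports for the training set. The plain shuffle used here does not stratify by class; a stratified split would be a reasonable refinement.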
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Shrestha, Anu; Spezzano, Francesca; and Joy, Abishai. (2020). "Detecting Fake News Spreaders in Social Networks via Linguistic and Personality Features: Notebook for PAN at CLEF 2020". CEUR Workshop Proceedings, 2696.