Abstract Title

Adversarial Analysis of Fake News Detectors

Additional Funding Sources

This research has been sponsored by the National Science Foundation under Award No. 1950599.

Abstract

In recent years, fake news detection models have been developed to mitigate the problem of fake news. One example is dEFEND, a state-of-the-art natural language processing (NLP) model that uses both news content and user comments to detect fake news. To guard against intentional misclassifications by malicious actors, we aim first to demonstrate a vulnerability in the model and then to show how the system can be made more robust against it.

Attacks on fake news detection models are a growing concern and an active area of research. One product of this work is MALCOM, a malicious comment generator that can force a fake or real classification of a news piece with a success rate upwards of 93%. MALCOM generates comments that are stylistically similar and topically relevant to the input text, avoiding a common failing of attacks on NLP models (e.g., producing nonsensical examples). However, it is still possible to detect that these comments are computer-generated. Our objective is instead to use real, user-generated comments sourced from the same dataset, so that the injected comments are indistinguishable from organic ones.

Using the FakeNewsNet dataset, we develop an attack by grouping articles and their preexisting comments into topics and then computing their similarity, or "distance," from one another. With this metric, we identify both generic and topic-specific comments that can be used to sway dEFEND's classification of an article. An ongoing area of exploration is a defense that mitigates these attacks, for example by filtering comments based on properties we have identified as adversarial.
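The abstract does not include code, but the comment-selection step can be illustrated. Below is a minimal sketch, assuming the sentence-transformers library as the embedding backend; the encoder choice (all-MiniLM-L6-v2), the function name, and the ranking scheme are our illustrative assumptions, not the authors' implementation.

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical encoder choice; the abstract does not name one.
model = SentenceTransformer("all-MiniLM-L6-v2")

def rank_candidate_comments(article_text: str, comments: list[str]):
    """Rank preexisting, user-generated comments by cosine similarity
    ("distance") to the target article, most topic-relevant first."""
    article_emb = model.encode(article_text, convert_to_tensor=True)
    comment_embs = model.encode(comments, convert_to_tensor=True)
    sims = util.cos_sim(article_emb, comment_embs)[0]  # shape: (len(comments),)
    return sorted(zip(comments, sims.tolist()),
                  key=lambda pair: pair[1], reverse=True)

# Usage: take the most similar comments as topic-specific candidates, or
# comments with middling similarity across many topics as generic ones,
# and inject them alongside the target article.
```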
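The filtering defense is described only as an ongoing direction, so the following is one speculative sketch of what such a filter might look like under our assumptions: a comment is kept only if it is noticeably more similar to its own article than to a sample of unrelated topics, on the theory that generic adversarial filler fits almost any article. The margin value and the off-topic sampling are placeholders, not measured properties.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # same hypothetical encoder as above

def filter_generic_comments(article_text: str, comments: list[str],
                            offtopic_texts: list[str], margin: float = 0.05):
    """Keep only comments that are closer to their own article than to
    unrelated topics by at least `margin` in cosine similarity."""
    article_emb = model.encode(article_text, convert_to_tensor=True)
    offtopic_embs = model.encode(offtopic_texts, convert_to_tensor=True)
    comment_embs = model.encode(comments, convert_to_tensor=True)

    kept = []
    for comment, emb in zip(comments, comment_embs):
        on_topic = util.cos_sim(emb, article_emb).item()
        off_topic = util.cos_sim(emb, offtopic_embs).max().item()
        # Generic comments score nearly as high on unrelated topics as on
        # the article itself, so they fail the margin test and are dropped.
        if on_topic - off_topic >= margin:
            kept.append(comment)
    return kept
```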
