Robust Fake News Detection using Large Language Models under Adversarial Sentiment Attacks

Researchers have developed AdSent, a framework that hardens fake news detection against sentiment manipulation, a vulnerability that large language models make easy to exploit. The study shows that altering an article's sentiment significantly degrades detection accuracy, with detectors tending to classify neutral articles as genuine. AdSent employs a sentiment-agnostic training strategy and outperforms existing models in robustness and accuracy across multiple datasets.
New Framework Enhances Fake News Detection Amid Sentiment Manipulation
Researchers have unveiled a new framework, AdSent, designed to bolster fake news detection against sentiment manipulation tactics. The work responds to increasingly sophisticated misinformation strategies that use large language models (LLMs) to alter the sentiment of news articles.
Prior studies have established sentiment as a valuable cue for identifying fake news, but this reliance creates a vulnerability: adversaries can manipulate sentiment cues to evade detection systems. While some research has examined adversarial samples generated by LLMs, the emphasis has largely been on stylistic elements rather than sentiment.
AdSent Framework Overview
- Controlled Sentiment-Based Adversarial Attacks: AdSent generates adversarial samples that specifically target sentiment alterations, providing insights into how sentiment shifts affect detection performance.
- Impact Analysis: Sentiment modifications significantly affect detection performance; neutral articles are more often classified as real, while articles with non-neutral sentiment are more often flagged as fake.
- Sentiment-Agnostic Training Strategy: AdSent employs a training strategy that minimizes the influence of sentiment on detection outcomes.
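One way to picture a sentiment-agnostic training strategy is sentiment-shift data augmentation, sketched below. This is an illustrative assumption, not AdSent's actual implementation: every sentiment variant of an article is trained on with the article's original veracity label, so the classifier cannot use sentiment as a shortcut. The `sentiment_variants` function is a hypothetical stand-in for LLM-based sentiment rewriting, and the vocabulary and articles are toy data.

```python
import numpy as np

VOCAB = ["breaking", "shocking", "calm", "report", "official", "outrage"]

def featurize(text):
    """Bag-of-words vector over a toy vocabulary."""
    tokens = text.lower().split()
    return np.array([tokens.count(w) for w in VOCAB], dtype=float)

def sentiment_variants(text):
    """Hypothetical stand-in for LLM sentiment rewriting: swap
    emotionally charged words for neutral counterparts."""
    swaps = {"shocking": "calm", "outrage": "report"}
    neutralized = " ".join(swaps.get(t, t) for t in text.lower().split())
    return [text, neutralized]

def train(articles, labels, lr=0.5, epochs=200):
    """Logistic regression trained on all sentiment variants of each
    article, each variant keeping the original veracity label."""
    X, y = [], []
    for text, label in zip(articles, labels):
        for variant in sentiment_variants(text):
            X.append(featurize(variant))
            y.append(label)
    X, y = np.array(X), np.array(y)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))  # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)  # gradient step on log loss
    return w

articles = ["shocking outrage breaking", "official report calm"]
labels = [1, 0]  # 1 = fake, 0 = real
w = train(articles, labels)
```

Because the fake article and its neutralized variant share a label, the learned weights concentrate on content words that survive the rewrite rather than on sentiment-bearing terms, which is the gist of removing sentiment's influence on detection outcomes.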
Performance and Generalization
Extensive experiments demonstrate that AdSent surpasses existing competitive baselines in accuracy and improves robustness, effectively generalizing across unseen datasets and various adversarial scenarios.
📰 Original Source: https://arxiv.org/abs/2601.15277v1
All rights and credit belong to the original publisher.