Patented Study on BERT-Based Combined Approach for Fake News Detection
- 07 Feb 2024
- 26 Jun 2024
- 02 Oct 2024
Abstract
Advanced internet technologies create an environment for information exchange among communities. However, some individuals exploit these environments to spread false news. False news, or Fake News (FN), refers to misleading information deliberately crafted to harm the reputation of individuals, products, or services. Identifying FN remains a challenging problem for the research community, and many researchers have proposed FN detection approaches based on Machine Learning (ML) and Natural Language Processing (NLP) techniques. In this article, we propose a combined approach for FN detection that leverages both. We first preprocess the dataset and extract all terms from it. A Feature Selection Algorithm (FSA) then scores the terms and retains the highest-scoring ones as features. Each document in the dataset is represented as a vector over these selected features, with a term weight measure determining the significance of each term in the vector. These content-based document vectors are then combined with vector representations obtained through an NLP technique: the Bidirectional Encoder Representations from Transformers (BERT) model. The BERT small case model is employed to generate features, which are then used to create document vectors. The combined vector, comprising the ML-based and the NLP-based document representations, is fed into various ML algorithms to build classification models. Our combined approach achieved its highest accuracy, 96.72%, with the Random Forest algorithm, using document vectors that concatenated 4000 content-based features with the outputs of the 9th to 12th BERT encoder layers.
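The pipeline described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract names neither the feature selection score nor the term weight measure, so the class-wise document-frequency difference and the TF-IDF weighting used here are assumptions, as are all function names. The BERT layer outputs are replaced by toy vectors of size 3, and each layer is assumed to contribute one document-level vector. Under the further assumption of a BERT model with hidden size 768, concatenating four layers with 4000 content features would give 4000 + 4 × 768 = 7072 dimensions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for real preprocessing)."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def select_features(token_docs, labels, k):
    """Keep the k highest-scoring terms.

    Assumed score (not from the article): absolute difference between the
    fraction of fake (label 1) and real (label 0) documents containing the term.
    """
    n_fake = max(sum(labels), 1)
    n_real = max(len(labels) - sum(labels), 1)
    df_fake, df_real = Counter(), Counter()
    for tokens, label in zip(token_docs, labels):
        (df_fake if label == 1 else df_real).update(set(tokens))
    vocab = set(df_fake) | set(df_real)
    score = {t: abs(df_fake[t] / n_fake - df_real[t] / n_real) for t in vocab}
    # Sort by descending score, then alphabetically, so ties are deterministic.
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]


def tfidf_vector(tokens, features, doc_freq, n_docs):
    """Weight each selected term by TF-IDF (assumed term weight measure)."""
    counts = Counter(tokens)
    return [counts[t] * math.log(n_docs / doc_freq[t]) for t in features]


def combine(content_vec, layer_outputs, layers=(9, 10, 11, 12)):
    """Concatenate content features with the chosen encoder layers (1-indexed)."""
    combined = list(content_vec)
    for layer in layers:
        combined.extend(layer_outputs[layer - 1])
    return combined


# Toy corpus: label 1 = fake, label 0 = real.
docs = [
    "shocking miracle cure doctors hate",
    "miracle pill shocking results",
    "council approves new budget",
    "budget report released by council",
]
labels = [1, 1, 0, 0]

token_docs = [tokenize(d) for d in docs]
features = select_features(token_docs, labels, k=4)
doc_freq = Counter(t for tokens in token_docs for t in set(tokens))
content_vecs = [
    tfidf_vector(tokens, features, doc_freq, len(docs)) for tokens in token_docs
]

# Placeholder for per-layer BERT document vectors: 12 layers, toy hidden size 3.
HIDDEN = 3
toy_layers = [[float(layer)] * HIDDEN for layer in range(1, 13)]

combined = combine(content_vecs[0], toy_layers)
print(features)
print(len(combined))
```

In a real setting the toy layer vectors would be replaced by actual BERT encoder outputs, and the combined vectors for all documents would be passed to a classifier such as Random Forest; both steps are omitted here to keep the sketch self-contained.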