Patented Study on BERT-Based Combined Approach for Fake News Detection

Abstract

Internet technologies make it easy for communities to exchange information, but some individuals exploit these platforms to spread false news. False news, or Fake News (FN), is misleading information deliberately crafted to harm the reputation of individuals, products, or services. Detecting FN remains a challenging problem, and many researchers have proposed detection approaches based on Machine Learning (ML) and Natural Language Processing (NLP) techniques. In this article, we propose a combined approach to FN detection that draws on both. After preprocessing the dataset, we extract all of its terms and apply a Feature Selection Algorithm (FSA) that scores the terms and keeps the highest-scoring ones as features. Each document is then represented as a vector over these selected features, with a term weight measure determining the value of each term in the vector. These content-based document vectors are concatenated with document vectors produced by the Bidirectional Encoder Representations from Transformers (BERT) model; specifically, the BERT small case model generates the features from which the BERT-based document vectors are built. The combined vectors are used to train several ML classification algorithms. The highest accuracy, 96.72%, was achieved by the Random Forest algorithm with document vectors that concatenated 4000 content-based features with the outputs of the 9th to 12th BERT encoder layers.
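
The pipeline described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The abstract does not name the feature selection algorithm or the term weight measure, so the class-wise document-frequency score and the TF-IDF weighting below are assumptions chosen for illustration, and all function names are hypothetical. The BERT part is replaced by a placeholder list of numbers standing in for the pooled outputs of encoder layers 9 to 12.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for the paper's preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def select_features(docs, labels, k):
    """Keep the k terms whose document frequency differs most between the two classes.

    Assumption: this simple score stands in for the unnamed Feature Selection Algorithm.
    """
    df = {0: Counter(), 1: Counter()}
    for doc, label in zip(docs, labels):
        df[label].update(set(tokenize(doc)))
    vocab = set(df[0]) | set(df[1])
    n0 = max(1, labels.count(0))
    n1 = max(1, labels.count(1))
    score = {t: abs(df[0][t] / n0 - df[1][t] / n1) for t in vocab}
    # Sort by descending score, then alphabetically, so the result is deterministic.
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]


def tfidf_vector(doc, features, docs):
    """Represent one document over the selected features.

    Assumption: smoothed TF-IDF stands in for the unnamed term weight measure.
    """
    counts = Counter(tokenize(doc))
    n_docs = len(docs)
    doc_token_sets = [set(tokenize(d)) for d in docs]
    vector = []
    for term in features:
        doc_freq = sum(1 for tokens in doc_token_sets if term in tokens)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        vector.append(counts[term] * idf)
    return vector


def combine(content_vector, bert_vector):
    """Concatenate the content-based vector with the BERT-based vector."""
    return list(content_vector) + list(bert_vector)


# Toy data: label 1 = fake, label 0 = real (hypothetical examples).
docs = [
    "shocking miracle cure exposed",
    "senate passes budget bill",
    "miracle diet shocking results",
    "court upholds budget ruling",
]
labels = [1, 0, 1, 0]

features = select_features(docs, labels, k=3)
placeholder_bert_vector = [0.1, -0.2]  # stands in for pooled BERT layer outputs
combined = combine(tfidf_vector(docs[0], features, docs), placeholder_bert_vector)
print(features)
print(len(combined))
```

In the setting the abstract reports, the content-based part would have 4000 entries rather than 3, and the combined vectors would be passed to a classifier such as a random forest (for example, scikit-learn's `RandomForestClassifier`), which this sketch leaves out.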

/content/journals/eng/10.2174/0118722121300281240823174052
2024-10-02
2024-11-26

  • Article Type:
    Review Article
Keywords: FN detection; term weight measure; BERT; FN; feature selection technique