Supervised Learning based E-mail/ SMS Spam Classifier

Satendra Kumar; Raj Kumar; Ashish Saini

doi:10.2174/0126662558279046240126051302

ISSN: 2666-2558
E-ISSN: 2666-2566

Authors: Satendra Kumar¹, Raj Kumar² and Ashish Saini³

¹ Department of CSE, Moradabad Educational Trust Group of Institutions, UP, India; ² Department of Computer Science, Gurukula Kangri (Deemed to be University), Haridwar, Uttarakhand, India; ³ Department of Computer Science and Engineering, Quantum University, Roorkee, Uttarakhand, India
Source: Recent Advances in Computer Science and Communications, Volume 18, Issue 3, May 2025, E100624230882
DOI: https://doi.org/10.2174/0126662558279046240126051302
- Received: 28 Oct 2023
- Accepted: 10 Jan 2024
- Available online: 10 Jun 2024

Abstract

Background

Spam is one of the most challenging problems facing the modern Internet: it annoys individual users and can cause serious financial losses for businesses. Spam messages reach recipients without their permission and clog their mailboxes, and checking for and deleting them consumes time and organizational resources. Although most web users openly dislike spam, enough of them still respond to commercial offers that spammers continue to make money, so spam remains a real problem. Most users have some idea of how to deal with spam, but they still need clear instructions on avoiding and deleting it. No single measure eliminates spam completely. Among spam-blocking strategies, filtering is the most straightforward and practical technique.

Methods

We present procedures for identifying emails as spam or ham based on text classification. The approach combines several interrelated preprocessing steps for email classification: stop-word removal, stemming, and feature reduction and feature selection strategies to extract keywords from each attribute. Finally, different classifiers are used to label messages as spam or ham so that spam can be quarantined.
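The preprocessing steps named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the suffix-stripping `stem` function (a crude stand-in for a real stemmer such as Porter's), and the document-frequency threshold `min_docs` are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "the", "to", "is", "are", "you", "your",
              "for", "of", "and", "in", "at", "on"}

# Suffixes tried in order; a crude stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip one common suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(message):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", message.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def select_features(token_lists, min_docs=2):
    """Keep terms that occur in at least `min_docs` messages."""
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per message
    return sorted(term for term, n in doc_freq.items() if n >= min_docs)
```

Calling `preprocess` on each message and then `select_features` on the resulting token lists yields a reduced vocabulary that a classifier can be trained on. A production pipeline would normally use an established stemmer and a statistical feature-selection criterion instead of these simplified stand-ins.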

Results

The Naive Bayes classifier is a good choice, and some classifiers, such as Simple Logistic and AdaBoost, also perform well. However, the Support Vector Machine Classifier (SVC) outperforms them. The SVC makes decisions based on each case's comparisons and perspectives.
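For readers unfamiliar with the Naive Bayes classifier mentioned above, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is an assumption-laden illustration, not the configuration evaluated in the paper: the class name `NaiveBayes`, the smoothing choice, and the four toy training messages are invented for the example.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, token_lists, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            total_words = sum(self.word_counts[c].values())
            for t in tokens:
                if t in self.vocab:  # skip words never seen in training
                    numerator = self.word_counts[c][t] + 1
                    denominator = total_words + len(self.vocab)
                    score += math.log(numerator / denominator)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Toy training data, invented purely for illustration.
train_tokens = [
    ["win", "free", "prize"],
    ["free", "cash", "now"],
    ["meeting", "at", "noon"],
    ["lunch", "at", "noon"],
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayes().fit(train_tokens, train_labels)
```

With this toy data, `model.predict(["free", "prize"])` favours the spam class because both words appear only in the spam examples. A real comparison against SVC, Simple Logistic, or AdaBoost would require a labelled corpus and held-out evaluation, which this sketch does not attempt.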

Conclusion

Many spam filtering studies have focused on recent classifier-related challenges, and Machine Learning (ML) for spam detection is an important area of current research. This work examines the adequacy of the proposed approach and the application of multiple learning algorithms to filtering spam from emails. The algorithms themselves have also been examined.

Article Type: Research Article

Keyword(s): Machine learning; naive bayesian; SMS; spam classification; supervised learning; SVC
