EURO 2025 Leeds
Abstract Submission

2163. Enhancing Spam Detection with Large Language Models and Hybrid Learning Approaches

Invited abstract in session MC-43: Digital Philosophy, Academic career models and OR, stream OR and Ethics.

Monday, 12:30-14:00
Room: Newlyn GR.07

Authors (first author is the speaker)

1. Madina Mansurova
KazNU
2. Leonidas Sakalauskas
Vytautas Magnus University
3. Assem Shormakova
Information Systems, KazNU
4. Mukhammed-Ali Kumisbek
Information Systems, KazNU

Abstract

Detecting spam and phishing attacks requires adaptive models that can distinguish legitimate messages from deceptive ones. Traditional supervised methods, such as logistic regression, classify known spam patterns with high precision but struggle against novel threats. Unsupervised models, such as Isolation Forest, detect anomalies without labeled data but often misclassify legitimate messages as spam. A hybrid approach improves detection accuracy by combining the pattern recognition of supervised learning with the anomaly detection of unsupervised techniques.
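The abstract does not say how the supervised and unsupervised outputs are combined. As a minimal sketch, assuming each message already has a supervised spam probability and an anomaly score normalized to [0, 1], one simple option is an OR rule: flag the message if either detector fires. The names `DetectorScores` and `hybrid_flag` and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DetectorScores:
    """Scores for one message (hypothetical container, not from the abstract)."""
    p_spam: float   # supervised spam probability in [0, 1], e.g. from logistic regression
    anomaly: float  # unsupervised anomaly score in [0, 1], higher means more unusual


def hybrid_flag(scores: DetectorScores,
                p_threshold: float = 0.5,
                anomaly_threshold: float = 0.8) -> bool:
    """Flag a message if either detector fires (OR rule, an assumed fusion strategy)."""
    return scores.p_spam >= p_threshold or scores.anomaly >= anomaly_threshold


# Illustrative scores only; these are not results from the paper.
known_spam = DetectorScores(p_spam=0.95, anomaly=0.20)   # matches a learned spam pattern
novel_phish = DetectorScores(p_spam=0.30, anomaly=0.90)  # missed by the classifier, but unusual
legitimate = DetectorScores(p_spam=0.05, anomaly=0.10)   # ordinary message

for name, s in [("known_spam", known_spam),
                ("novel_phish", novel_phish),
                ("legitimate", legitimate)]:
    print(name, hybrid_flag(s))
```

An OR rule favors recall, since a message the classifier misses can still be caught as an anomaly. Raising `anomaly_threshold` trades some of that recall back for precision, which matters because unsupervised detectors tend to flag legitimate messages.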
Large Language Models (LLMs) further improve classification by extracting deep linguistic features that traditional methods overlook. Transformer-based models such as BERT and LLaMA enhance spam filtering by understanding contextual nuances, making them effective against evolving phishing tactics. The hybrid system integrates LLMs with machine learning models, increasing recall while maintaining precision.
Evaluation results show that combining logistic regression, Isolation Forest, and LLM embeddings reduces false negatives and improves detection of previously unseen phishing attempts. Although computational costs increase, adaptive thresholding limits the impact on inference speed. Future work includes extending detection to multiple languages and integrating retrieval-augmented generation for enhanced phishing prevention. The hybrid approach provides a scalable, robust solution for modern spam filtering challenges.
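The abstract does not define its adaptive thresholding scheme. One possible reading, sketched here purely as an assumption, is a confidence cascade: a cheap model scores every message, and the costly LLM-based scorer runs only when the cheap score falls in an uncertain band. The function `cascade_score`, the band limits `low` and `high`, and the toy scorers are hypothetical stand-ins, not the authors' method.

```python
from typing import Callable, Tuple

Scorer = Callable[[str], float]  # maps a message to a spam score in [0, 1]


def cascade_score(message: str,
                  cheap: Scorer,
                  expensive: Scorer,
                  low: float = 0.1,
                  high: float = 0.9) -> Tuple[float, bool]:
    """Return (score, used_expensive).

    The expensive scorer is called only when the cheap score lies
    strictly between `low` and `high` (assumed uncertainty band).
    """
    p = cheap(message)
    if p <= low or p >= high:
        return p, False  # cheap model is confident, skip the costly call
    return expensive(message), True


# Toy stand-ins for real models, for illustration only.
def toy_cheap(message: str) -> float:
    text = message.lower()
    if "free prize" in text:
        return 0.97  # obvious spam phrase
    if "verify your account" in text:
        return 0.55  # ambiguous: could be phishing or a genuine notice
    return 0.03


def toy_expensive(message: str) -> float:
    return 0.88  # placeholder for a score derived from LLM embeddings


for msg in ["Claim your FREE PRIZE now",
            "Please verify your account",
            "Lunch at noon?"]:
    print(msg, cascade_score(msg, toy_cheap, toy_expensive))
```

Under this reading, widening the band between `low` and `high` sends more messages to the expensive scorer, and narrowing it reduces inference cost at the risk of trusting the cheap model on borderline cases.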


Status: accepted

