Hybrid Machine Learning Framework for Phishing URL Detection with Advanced Feature Engineering
Students & Supervisors
Student Authors
Supervisors
Abstract
Phishing attacks target sensitive user information by creating fake websites that imitate legitimate ones. Traditional detection methods struggle with feature handling, adaptability, and evolving attack patterns, which reduces their accuracy. This study evaluates several machine learning models in combination with text-processing techniques, including Count Vectorizer, TF-IDF Vectorizer, Porter Stemmer, and Regex Tokenization, for phishing detection. Random Forest achieved 98.57% accuracy in detecting phishing websites, the highest among the models examined. Its strong performance stems from its ability to handle dataset imbalance and redundant features efficiently, which leads to high classification reliability and precision. Future work will explore deep learning techniques, advanced feature selection methods, and real-time phishing scanning to strengthen overall cybersecurity protection.
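To make the text-processing steps named in the abstract more concrete, the following is a minimal sketch of regex tokenization followed by count and TF-IDF weighting of URL tokens, written with the Python standard library only. It is an illustration under stated assumptions, not the authors' pipeline: the example URLs, the token pattern, and the function names `tokenize`, `count_vectors`, and `tfidf_vectors` are all hypothetical, and the smoothed IDF formula is one common variant. Porter stemming and the Random Forest classifier are omitted here; in practice they would typically come from libraries such as NLTK and scikit-learn.

```python
import math
import re
from collections import Counter

# Hypothetical example URLs, invented for illustration only.
urls = [
    "http://secure-login.example.com/verify/account",
    "https://www.example.org/docs/index.html",
    "http://example.net/login/verify?user=1",
]


def tokenize(url):
    """Split a URL into lowercase alphabetic tokens using a regex."""
    return re.findall(r"[a-z]+", url.lower())


def count_vectors(docs):
    """Build a sorted vocabulary and one token-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[tok] for tok in vocab])
    return vocab, vectors


def tfidf_vectors(counts):
    """Weight counts by a smoothed IDF and L2-normalize each row.

    Assumed formula: idf = ln((1 + n_docs) / (1 + df)) + 1.
    """
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    weighted = []
    for row in counts:
        raw = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        weighted.append([x / norm for x in raw])
    return weighted


tokens = [tokenize(u) for u in urls]
vocab, counts = count_vectors(tokens)
tfidf = tfidf_vectors(counts)
```

In this sketch, a token that appears in only one URL (such as `account`) receives a higher TF-IDF weight than one shared across several URLs (such as `verify`), which is the property that lets a downstream classifier focus on distinctive tokens.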
Keywords
Publication Details
- Type of Publication: Conference
- Conference Name: The 8th International Conference on Engineering Research, Innovation and Education (ICERIE 2025)
- Date of Conference: 24/04/2025
- Venue: Shahjalal University of Science and Technology (SUST)
- Organizer: School of Applied Sciences & Technology, Shahjalal University of Science and Technology