Resource Aware Financial Text Classification Using Hybrid Vectorizers and Tree Based Models

Students & Supervisors

Student Authors
Sajedul Islam
Bachelor of Science in Computer Science & Engineering, FST
Prithanjoly Biswas Pew
Bachelor of Science in Computer Science & Engineering, FST
Md. Muktadir Alam Khiljee
Bachelor of Science in Computer Science & Engineering, FST
Lamia Haque Chandni
Bachelor of Science in Computer Science & Engineering, FST
Tahsinul Islam Nishat
Bachelor of Science in Computer Science & Engineering, FST
Supervisors
Abu Shufian
Lecturer, Faculty, FE

Abstract

Detecting financial fraud in corporate filings is challenging because of the scale and variability of textual disclosures. This study proposes a resource-aware hybrid framework that integrates traditional machine learning classifiers with transformer-based models and uses adversarial training to simulate real-world obfuscation. Five vectorization methods and five classifiers were systematically evaluated on accuracy, execution time, memory usage, CPU utilization, inference latency, and robustness. XGBoost combined with TF-IDF achieved perfect classification (F1 = 1.00; ROC AUC = 1.00) with under four seconds of training time, 1.8 GB of RAM, 35% CPU utilization, and 11 ms per-sample inference latency. It also obtained the highest composite resource-accuracy score, confirming its suitability for large-scale deployment. Compared with transformer-only baselines, this configuration reduced the memory footprint by 72% and training time by a factor of 12 to 15 while maintaining perfect detection. Although transformer-based models captured richer semantic information, they incurred significantly higher computational costs without improving classification performance. These findings highlight the practicality of lightweight vectorizers for financial fraud detection and establish a benchmark for future resource-efficient and domain-robust implementations.
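
The resource metrics named in the abstract (training time, memory usage, per-sample inference latency) could be collected along the lines of the sketch below. This is not the authors' code. It uses only the Python standard library, a hand-rolled TF-IDF with smoothed IDF, and a nearest-centroid classifier as a simple stand-in for XGBoost. All function names and the four toy sentences are invented for illustration. Memory is measured with tracemalloc, which tracks Python allocations only rather than total process RAM, and CPU utilization is not measured here.

```python
import math
import time
import tracemalloc
from collections import Counter


def tfidf_fit(docs):
    """Learn smoothed IDF weights from whitespace-tokenized documents."""
    n = len(docs)
    df = Counter(tok for d in docs for tok in set(d.lower().split()))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf_transform(doc, idf):
    """Return an L2-normalized sparse TF-IDF vector as a dict."""
    tf = Counter(t for t in doc.lower().split() if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def train_centroid(docs, labels):
    """Stand-in classifier (NOT XGBoost): sum TF-IDF vectors per class."""
    idf = tfidf_fit(docs)
    centroids = {}
    for d, y in zip(docs, labels):
        c = centroids.setdefault(y, Counter())
        for t, v in tfidf_transform(d, idf).items():
            c[t] += v
    return idf, centroids


def predict_centroid(model, doc):
    """Pick the class whose summed vector has the largest dot product."""
    idf, centroids = model
    vec = tfidf_transform(doc, idf)
    return max(
        centroids,
        key=lambda y: sum(v * centroids[y].get(t, 0.0) for t, v in vec.items()),
    )


def profile(train_fn, predict_fn, train_docs, train_labels, test_docs):
    """Measure training time (s), peak traced memory (MB), mean latency (ms)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    model = train_fn(train_docs, train_labels)
    train_s = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    t0 = time.perf_counter()
    preds = [predict_fn(model, d) for d in test_docs]
    latency_ms = (time.perf_counter() - t0) * 1000 / len(test_docs)
    return {
        "train_s": train_s,
        "peak_mb": peak / 1e6,
        "latency_ms": latency_ms,
        "preds": preds,
    }


# Toy, invented sentences; label 1 = fraudulent, 0 = legitimate.
train_docs = [
    "fictitious sales inflated revenue",
    "undisclosed liabilities hidden offshore",
    "quarterly revenue matched audited statements",
    "board approved routine dividend payment",
]
train_labels = [1, 1, 0, 0]
test_docs = ["hidden fictitious liabilities", "routine audited dividend"]

report = profile(train_centroid, predict_centroid, train_docs, train_labels, test_docs)
print(report)
```

The same `profile` harness can wrap any pair of train and predict callables, so swapping in a real vectorizer and a gradient-boosted tree model would leave the measurement code unchanged.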

Keywords

Adversarial Training; Financial Fraud Detection; Sparse Vectorization; Transformer Models; Text Classification Framework

Publication Details

  • Type of Publication:
  • Conference Name: 3rd International Conference on Big Data, IoT and Machine Learning (BIM 2025)
  • Date of Conference: 25/09/2025
  • Venue: Dhaka International University, Bangladesh
  • Organizer: Springer, DIU