Abstract:
|
Spam filters are now widely deployed for email. Spam, however, is also sent via text messages. In this project, several popular email spam-filtering algorithms are applied to a Short Message Service (SMS) dataset to see whether they identify spam there as well. Two different matrix representations of the dataset were tried. In addition to tokens alone, other characteristics of each message, such as the proportion of numbers or capital letters, were explored. The final classification results are presented, and a few caveats in applying these algorithms are discussed.
|
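As a rough illustration of the non-token characteristics mentioned in the abstract, the sketch below computes the share of digit and uppercase characters in a single message. The function name `char_proportions`, the output keys, and the choice to divide by the total character count (whitespace included) are assumptions made for this example; the project may define these proportions differently.

```python
def char_proportions(message: str) -> dict:
    """Return the share of digit and uppercase characters in a message.

    Assumption: proportions are taken over all characters, including
    whitespace and punctuation. An empty message yields zeros.
    """
    total = len(message)
    if total == 0:
        return {"digit_ratio": 0.0, "upper_ratio": 0.0}
    digits = sum(ch.isdigit() for ch in message)   # count numeric characters
    uppers = sum(ch.isupper() for ch in message)   # count capital letters
    return {"digit_ratio": digits / total, "upper_ratio": uppers / total}


if __name__ == "__main__":
    # Hypothetical example messages, not taken from the dataset.
    print(char_proportions("WIN 50"))
    print(char_proportions("see you at lunch"))
```

Features like these could be appended as extra columns to a token-based matrix, though whether that matches the representation used in the project is not stated here.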