This paper proposes new active learning methods for filtering Chinese spam. Labeling spam emails in large datasets is time-consuming and expensive. Active learning methods can substantially reduce labeling cost by identifying informative examples, and they can speed up an online logistic regression filter. Experiments show that the proposed methods not only decrease the number of label requests but also improve the classification performance of the spam filter.