Feature Selection Method on Imbalanced Text
doi: 10.3969/j.issn.1001-0548.2012.04.022
- Received Date: 2010-10-15
- Rev Recd Date: 2011-06-29
- Publish Date: 2012-08-15
Key words:
- feature selection
- imbalanced dataset
- strong class-related
- text classification
Abstract: After analyzing the four basic information elements used by traditional feature selection methods, this paper introduces a new measure of strong class information and proposes a new feature selection method for imbalanced text classification. Strong class information is used to improve classification performance on minority classes, and term frequency is used to improve it on majority classes. Experiments on the Reuters-21578 dataset show that the proposed method outperforms information gain (IG) and the chi-square statistic (CHI), improving both Micro F1 and Macro F1 to some degree.
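As a brief illustration of why both Micro F1 and Macro F1 are reported for imbalanced data, the following is a minimal sketch and not the authors' evaluation code. It assumes a single-label simplification, and the function names and toy labels are hypothetical. Micro F1 pools counts over all documents and is dominated by majority classes, whereas Macro F1 averages per-class scores and so exposes poor minority-class performance.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no counts at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions (an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    # Micro: pool counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy data (hypothetical): nine majority-class documents, one minority-class
# document, and a classifier that always predicts the majority class.
y_true = ["earn"] * 9 + ["corn"]
y_pred = ["earn"] * 10
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"Micro F1 = {micro:.3f}, Macro F1 = {macro:.3f}")
```

In this toy case the minority class is never predicted, so its F1 is zero and Macro F1 falls well below Micro F1, which is the kind of gap a feature selection method aimed at minority classes tries to narrow.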
Citation: LIAO Yi-xing, PAN Xue-zeng. Feature Selection Method on Imbalanced Text[J]. Journal of University of Electronic Science and Technology of China, 2012, 41(4): 592-595. doi: 10.3969/j.issn.1001-0548.2012.04.022