Abstract and Applied Analysis
Volume 2013 (2013), Article ID 196256, 6 pages
http://dx.doi.org/10.1155/2013/196256
Research Article

A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

Yong Zhang and Dapeng Wang

School of Computer and Information Technology, Liaoning Normal University, No. 1, Liushu South Street, Ganjingzi, Dalian, Liaoning 116081, China

Received 28 December 2012; Accepted 25 March 2013

Academic Editor: Jianhong (Cecilia) Xia

Copyright © 2013 Yong Zhang and Dapeng Wang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

In imbalanced learning, resampling methods modify an imbalanced dataset to form a balanced one. Many base classifiers perform better on balanced datasets than on imbalanced datasets. This paper proposes a cost-sensitive ensemble method based on the cost-sensitive support vector machine (SVM) and query-by-committee (QBC) to solve imbalanced data classification. The proposed method first divides the majority-class dataset into several subdatasets according to the proportion of imbalanced samples and trains subclassifiers using the AdaBoost method. It then generates candidate training samples with the QBC active learning method and uses a cost-sensitive SVM to learn from these samples. Experimental results on 5 class-imbalanced datasets show that the proposed method achieves higher area under the ROC curve (AUC), F-measure, and G-mean than many existing class-imbalanced learning methods.
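
As an illustration of the quantities named in the abstract, the following minimal Python sketch shows one way the majority class could be partitioned and how F-measure and G-mean are computed from confusion-matrix counts. It is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the number of subdatasets is assumed to be the rounded majority-to-minority ratio, and vote entropy is used as a common QBC disagreement measure that the abstract itself does not specify.

```python
import math
import random


def split_majority(majority, minority, seed=0):
    """Partition the majority class into roughly minority-sized subsets.

    Assumption: the number of subsets is the rounded majority/minority
    ratio; the abstract only says the split follows the imbalance proportion.
    """
    k = max(1, round(len(majority) / len(minority)))
    shuffled = list(majority)
    random.Random(seed).shuffle(shuffled)
    # Deal samples round-robin so subset sizes differ by at most one.
    return [shuffled[i::k] for i in range(k)]


def vote_entropy(votes):
    """Disagreement of a committee on one sample (assumed QBC criterion).

    `votes` holds the label predicted by each committee member; a higher
    entropy means more disagreement, so the sample is a better candidate.
    """
    n = len(votes)
    counts = {}
    for label in votes:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (assumes nonzero denominators)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

In this sketch, each subset returned by `split_majority` would be paired with the full minority class to train one subclassifier, and the samples with the highest `vote_entropy` would be passed on as candidate training samples for the cost-sensitive SVM.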