One-versus-all (OVA) classifiers learn k individual binary classifiers, each distinguishing the instances of a single class from the instances of all other classes. To classify a new instance, all k classifiers are run, and the class whose classifier returns the highest confidence is chosen. OVA thus differs from existing data stream classification schemes, most of which use multiclass classifiers that each discriminate among all the classes. This paper argues that OVA has notable advantages for data stream classification. First, the error correlation among OVA's component classifiers is low, and hence their diversity is high, which leads to high classification accuracy. Second, OVA is adept at accommodating new class labels, which often appear in data streams. However, many challenges remain in deploying traditional OVA for classifying data streams. First, traditional OVA does not handle concept change, a key feature of data streams. Second, because every instance is fed to all component classifiers, OVA is known to be an inefficient model. Third, OVA's classification accuracy is adversely affected by the imbalanced class distributions in data streams. This paper addresses these key challenges and proposes a new OVA scheme adapted for data stream classification. Theoretical analysis and empirical evidence reveal that the adapted OVA can offer faster training, faster updating, and higher classification accuracy than many popular existing data stream classification algorithms. We expect these results to be of interest to researchers and practitioners because they suggest a simple yet elegant and effective alternative to existing classification schemes for data streams.
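
The traditional OVA procedure described in the abstract (k binary classifiers, prediction by highest confidence, a new classifier for each new label) can be illustrated with a minimal sketch. This is not the adapted scheme the paper proposes: the perceptron base learner and the names `BinaryPerceptron`, `OneVersusAll`, `learn_one`, and `predict_one` are assumptions chosen for illustration, and the sketch does nothing about concept change, efficiency, or class imbalance.

```python
class BinaryPerceptron:
    """Hypothetical online binary learner: one class versus all the others."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def score(self, x):
        # Signed score, used here as the "confidence" that x belongs to the class.
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, is_positive):
        # Standard perceptron rule: adjust the weights only on a mistake.
        target = 1.0 if is_positive else -1.0
        if target * self.score(x) <= 0.0:
            self.w = [wi + target * xi for wi, xi in zip(self.w, x)]
            self.b += target


class OneVersusAll:
    """Traditional OVA over a stream (illustrative, not the paper's method)."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.models = {}  # class label -> binary classifier

    def learn_one(self, x, label):
        # A previously unseen label simply gets a new component classifier.
        if label not in self.models:
            self.models[label] = BinaryPerceptron(self.n_features)
        # Traditional OVA: every instance is fed to all k classifiers.
        for cls, model in self.models.items():
            model.update(x, is_positive=(cls == label))

    def predict_one(self, x):
        # Run all k classifiers and choose the most confident one.
        return max(self.models, key=lambda cls: self.models[cls].score(x))
```

In use, each labelled instance from the stream is passed to `learn_one`, and `predict_one` returns the label whose classifier scores the instance highest. The loop in `learn_one` makes the efficiency concern visible: the cost per instance grows linearly with the number of classes.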

| Detail | Value |
|---|---|
| Number of pages | 14 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Publication status | Published - 2009 |

- Mining data streams
- One-versus-all classifiers