On the stopping criteria for k-nearest neighbor in positive unlabeled time series classification problems

Mabel Gonzalez, Christoph Bergmeir, Isaac Triguero, Yanet Rodriguez, Jose M Benitez

    Research output: Contribution to journal › Article › Research › peer-review

    Abstract

    Positive unlabeled time series classification has become an important area during the last decade, as vast amounts of unlabeled time series data are often available but obtaining the corresponding labels is difficult. In this situation, positive unlabeled learning is a suitable option to mitigate the lack of labeled examples. In particular, self-training is a widely used technique due to its simplicity and adaptability. Within this technique, the stopping criterion, i.e., the decision of when to stop labeling, is a critical part, especially in the positive unlabeled context. We propose a self-training method that follows the positive unlabeled approach for time series classification and a family of parameter-free stopping criteria for this method. Our proposal uses a graphical analysis, applied to the minimum distances obtained by the k-Nearest Neighbor as the base learner, to estimate the class boundary. The proposed method is evaluated in an experimental study involving various time series classification datasets. The results show that our method outperforms the transductive results obtained by previous models.
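
    The self-training loop described in the abstract can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbor learner with Euclidean distance on equal-length series, and it uses a simple "largest jump" rule on the sequence of minimum distances as a stand-in stopping heuristic. This rule is an illustrative assumption and is not one of the stopping criteria proposed in the paper; the function names `self_train_min_distances` and `largest_jump_stop` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_train_min_distances(positives, unlabeled, dist=euclidean):
    """Greedy 1-NN self-training from a positive seed set.

    At each step the unlabeled series closest to the labeled set is added.
    Returns (order, min_dists): the indices of `unlabeled` in the order they
    were added, and the minimum distance recorded at each addition.
    """
    remaining = set(range(len(unlabeled)))
    # Distance from each unlabeled series to its nearest labeled series.
    nearest = {i: min(dist(unlabeled[i], p) for p in positives) for i in remaining}
    order, min_dists = [], []
    while remaining:
        best = min(remaining, key=lambda i: (nearest[i], i))
        order.append(best)
        min_dists.append(nearest[best])
        remaining.remove(best)
        # The newly labeled series may now be the nearest neighbor of others.
        for i in remaining:
            nearest[i] = min(nearest[i], dist(unlabeled[i], unlabeled[best]))
    return order, min_dists


def largest_jump_stop(min_dists):
    """Illustrative heuristic (not the paper's criteria): stop just before
    the largest increase between consecutive minimum distances.
    Returns how many of the ranked series to label as positive."""
    if len(min_dists) < 2:
        return len(min_dists)
    jumps = [min_dists[i + 1] - min_dists[i] for i in range(len(min_dists) - 1)]
    return jumps.index(max(jumps)) + 1


# Toy data: three series near the positive seed, two far away.
positives = [[0.0, 0.0, 0.0]]
unlabeled = [[0.1, 0, 0], [0.2, 0, 0], [5, 5, 5], [5.1, 5, 5], [0.3, 0, 0]]
order, min_dists = self_train_min_distances(positives, unlabeled)
n_positive = largest_jump_stop(min_dists)
predicted_positive = sorted(order[:n_positive])
```

    On this toy data the minimum distance stays small while series near the seed are absorbed and rises sharply once the loop reaches the distant cluster, which is the kind of boundary signal a stopping criterion looks for.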
    Original language: English
    Pages (from-to): 42-59
    Number of pages: 18
    Journal: Information Sciences
    Volume: 328
    DOI: 10.1016/j.ins.2015.07.061
    Publication status: Published - 2016

    Cite this

    Gonzalez, Mabel; Bergmeir, Christoph; Triguero, Isaac; Rodriguez, Yanet; Benitez, Jose M. / On the stopping criteria for k-nearest neighbor in positive unlabeled time series classification problems. In: Information Sciences. 2016; Vol. 328. pp. 42-59.
    @article{00b3d77f0b8041dba1e53519f80fa48b,
    title = "On the stopping criteria for k-nearest neighbor in positive unlabeled time series classification problems",
    author = "Mabel Gonzalez and Christoph Bergmeir and Isaac Triguero and Yanet Rodriguez and Benitez, {Jose M}",
    year = "2016",
    doi = "10.1016/j.ins.2015.07.061",
    language = "English",
    volume = "328",
    pages = "42 -- 59",
    journal = "Information Sciences",
    issn = "0020-0255",
    publisher = "Elsevier",

    }

    On the stopping criteria for k-nearest neighbor in positive unlabeled time series classification problems. / Gonzalez, Mabel; Bergmeir, Christoph; Triguero, Isaac; Rodriguez, Yanet; Benitez, Jose M.

    In: Information Sciences, Vol. 328, 2016, p. 42 - 59.

    TY - JOUR
    T1 - On the stopping criteria for k-nearest neighbor in positive unlabeled time series classification problems
    AU - Gonzalez, Mabel
    AU - Bergmeir, Christoph
    AU - Triguero, Isaac
    AU - Rodriguez, Yanet
    AU - Benitez, Jose M
    PY - 2016
    Y1 - 2016
    UR - http://goo.gl/V8AT7u
    U2 - 10.1016/j.ins.2015.07.061
    DO - 10.1016/j.ins.2015.07.061
    M3 - Article
    VL - 328
    SP - 42
    EP - 59
    JO - Information Sciences
    JF - Information Sciences
    SN - 0020-0255
    ER -