Generating synthetic time series to augment sparse datasets

Germain Forestier, Francois Petitjean, Hoang Anh Dau, Geoffrey I. Webb, Eamonn Keogh

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

149 Citations (Scopus)

Abstract

In machine learning, data augmentation is the process of creating synthetic examples to augment a dataset used to learn a model. One motivation for data augmentation is to reduce the variance of a classifier, thereby reducing error. In this paper, we propose new data augmentation techniques specifically designed for time series classification, where the space in which the series are embedded is induced by Dynamic Time Warping (DTW). The main idea of our approach is to average a set of time series and use the average time series as a new synthetic example. The proposed methods rely on an extension of DTW Barycentric Averaging (DBA), an averaging technique developed specifically for DTW. We extend DBA to calculate a weighted average of time series under DTW, so that, instead of each time series contributing equally to the final average, some can contribute more than others. This extension allows us to generate an infinite number of new examples from any set of given time series. To this end, we propose three methods for choosing the weights associated with the time series of the dataset. We carry out experiments on the 85 datasets of the UCR archive and demonstrate that our method is particularly useful when the number of available examples is limited (e.g. 2 to 6 examples per class) using a 1-NN DTW classifier. Furthermore, we show that augmenting full datasets is beneficial in most cases: we observed an increase in accuracy on 56 datasets, no effect on 7, and a slight decrease on only 22.
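
The weighted averaging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dtw_path` and `weighted_dba`, the choice to initialise the average from the highest-weighted series, the fixed iteration count, and the Dirichlet-sampled weights in the usage lines are all assumptions made for illustration, and the weight sampling is not claimed to be one of the three methods proposed in the paper. The sketch assumes univariate series and non-negative weights with a positive sum.

```python
import numpy as np

def dtw_path(a, b):
    """Return an optimal DTW alignment between 1-D arrays a and b as (i, j) index pairs."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the last cell to the first to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def weighted_dba(series, weights, n_iter=10):
    """Sketch of a weighted DTW barycenter (hypothetical helper, not the paper's code).

    Each iteration aligns every series to the current average with DTW, then
    replaces each point of the average by the weighted mean of the points
    aligned to it.
    """
    series = [np.asarray(s, dtype=float) for s in series]
    weights = np.asarray(weights, dtype=float)
    # Assumption: start from the series that carries the largest weight.
    avg = series[int(np.argmax(weights))].copy()
    for _ in range(n_iter):
        num = np.zeros_like(avg)
        den = np.zeros_like(avg)
        for s, w in zip(series, weights):
            for i, j in dtw_path(avg, s):
                num[i] += w * s[j]
                den[i] += w
        # Every index of avg is aligned at least once per series, so den > 0
        # as long as the weights are non-negative with a positive sum.
        avg = num / den
    return avg

# Usage: draw random weights (illustrative choice) and build one synthetic series.
rng = np.random.default_rng(0)
examples = [np.sin(np.linspace(0, 2 * np.pi, 30) + shift) for shift in (0.0, 0.3, 0.6)]
w = rng.dirichlet(np.ones(len(examples)))
synthetic = weighted_dba(examples, w)
```

Drawing a fresh weight vector on each call yields a different synthetic series from the same set of examples, which is what makes the number of possible new examples unbounded.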

Original language: English
Title of host publication: Proceedings
Subtitle of host publication: 17th IEEE International Conference on Data Mining
Editors: Vijay Raghavan, Srinivas Aluru, George Karypis, Lucio Miele, Xindong Wu
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 865-870
Number of pages: 6
ISBN (Print): 9781538638347
Publication status: Published - 15 Dec 2017
Event: IEEE International Conference on Data Mining 2017 - New Orleans, United States of America
Duration: 18 Nov 2017 to 21 Nov 2017
Conference number: 17th
http://icdm2017.bigke.org/
https://ieeexplore.ieee.org/xpl/conhome/8211002/proceeding (Proceedings)

Conference

Conference: IEEE International Conference on Data Mining 2017
Abbreviated title: ICDM 2017
Country/Territory: United States of America
City: New Orleans
Period: 18/11/17 to 21/11/17

Keywords

  • Data augmentation
  • Dynamic time warping
  • Time series classification
