TY - JOUR
T1 - Long-term traffic speed prediction utilizing data augmentation via segmented time frame clustering
AU - Chan, Robin Kuok Cheong
AU - Lim, Joanne Mun-Yee
AU - Parthiban, Rajendran
N1 - Publisher Copyright: © 2024 The Authors
PY - 2025/1/10
Y1 - 2025/1/10
N2 - Among traffic forecasting studies, comparatively few focus on long-term traffic prediction, such as 24-hour prediction. While traffic data such as traffic speed are relatively easy to obtain, similarly reliable and accessible feature data, such as weather or events, can be difficult to obtain depending on the location or the availability of service providers. Obtaining these data becomes a more significant issue when considering global coverage. To mitigate the issue of limited feature data, a method is proposed that augments existing data by sorting the dataset into appropriate clusters, which are then used as an additional feature to improve the dataset's quality and support more accurate training. This paper proposes a long-term traffic forecasting model that pairs a novel time-series segmentation method with data clustering and classification via a Convolutional Neural Network (CNN) as additional pre-processing to compensate for the lack of traffic data and features, before using Long Short-Term Memory (LSTM) for long-term traffic prediction, a comparatively under-researched task. The proposed model is called Cluster Augmented LSTM (CAL). It is compared with existing machine learning models and evaluated using the Mean Absolute Percentage Error (MAPE) and Root-Mean-Squared-Error (RMSE) performance metrics. A comparison between LSTM and Gated Recurrent Units (GRU) was conducted, showing that GRU tends to outperform LSTM in most cases. However, the best-performing result for the proposed method still utilizes LSTM. The final results show that the proposed CAL model could achieve better results, by 1.42 %–1.76 % in MAPE and 0.25–0.41 in RMSE.
KW - Convolutional neural network
KW - Long short-term memory
KW - Long-term time series forecasting
KW - Traffic behavior
KW - Traffic prediction
UR - http://www.scopus.com/inward/record.url?scp=85210098390&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2024.112785
DO - 10.1016/j.knosys.2024.112785
M3 - Article
AN - SCOPUS:85210098390
SN - 0950-7051
VL - 308
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 112785
ER -