Reveal the hidden layer via entity embedding in traffic prediction

Bo Wang, Khaled Shaaban, Inhi Kim

Research output: Contribution to journal › Conference article › Other

Abstract

Neural network-based models have been widely used in traffic prediction, where they have improved the accuracy and efficiency of predicting traffic flow, speed, passenger flow, and delay. Many variables are considered when predicting traffic indicators, and good techniques have been developed for choosing the variables that most influence the results. Because neural network models treat independent variables as continuous, there are few studies on the use of categorical variables. In addition, neural networks have been criticized because the internal relationships of their hidden layers are generally unknown. This paper investigates neural networks for predicting the use of bike-sharing systems in Suzhou, China, considering a large amount of categorical data. Two methods, entity embedding and one-hot encoding, are applied. Comparison experiments verify that the entity embedding method is more efficient than one-hot encoding. Furthermore, the hidden layers are visually analyzed with t-SNE, and the relationships between the traffic volume at shared bike sites and time, weather, surroundings, and other variables are discussed. The results show that: 1. Entity embedding can effectively increase the continuity of categorical variables and therefore improve the prediction efficiency of neural network models. 2. The relationships between variables can be identified through visual analysis, and the trained embedding vectors can also be used to supervise clustering.
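The difference between the two encodings named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weather categories, the embedding size of 2, and the helper names (`one_hot`, `embed`, `matvec`) are assumptions made for illustration, and the embedding values are random placeholders rather than trained weights.

```python
import random

# Hypothetical categorical feature (assumed for illustration only).
categories = ["sunny", "cloudy", "rainy", "snowy"]
index = {name: i for i, name in enumerate(categories)}


def one_hot(name):
    """Return an indicator vector with a single 1 at the category's index."""
    vec = [0.0] * len(categories)
    vec[index[name]] = 1.0
    return vec


# Entity embedding: each category owns one dense row of a matrix.
# In a real model these rows are learned jointly with the network;
# here they are random placeholders.
random.seed(0)
embedding_dim = 2  # assumed size, not taken from the paper
embedding_matrix = [
    [random.gauss(0.0, 1.0) for _ in range(embedding_dim)]
    for _ in categories
]


def embed(name):
    """Look up the dense vector for a category."""
    return embedding_matrix[index[name]]


def matvec(vec, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    width = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(width)]


# An embedding lookup gives the same result as multiplying the one-hot
# vector by the embedding matrix, but the network only ever sees the
# dense, lower-dimensional vector.
projected = matvec(one_hot("rainy"), embedding_matrix)
looked_up = embed("rainy")
print(len(one_hot("rainy")), len(looked_up))
```

The one-hot vector grows with the number of categories and places every category at the same distance from every other, whereas the embedding rows have a fixed width and can move closer together or further apart during training. That learned geometry is what a projection such as t-SNE would then visualize.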

Original language: English
Pages (from-to): 163-170
Number of pages: 8
Journal: Procedia Computer Science
Volume: 151
DOIs: 10.1016/j.procs.2019.04.025
Publication status: Published - 1 Jan 2019
Event: International Conference on Ambient Systems, Networks and Technologies (ANT) 2019 - Leuven, Belgium
Duration: 29 Apr 2019 to 2 May 2019
Conference number: 10th

Keywords

  • Entity embedding
  • Neural networks
  • One-hot encoding
  • Traffic prediction
  • Visualization

Cite this

@article{adaf07b3d5c94781ab2495deade7f58c,
title = "Reveal the hidden layer via entity embedding in traffic prediction",
abstract = "The neural network-based models have been widely used in traffic prediction. They have improved accuracy and efficiency in traffic flow, speed, passenger flow, and delay. Many variables are considered to predict traffic indicators and good techniques for choosing the most influenced variables to results have been developed. Since the neural network models treat independent variables as continuous variables, there are few studies on the use of categorical variables. In addition, the neural network has been criticized as the internal relationships of hidden layers are generally unknown. This paper investigates neural networks to predict the use of bike-sharing systems in Suzhou, China considering a large amount of categorical data. Two methods here, Entity embedding and one-hot encoding are applied. The comparison experiments verify that the entity embedding method is more efficient than one-hot encoding. Furthermore, the hidden layers are visually analyzed by t-SNE, and the relationships with time, weather, surroundings and other variables for the traffic volume at shared bike sites are discussed. The research results show that: 1. Entity embedding can effectively increase the continuity of categorical variables and therefore, improve the prediction efficiency for the neural network models. 2. The relationship between variables can be identified through visual analysis, and the trained embedding vectors can also be used to supervise clustering.",
keywords = "Entity embedding, Neural networks, One-hot encoding, Traffic prediction, Visualization",
author = "Bo Wang and Khaled Shaaban and Inhi Kim",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.procs.2019.04.025",
language = "English",
volume = "151",
pages = "163--170",
journal = "Procedia Computer Science",
issn = "1877-0509",
publisher = "Elsevier",

}

Reveal the hidden layer via entity embedding in traffic prediction. / Wang, Bo; Shaaban, Khaled; Kim, Inhi.

In: Procedia Computer Science, Vol. 151, 01.01.2019, p. 163-170.


TY - JOUR

T1 - Reveal the hidden layer via entity embedding in traffic prediction

AU - Wang, Bo

AU - Shaaban, Khaled

AU - Kim, Inhi

PY - 2019/1/1

Y1 - 2019/1/1

N2 - The neural network-based models have been widely used in traffic prediction. They have improved accuracy and efficiency in traffic flow, speed, passenger flow, and delay. Many variables are considered to predict traffic indicators and good techniques for choosing the most influenced variables to results have been developed. Since the neural network models treat independent variables as continuous variables, there are few studies on the use of categorical variables. In addition, the neural network has been criticized as the internal relationships of hidden layers are generally unknown. This paper investigates neural networks to predict the use of bike-sharing systems in Suzhou, China considering a large amount of categorical data. Two methods here, Entity embedding and one-hot encoding are applied. The comparison experiments verify that the entity embedding method is more efficient than one-hot encoding. Furthermore, the hidden layers are visually analyzed by t-SNE, and the relationships with time, weather, surroundings and other variables for the traffic volume at shared bike sites are discussed. The research results show that: 1. Entity embedding can effectively increase the continuity of categorical variables and therefore, improve the prediction efficiency for the neural network models. 2. The relationship between variables can be identified through visual analysis, and the trained embedding vectors can also be used to supervise clustering.

AB - The neural network-based models have been widely used in traffic prediction. They have improved accuracy and efficiency in traffic flow, speed, passenger flow, and delay. Many variables are considered to predict traffic indicators and good techniques for choosing the most influenced variables to results have been developed. Since the neural network models treat independent variables as continuous variables, there are few studies on the use of categorical variables. In addition, the neural network has been criticized as the internal relationships of hidden layers are generally unknown. This paper investigates neural networks to predict the use of bike-sharing systems in Suzhou, China considering a large amount of categorical data. Two methods here, Entity embedding and one-hot encoding are applied. The comparison experiments verify that the entity embedding method is more efficient than one-hot encoding. Furthermore, the hidden layers are visually analyzed by t-SNE, and the relationships with time, weather, surroundings and other variables for the traffic volume at shared bike sites are discussed. The research results show that: 1. Entity embedding can effectively increase the continuity of categorical variables and therefore, improve the prediction efficiency for the neural network models. 2. The relationship between variables can be identified through visual analysis, and the trained embedding vectors can also be used to supervise clustering.

KW - Entity embedding

KW - Neural networks

KW - One-hot encoding

KW - Traffic prediction

KW - Visualization

UR - http://www.scopus.com/inward/record.url?scp=85071915666&partnerID=8YFLogxK

U2 - 10.1016/j.procs.2019.04.025

DO - 10.1016/j.procs.2019.04.025

M3 - Conference article

VL - 151

SP - 163

EP - 170

JO - Procedia Computer Science

JF - Procedia Computer Science

SN - 1877-0509

ER -