A hybrid missing data imputation method for constructing city mobility indices

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

An effective missing data imputation method is essential for data mining and knowledge discovery from a comprehensive database with missing values. This paper proposes a new hybrid imputation method to effectively deal with the missing data issue of the Mobility in Cities Database (MCD) to construct city mobility indices. The hybrid method integrates the advantages of decision trees and fuzzy clustering into an iterative algorithm for missing data imputation. Extensive experiments conducted on the MCD and three commonly used datasets demonstrate that the hybrid method outperforms other existing effective imputation methods. With the MCD’s missing values imputed by the hybrid method, and using factor analysis and principal component analysis, this paper constructs city mobility indices for 63 cities in the MCD based on the novel concept of city mobility supply and demand. The city mobility indices constructed under a hierarchical structure of mobility supply and demand indicators represent substantial city mobility knowledge discovered from mining the MCD. The proposed hybrid method represents a significant contribution to missing data imputation research.
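The abstract describes the imputation step only in outline, so the following is a minimal sketch of one ingredient it names, iterative fuzzy clustering, and not the authors' hybrid algorithm. The decision tree component is not shown. The function name `fcm_impute`, its parameters, the farthest-point initialisation, and the toy data are all assumptions made for illustration; the sketch also assumes every column has at least one observed value.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def fcm_impute(rows, n_clusters=2, m=2.0, n_iter=50, tol=1e-6):
    """Fill None entries by alternating fuzzy c-means updates and re-imputation.

    Hypothetical helper for illustration; `m` is the usual fuzzifier (> 1).
    """
    n, d = len(rows), len(rows[0])
    missing = [(i, j) for i in range(n) for j in range(d) if rows[i][j] is None]

    # Step 1: start from the column means of the observed values.
    data = [list(r) for r in rows]
    for j in range(d):
        observed = [r[j] for r in rows if r[j] is not None]
        mean_j = sum(observed) / len(observed)
        for i in range(n):
            if data[i][j] is None:
                data[i][j] = mean_j

    # Deterministic farthest-point initialisation of the cluster centres.
    centres = [list(data[0])]
    while len(centres) < n_clusters:
        far = max(data, key=lambda x: min(sq_dist(x, c) for c in centres))
        centres.append(list(far))

    for _ in range(n_iter):
        # Step 2: fuzzy memberships u[i][k] from squared distances to centres.
        u = []
        for x in data:
            inv = [max(sq_dist(x, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centres]
            total = sum(inv)
            u.append([v / total for v in inv])

        # Step 3: centres become membership-weighted means of the records.
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            sw = sum(w)
            centres[k] = [sum(w[i] * data[i][j] for i in range(n)) / sw
                          for j in range(d)]

        # Step 4: re-impute each missing cell from the centres, weighted by
        # the record's memberships; stop once the imputed values settle.
        change = 0.0
        for i, j in missing:
            w = [u[i][k] ** m for k in range(n_clusters)]
            new = sum(wk * centres[k][j] for k, wk in enumerate(w)) / sum(w)
            change = max(change, abs(new - data[i][j]))
            data[i][j] = new
        if change < tol:
            break
    return data


# Toy data: two well-separated groups, one missing value in each.
rows = [[1.0, 1.0], [1.2, 0.9], [0.9, None],
        [10.0, 10.0], [10.2, 9.8], [None, 10.1]]
filled = fcm_impute(rows)
```

On this toy input the two missing cells are pulled toward the centre of the group their record belongs to, while observed values are left untouched. A real application would also need scaling of the indicators and a principled choice of the number of clusters, neither of which is specified in the abstract.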

Original language: English
Title of host publication: Data Mining - 16th Australasian Conference, AusDM 2018, Bathurst, NSW, Australia, November 28–30, 2018, Revised Selected Papers
Editors: Rafiqul Islam, Yun Sing Koh, Yanchang Zhao, Graco Warwick, David Stirling, Chang-Tsun Li, Zahidul Islam
Place of Publication: Singapore
Publisher: Springer
Pages: 135-148
Number of pages: 14
ISBN (Electronic): 9789811366611
ISBN (Print): 9789811366604
DOI: https://doi.org/10.1007/978-981-13-6661-1_11
Publication status: Published - 2019
Event: Australasian Data Mining Conference 2018 - Bathurst, Australia
Duration: 28 Nov 2018 → 30 Nov 2018
Conference number: 16th
https://web.archive.org/web/20181122224709/https://ausdm18.ausdm.org/

Publication series

Name: Communications in Computer and Information Science
Publisher: Springer
Volume: 996
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: Australasian Data Mining Conference 2018
Abbreviated title: AusDM 2018
Country: Australia
City: Bathurst
Period: 28/11/18 → 30/11/18

Keywords

  • City mobility index
  • Decision tree
  • Factor analysis
  • Iterative fuzzy clustering
  • Missing data imputation
  • Principal component analysis

Cite this

Nikfalazar, S., Yeh, C-H., Bedingfield, S. E., & Akbarzadeh Khorshidi, H. (2019). A hybrid missing data imputation method for constructing city mobility indices. In R. Islam, Y. S. Koh, Y. Zhao, G. Warwick, D. Stirling, C-T. Li, & Z. Islam (Eds.), Data Mining - 16th Australasian Conference, AusDM 2018, Bathurst, NSW, Australia, November 28–30, 2018, Revised Selected Papers (pp. 135-148). (Communications in Computer and Information Science; Vol. 996). Singapore: Springer. https://doi.org/10.1007/978-981-13-6661-1_11
@inproceedings{34e46495de1e45de9751bbcaf1374443,
title = "A hybrid missing data imputation method for constructing city mobility indices",
abstract = "An effective missing data imputation method is essential for data mining and knowledge discovery from a comprehensive database with missing values. This paper proposes a new hybrid imputation method to effectively deal with the missing data issue of the Mobility in Cities Database (MCD) to construct city mobility indices. The hybrid method integrates the advantages of decision trees and fuzzy clustering into an iterative algorithm for missing data imputation. Extensive experiments conducted on the MCD and three commonly used datasets demonstrate that the hybrid method outperforms other existing effective imputation methods. With the MCD’s missing values imputed by the hybrid method, and using factor analysis and principal component analysis, this paper constructs city mobility indices for 63 cities in the MCD based on the novel concept of city mobility supply and demand. The city mobility indices constructed under a hierarchical structure of mobility supply and demand indicators represent substantial city mobility knowledge discovered from mining the MCD. The proposed hybrid method represents a significant contribution to missing data imputation research.",
keywords = "City mobility index, Decision tree, Factor analysis, Iterative fuzzy clustering, Missing data imputation, Principal component analysis",
author = "Sanaz Nikfalazar and Chung-Hsing Yeh and Bedingfield, {Susan Elizabeth} and {Akbarzadeh Khorshidi}, Hadi",
year = "2019",
doi = "10.1007/978-981-13-6661-1_11",
language = "English",
isbn = "9789811366604",
series = "Communications in Computer and Information Science",
publisher = "Springer",
pages = "135--148",
editor = "Rafiqul Islam and Koh, {Yun Sing} and Yanchang Zhao and Graco Warwick and David Stirling and Chang-Tsun Li and Zahidul Islam",
booktitle = "Data Mining - 16th Australasian Conference, AusDM 2018, Bathurst, NSW, Australia, November 28–30, 2018, Revised Selected Papers",

}

TY - GEN

T1 - A hybrid missing data imputation method for constructing city mobility indices

AU - Nikfalazar, Sanaz

AU - Yeh, Chung-Hsing

AU - Bedingfield, Susan Elizabeth

AU - Akbarzadeh Khorshidi, Hadi

PY - 2019

Y1 - 2019

N2 - An effective missing data imputation method is essential for data mining and knowledge discovery from a comprehensive database with missing values. This paper proposes a new hybrid imputation method to effectively deal with the missing data issue of the Mobility in Cities Database (MCD) to construct city mobility indices. The hybrid method integrates the advantages of decision trees and fuzzy clustering into an iterative algorithm for missing data imputation. Extensive experiments conducted on the MCD and three commonly used datasets demonstrate that the hybrid method outperforms other existing effective imputation methods. With the MCD’s missing values imputed by the hybrid method, and using factor analysis and principal component analysis, this paper constructs city mobility indices for 63 cities in the MCD based on the novel concept of city mobility supply and demand. The city mobility indices constructed under a hierarchical structure of mobility supply and demand indicators represent substantial city mobility knowledge discovered from mining the MCD. The proposed hybrid method represents a significant contribution to missing data imputation research.

KW - City mobility index

KW - Decision tree

KW - Factor analysis

KW - Iterative fuzzy clustering

KW - Missing data imputation

KW - Principal component analysis

UR - http://www.scopus.com/inward/record.url?scp=85063450033&partnerID=8YFLogxK

U2 - 10.1007/978-981-13-6661-1_11

DO - 10.1007/978-981-13-6661-1_11

M3 - Conference Paper

SN - 9789811366604

T3 - Communications in Computer and Information Science

SP - 135

EP - 148

BT - Data Mining - 16th Australasian Conference, AusDM 2018, Bathurst, NSW, Australia, November 28–30, 2018, Revised Selected Papers

A2 - Islam, Rafiqul

A2 - Koh, Yun Sing

A2 - Zhao, Yanchang

A2 - Warwick, Graco

A2 - Stirling, David

A2 - Li, Chang-Tsun

A2 - Islam, Zahidul

PB - Springer

CY - Singapore

ER -