TY - JOUR
T1 - kNNVWC: An efficient k-nearest neighbors approach based on various-widths clustering
AU - Almalawi, Abdul Mohsen
AU - Fahad, Adil
AU - Tari, Zahir
AU - Cheema, Muhammad Aamir
AU - Khalil, Ibrahim
PY - 2016
N2 - The k-nearest neighbor (kNN) approach has been used extensively as a powerful non-parametric technique in many scientific and engineering applications. However, it incurs a large computational cost, and reducing this cost has become an active research area. This work presents kNNVWC, a novel kNN approach based on various-widths clustering that efficiently finds the kNNs of a query object in a given data set. kNNVWC first clusters the data set with a global width; each resulting cluster that meets predefined criteria is then recursively clustered with its own local width, suited to its distribution. This reduces clustering time and balances the number of produced clusters and their respective sizes. Search efficiency is maximized by using the triangle inequality to prune unlikely clusters. Experimental results demonstrate that kNNVWC outperforms a number of kNN search algorithms in finding kNNs for query objects, especially for data sets with high dimensionality, varied distributions and large size.
KW - Clustering
KW - K-nearest neighbour
KW - High dimensionality
KW - Performance
KW - SCADA
UR - https://www.scopus.com/pages/publications/84961615405
DO - 10.1109/TKDE.2015.2460735
M3 - Article
SN - 1041-4347
VL - 28
SP - 68
EP - 81
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 1
M1 - 7166319
ER -