Dependable large scale behavioral patterns mining from sensor data using Hadoop platform

Md. Mamunur Rashid, Iqbal Gondal, Joarder Kamruzzaman

    Research output: Contribution to journal > Article > Research > peer-review

    32 Citations (Scopus)

    Abstract

    Wireless sensor networks (WSNs) will be an integral part of the future Internet of Things (IoT) environment and will generate large volumes of data. However, these data are only of benefit if useful knowledge can be mined from them. A data mining framework for WSNs includes data extraction, storage and mining techniques, and must be efficient and dependable. In this paper, we propose a new type of behavioral pattern mined from sensor data, called regularly frequent sensor patterns (RFSPs). RFSPs can identify a set of temporally correlated sensors, which can reveal significant knowledge from the monitored data. A distributed data extraction model to prepare the data required for mining RFSPs is proposed, as the distributed scheme ensures higher availability through greater redundancy. The tree structure for RFSP is compact, requires less memory and can be constructed using only a single scan through the dataset, and the mining technique is efficient with low runtime. Current mining techniques in the literature on sensor data employ a single memory-based sequential approach and hence are not efficient. Moreover, usage of the MapReduce model for the distributed solution has not been explored extensively. Since MapReduce is becoming the de facto model for computation on large data, we also propose a parallel implementation of the RFSP mining algorithm, called RFSP on Hadoop (RFSP-H), which uses a MapReduce-based framework to gain further efficiency. Experiments conducted to evaluate the compactness and performance of the data extraction model, RFSP-tree and RFSP-H mining show improved results.
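
    The abstract does not define RFSPs formally, so the following is only an illustrative sketch of the general idea in plain Python, not the authors' RFSP-tree or RFSP-H implementation. It assumes that a sensor set is "frequent" when it appears in at least `min_support` epochs and "regular" when the largest gap between consecutive appearances is at most `max_gap`; both thresholds, the gap-based regularity measure, and all function names are hypothetical. The map and reduce steps are simulated in memory by brute-force enumeration rather than run on Hadoop.

```python
from collections import defaultdict
from itertools import combinations


def map_epoch(epoch_id, active_sensors, max_size=2):
    """Map step (hypothetical): emit (sensor_set, epoch_id) pairs for one epoch."""
    sensors = sorted(active_sensors)
    for size in range(1, max_size + 1):
        for combo in combinations(sensors, size):
            yield combo, epoch_id


def reduce_pattern(epoch_ids, total_epochs):
    """Reduce step (hypothetical): return (support, largest gap) for one sensor set."""
    ids = sorted(epoch_ids)
    # Assumption: gaps are also measured from the start (0) and to the last epoch.
    points = [0] + ids + [total_epochs]
    largest_gap = max(b - a for a, b in zip(points, points[1:]))
    return len(ids), largest_gap


def mine(epochs, min_support, max_gap, max_size=2):
    """Group map output by sensor set, then keep sets passing both thresholds."""
    grouped = defaultdict(list)
    for epoch_id, active in enumerate(epochs, start=1):
        for pattern, eid in map_epoch(epoch_id, active, max_size):
            grouped[pattern].append(eid)
    result = {}
    for pattern, ids in grouped.items():
        support, gap = reduce_pattern(ids, len(epochs))
        if support >= min_support and gap <= max_gap:
            result[pattern] = (support, gap)
    return result


# Toy data: each set lists the sensors that reported in one epoch.
epochs = [{"s1", "s2"}, {"s1", "s2", "s3"}, {"s2"}, {"s1", "s2"}, {"s3"}, {"s1", "s2"}]
patterns = mine(epochs, min_support=3, max_gap=2)
```

    In this toy run, `s1`, `s2` and the pair `(s1, s2)` pass both thresholds, while `s3` is dropped for low support. Enumerating every sensor combination per epoch grows quickly with pattern size, which is the kind of cost a compact tree structure and a distributed implementation are meant to reduce.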

    Original language: English
    Pages (from-to): 128-145
    Number of pages: 18
    Journal: Information Sciences
    Volume: 379
    Publication status: Published - 10 Feb 2017

    Keywords

    • Data mining
    • Frequent pattern
    • Knowledge discovery
    • MapReduce
    • Regularly frequent sensor pattern
    • Wireless sensor networks
