An evaluation of data stream processing systems for data driven applications

Jonathan Samosir, Maria Indrawan-Santiago, Pari Delir Haghighi

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    23 Citations (Scopus)

    Abstract

    Real-time data stream processing technologies play an important role in enabling time-critical decision making in many applications. This paper evaluates the performance of platforms capable of processing streaming data. The candidate technologies are Storm, Samza, and Spark Streaming. To form a recommendation, a prototype pipeline is designed and implemented on each platform using data collected from sensors that monitor heavy-haul railway systems. Each candidate platform is tested and evaluated using both quantitative and qualitative metrics, and the paper reports the findings, in which Storm is found to be the most appropriate candidate.

    Original language: English
    Title of host publication: International Conference on Computational Science 2016, ICCS 2016
    Subtitle of host publication: 6-8 June 2016, San Diego, California, USA
    Editors: Ilkay Altintas, Michael Norman, Jack Dongarra, Valeria V. Krzhizhanovskaya, Michael Lees, Peter M. A. Sloot
    Place of Publication: Amsterdam, Netherlands
    Publisher: Elsevier
    Pages: 439-449
    Number of pages: 11
    Publication status: Published - 2016
    Event: International Conference on Computational Science 2016 - San Diego, United States of America
    Duration: 6 Jun 2016 to 8 Jun 2016
    Conference number: 16th
    https://www.iccs-meeting.org/iccs2016/

    Conference

    Conference: International Conference on Computational Science 2016
    Abbreviated title: ICCS 2016
    Country: United States of America
    City: San Diego
    Period: 6/06/16 to 8/06/16

    Keywords

    • Big data
    • Hadoop ecosystems
    • Real-time data stream processing
    • Samza
    • Spark
    • Storm
