Parallelizing degraded read for erasure coded cloud storage systems using collective communications

Peng Li, Xingtong Jin, Rebecca J. Stones, Gang Wang, Zhongwei Li, Xiaoguang Liu, Mingming Ren

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

5 Citations (Scopus)

Abstract

To lower storage costs, storage systems are increasingly transitioning from replication to erasure codes. However, an erasure-coded system must read and transfer more data during recovery, which results in high degraded read latency. We design a new parallel degraded read method, Collective Reconstruction Read, which aims to reduce this latency through parallel reconstruction. By introducing collective communication operations (e.g. all-to-one reduction and all-to-all reduction) into distributed storage systems, data reading, transferring and decoding are performed in parallel by all of the involved data nodes rather than by the client alone. As a result, the time complexity of the degraded read operation is reduced from linear to logarithmic. We implement Collective Reconstruction Read in HDFS-RAID and evaluate it as the block size and stripe size vary. We find that these algorithms reduce degraded read latency significantly, thereby improving system availability. Specifically, experimental results indicate a drop of approximately 55% to 81% in degraded read latency.
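The move from linear to logarithmic time can be illustrated with a small simulation of an all-to-one reduction. This is only a sketch under simplifying assumptions and is not the authors' implementation: it uses a single XOR parity block in place of a general erasure code, it runs in one process rather than across storage nodes, and the names `xor_blocks` and `tree_reduce` are hypothetical.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Combine two equal-length blocks byte by byte with XOR."""
    return bytes(x ^ y for x, y in zip(a, b))


def tree_reduce(blocks):
    """Simulate an all-to-one reduction over the surviving blocks.

    In each round, node i absorbs the partial result held by node i + step.
    All pairs within a round could exchange data in parallel, so the number
    of rounds (not the number of blocks) bounds the communication time.
    Returns (result held by node 0, number of rounds).
    """
    partial = list(blocks)
    rounds = 0
    step = 1
    while step < len(partial):
        for i in range(0, len(partial) - step, 2 * step):
            partial[i] = xor_blocks(partial[i], partial[i + step])
        step *= 2
        rounds += 1
    return partial[0], rounds


# Assumed toy stripe: 8 data blocks plus one XOR parity block.
data = [bytes([i] * 4) for i in range(8)]
parity = reduce(xor_blocks, data)

# Block 3 is unavailable; the 7 remaining data blocks and the parity survive.
survivors = data[:3] + data[4:] + [parity]
recovered, rounds = tree_reduce(survivors)
```

In this toy stripe a client-side degraded read would pull all 8 surviving blocks over one link in sequence, whereas the tree reduction finishes in 3 rounds (the base-2 logarithm of 8) and delivers a single reconstructed block.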

Original language: English
Title of host publication: 2016 IEEE Trustcom/BigDataSE/ISPA
Subtitle of host publication: Tianjin, China, 23-26 August, 2016, [Proceedings]
Editors: Yang Xiang, Kui Ren, Dengguo Feng
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1272-1279
Number of pages: 8
ISBN (Electronic): 9781509032051
ISBN (Print): 9781509032068
Publication status: Published - 2016
Externally published: Yes
Event: IEEE International Symposium on Parallel and Distributed Processing with Applications 2016 - Tianjin, China
Duration: 23 Aug 2016 to 26 Aug 2016
Conference number: 14th
https://ieeexplore.ieee.org/xpl/conhome/7845250/proceeding (Proceedings)

Conference

Conference: IEEE International Symposium on Parallel and Distributed Processing with Applications 2016
Abbreviated title: ISPA 2016
Country: China
City: Tianjin
Period: 23/08/16 to 26/08/16
