A new speculative execution algorithm based on C4.5 decision tree for Hadoop

Yuanzhen Li, Qun Yang, Shangqi Lai, Bohan Li

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

2 Citations (Scopus)

Abstract

As a distributed computing platform, Hadoop provides an effective way to handle big data. In Hadoop, a job's completion time can be delayed by a straggler. Although the definitive cause of a straggler is hard to detect, speculative execution is commonly used to deal with the problem by backing up stragglers on alternative nodes. In this paper, we design a new Speculative Execution algorithm based on the C4.5 Decision Tree, SECDT, for Hadoop. In SECDT, we predict the completion time of stragglers and of their backup tasks using the C4.5 decision tree method. We then compare the predicted completion time of each straggler with that of its backup task, calculate the difference, and select the straggler with the maximum difference to start a backup task. Experimental results show that SECDT predicts execution time more accurately than other speculative execution methods and hence reduces job completion time.
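The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Task` type, the function name, and the two predictor callables are hypothetical stand-ins (in SECDT the predictions would come from C4.5 decision trees), and skipping stragglers whose backup is not predicted to finish sooner is an added assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for a running Hadoop task attempt."""
    task_id: str


# A predictor maps a task to an estimated time (in seconds) until completion.
Predictor = Callable[[Task], float]


def select_backup_candidate(
    stragglers: Sequence[Task],
    predict_remaining: Predictor,  # estimate if the straggler keeps running
    predict_backup: Predictor,     # estimate for a fresh backup on another node
) -> Optional[Task]:
    """Return the straggler with the largest predicted time saving.

    Assumption (not stated in the abstract): a backup is only worth
    launching when the predicted saving is strictly positive.
    """
    best: Optional[Task] = None
    best_gain = 0.0
    for task in stragglers:
        gain = predict_remaining(task) - predict_backup(task)
        if gain > best_gain:
            best, best_gain = task, gain
    return best
```

With made-up estimates of 40 s, 90 s, and 30 s remaining for three stragglers and 35 s, 50 s, and 45 s for their backups, the second task has the largest difference (40 s) and would be chosen.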

Original language: English
Title of host publication: Intelligent Computation in Big Data Era - International Conference of Young Computer Scientists, Engineers and Educators, ICYCSEE 2015, Proceedings
Editors: Hongzhi Wang, Wanxiang Che, Zhaowen Qiu, Zhongyuan Han, Junyu Lin, Haoliang Qi, Zeguang Lin, Leilei Kong
Publisher: Springer
Pages: 284-291
Number of pages: 8
ISBN (Electronic): 9783662462478
Publication status: Published - 2015
Externally published: Yes
Event: International Conference of Young Computer Scientists, Engineers and Educators 2015 - Harbin, China
Duration: 10 Jan 2015 to 12 Jan 2015
https://link.springer.com/book/10.1007/978-3-662-46248-5 (Proceedings)

Publication series

Name: IFIP Advances in Information and Communication Technology
Volume: 503
ISSN (Print): 1868-4238

Conference

Conference: International Conference of Young Computer Scientists, Engineers and Educators 2015
Abbreviated title: ICYCSEE 2015
Country/Territory: China
City: Harbin
Period: 10/01/15 to 12/01/15

Keywords

  • C4.5 decision tree
  • Hadoop
  • Speculative execution
