CloudBATCH: A batch job queuing system on clouds with Hadoop and HBase

Chen Zhang, Hans De Sterck

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research; peer-review

17 Citations (Scopus)

Abstract

As MapReduce becomes increasingly popular in data processing applications, the demand for Hadoop clusters grows. However, Hadoop is incompatible with existing cluster batch job queuing systems and requires a dedicated cluster under its full control. Hadoop also lacks support for user access control, accounting, fine-grained performance monitoring and legacy batch job processing facilities comparable to existing cluster job queuing systems, making dedicated Hadoop clusters less amenable to administrators and normal users alike with hybrid computing needs involving both MapReduce and legacy applications. As a result, getting a properly suited and sized Hadoop cluster has not been easy in organizations with existing clusters. This paper presents CloudBATCH, a prototype solution to this problem enabling Hadoop to function as a traditional batch job queuing system with enhanced functionality for cluster resource management. With CloudBATCH, a complete shift to Hadoop for managing an entire cluster to cater for hybrid computing needs becomes feasible.

Original language: English
Title of host publication: Proceedings - 2nd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2010
Pages: 368-375
Number of pages: 8
Publication status: Published - 2010
Externally published: Yes
Event: IEEE International Conference on Cloud Computing Technology and Science 2010 - Indianapolis, United States of America
Duration: 30 Nov 2010 to 3 Dec 2010
Conference number: 2nd

Conference

Conference: IEEE International Conference on Cloud Computing Technology and Science 2010
Abbreviated title: CloudCom 2010
Country/Territory: United States of America
City: Indianapolis
Period: 30/11/10 to 3/12/10
