A highly efficient data locality aware task scheduler for cloud-based systems

Ru Jia, Yun Yang, John Grundy, Jacky Keung, Hao Li

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

1 Citation (Scopus)

Abstract

Scheduling tasks close to their stored data can significantly reduce network traffic. Scheduling optimisation can improve data locality by attempting to place a task and its related data on the same node. Existing schedulers tend to emphasise data locality alone, ignoring the overhead of, and the trade-off between, data transfer and task placement, as well as bandwidth consumption. We present a novel data locality aware scheduler for balancing time consumption and network bandwidth traffic (DLAforBT) that improves data locality for tasks and throughput, with the optimal placement policy exhibiting a threshold-based structure. DLAforBT uses bipartite graph modelling to represent data placement, and adopts a judgment mechanism and a precise prediction model to decide whether to move data or move computation. It integrates an improved Dominant Resource Fairness (DRF) resource allocation to capture tenants' resource allocation and run as many jobs as possible. DLAforBT improves the data locality rate by 16% and throughput by 25%.
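The abstract does not give the details of the judgment mechanism or the prediction model, so the following is only a minimal sketch of what a threshold-based "move data or move computation" decision over a bipartite task-to-node mapping could look like. Every name here (`DATA_PLACEMENT`, `Estimate`, `decide`, `threshold_s`) and the cost model (transfer time as data size divided by bandwidth) is an assumption made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass

# Bipartite graph of data placement: tasks on one side, nodes on the other.
# An edge (task -> node) means the node stores that task's input data.
# Task and node names are made up for illustration.
DATA_PLACEMENT = {
    "task_a": {"node_1", "node_2"},
    "task_b": {"node_2"},
}


@dataclass
class Estimate:
    """Hypothetical predictions for one task; not taken from the paper."""
    data_size_mb: float    # size of the task's input data
    bandwidth_mb_s: float  # bandwidth available for moving that data
    local_wait_s: float    # predicted wait for a free slot on a data-holding node


def is_local(task: str, node: str) -> bool:
    """True if the node holds the task's data (an edge exists in the graph)."""
    return node in DATA_PLACEMENT.get(task, set())


def transfer_time_s(est: Estimate) -> float:
    """Assumed cost of moving the data: size divided by bandwidth."""
    return est.data_size_mb / est.bandwidth_mb_s


def decide(est: Estimate, threshold_s: float = 0.0) -> str:
    """Threshold rule: keep the task with its data unless waiting costs too much.

    Returns "move_computation" when the predicted local wait is no more than
    the transfer time plus the threshold, otherwise "move_data".
    """
    if est.local_wait_s <= transfer_time_s(est) + threshold_s:
        return "move_computation"
    return "move_data"
```

Under these assumptions, a task with 1000 MB of input on a 100 MB/s link and a 5 s predicted wait stays with its data, while the same wait for a 100 MB input makes transferring the data the cheaper option. Raising `threshold_s` biases the rule towards locality at the cost of longer waits.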

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services
Editors: Elisa Bertino, Carl K. Chang, Peter Chen, Ernesto Damiani, Michael Goul, Katsunori Oyama
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 496-498
Number of pages: 3
ISBN (Electronic): 9781728127057, 9781728127040
ISBN (Print): 9781728127064
DOIs: https://doi.org/10.1109/CLOUD.2019.00089
Publication status: Published - 2019
Event: International Conference on Cloud Computing 2019 - Milan, Italy
Duration: 8 Jul 2019 to 13 Jul 2019
Conference number: 12th
https://conferences.computer.org/cloud/2019/

Conference

Conference: International Conference on Cloud Computing 2019
Abbreviated title: CLOUD 2019
Country: Italy
City: Milan
Period: 8/07/19 to 13/07/19
Internet address: https://conferences.computer.org/cloud/2019/

Keywords

  • Bipartite graph modelling
  • Cloud computing
  • Data locality
  • Multi-tenancy
  • Scheduling

Cite this

Jia, R., Yang, Y., Grundy, J., Keung, J., & Li, H. (2019). A highly efficient data locality aware task scheduler for cloud-based systems. In E. Bertino, C. K. Chang, P. Chen, E. Damiani, M. Goul, & K. Oyama (Eds.), Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services (pp. 496-498). [8814565] Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/CLOUD.2019.00089
Jia, Ru ; Yang, Yun ; Grundy, John ; Keung, Jacky ; Li, Hao. / A highly efficient data locality aware task scheduler for cloud-based systems. Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services. editor / Elisa Bertino ; Carl K. Chang ; Peter Chen ; Ernesto Damiani ; Michael Goul ; Katsunori Oyama. Piscataway NJ USA : IEEE, Institute of Electrical and Electronics Engineers, 2019. pp. 496-498
@inproceedings{f8f38dcbeafe41fb93dcc9f66d634ba5,
title = "A highly efficient data locality aware task scheduler for cloud-based systems",
abstract = "Scheduling tasks close to their stored data can significantly reduce network traffic. Scheduling optimisation can improve data locality by attempting to place a task and its related data on the same node. Existing schedulers tend to emphasise data locality alone, ignoring the overhead of, and the trade-off between, data transfer and task placement, as well as bandwidth consumption. We present a novel data locality aware scheduler for balancing time consumption and network bandwidth traffic (DLAforBT) that improves data locality for tasks and throughput, with the optimal placement policy exhibiting a threshold-based structure. DLAforBT uses bipartite graph modelling to represent data placement, and adopts a judgment mechanism and a precise prediction model to decide whether to move data or move computation. It integrates an improved Dominant Resource Fairness (DRF) resource allocation to capture tenants' resource allocation and run as many jobs as possible. DLAforBT improves the data locality rate by 16{\%} and throughput by 25{\%}.",
keywords = "Bipartite graph modelling, Cloud computing, Data locality, Multi-tenancy, Scheduling",
author = "Ru Jia and Yun Yang and John Grundy and Jacky Keung and Hao Li",
year = "2019",
doi = "10.1109/CLOUD.2019.00089",
language = "English",
isbn = "9781728127064",
pages = "496--498",
editor = "Elisa Bertino and Chang, {Carl K.} and Peter Chen and Ernesto Damiani and Michael Goul and Katsunori Oyama",
booktitle = "Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "United States of America",

}

Jia, R, Yang, Y, Grundy, J, Keung, J & Li, H 2019, A highly efficient data locality aware task scheduler for cloud-based systems. in E Bertino, CK Chang, P Chen, E Damiani, M Goul & K Oyama (eds), Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services., 8814565, IEEE, Institute of Electrical and Electronics Engineers, Piscataway NJ USA, pp. 496-498, International Conference on Cloud Computing 2019, Milan, Italy, 8/07/19. https://doi.org/10.1109/CLOUD.2019.00089

A highly efficient data locality aware task scheduler for cloud-based systems. / Jia, Ru; Yang, Yun; Grundy, John; Keung, Jacky; Li, Hao.

Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services. ed. / Elisa Bertino; Carl K. Chang; Peter Chen; Ernesto Damiani; Michael Goul; Katsunori Oyama. Piscataway NJ USA : IEEE, Institute of Electrical and Electronics Engineers, 2019. p. 496-498 8814565.

TY - GEN

T1 - A highly efficient data locality aware task scheduler for cloud-based systems

AU - Jia, Ru

AU - Yang, Yun

AU - Grundy, John

AU - Keung, Jacky

AU - Li, Hao

PY - 2019

Y1 - 2019

AB - Scheduling tasks close to their stored data can significantly reduce network traffic. Scheduling optimisation can improve data locality by attempting to place a task and its related data on the same node. Existing schedulers tend to emphasise data locality alone, ignoring the overhead of, and the trade-off between, data transfer and task placement, as well as bandwidth consumption. We present a novel data locality aware scheduler for balancing time consumption and network bandwidth traffic (DLAforBT) that improves data locality for tasks and throughput, with the optimal placement policy exhibiting a threshold-based structure. DLAforBT uses bipartite graph modelling to represent data placement, and adopts a judgment mechanism and a precise prediction model to decide whether to move data or move computation. It integrates an improved Dominant Resource Fairness (DRF) resource allocation to capture tenants' resource allocation and run as many jobs as possible. DLAforBT improves the data locality rate by 16% and throughput by 25%.

KW - Bipartite graph modelling

KW - Cloud computing

KW - Data locality

KW - Multi-tenancy

KW - Scheduling

UR - http://www.scopus.com/inward/record.url?scp=85072308889&partnerID=8YFLogxK

U2 - 10.1109/CLOUD.2019.00089

DO - 10.1109/CLOUD.2019.00089

M3 - Conference Paper

SN - 9781728127064

SP - 496

EP - 498

BT - Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services

A2 - Bertino, Elisa

A2 - Chang, Carl K.

A2 - Chen, Peter

A2 - Damiani, Ernesto

A2 - Goul, Michael

A2 - Oyama, Katsunori

PB - IEEE, Institute of Electrical and Electronics Engineers

CY - Piscataway NJ USA

ER -

Jia R, Yang Y, Grundy J, Keung J, Li H. A highly efficient data locality aware task scheduler for cloud-based systems. In Bertino E, Chang CK, Chen P, Damiani E, Goul M, Oyama K, editors, Proceedings - 2019 IEEE International Conference on Cloud Computing - IEEE CLOUD 2019 - Part of the 2019 IEEE World Congress on Services. Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. 2019. p. 496-498. 8814565 https://doi.org/10.1109/CLOUD.2019.00089