Ganga: A tool for computational-task management and easy access to Grid resources

J. T. Mościcki, F. Brochu, J. Ebke, U. Egede, J. Elmsheuser, K. Harrison, R. W.L. Jones, H. C. Lee, D. Liko, A. Maier, A. Muraru, G. N. Patrick, K. Pajchel, W. Reece, B. H. Samset, M. W. Slater, A. Soroko, C. L. Tan, D. C. van der Ster, M. Williams

Research output: Contribution to journal; Article; Research; peer-review

78 Citations (Scopus)

Abstract

In this paper, we present the computational task-management tool Ganga, which allows for the specification, submission, bookkeeping and post-processing of computational tasks on a wide set of distributed resources. Ganga has been developed to solve a problem increasingly common in scientific projects, which is that researchers must regularly switch between different processing systems, each with its own command set, to complete their computational tasks. Ganga provides a homogeneous environment for processing data on heterogeneous resources. We give examples from High Energy Physics, demonstrating how an analysis can be developed on a local system and then transparently moved to a Grid system for processing of all available data. Ganga has an API that can be used via an interactive interface, in scripts, or through a GUI. Specific knowledge about types of tasks or computational resources is provided at run-time through a plugin system, making new developments easy to integrate. We give an overview of the Ganga architecture, give examples of current use, and demonstrate how Ganga can be used in many different areas of science.

Program summary
Program title: Ganga
Catalogue identifier: AEEN_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEN_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPL
No. of lines in distributed program, including test data, etc.: 224 590
No. of bytes in distributed program, including test data, etc.: 14 365 315
Distribution format: tar.gz
Programming language: Python
Computer: personal computers, laptops
Operating system: Linux/Unix
RAM: 1 MB
Classification: 6.2, 6.5
Nature of problem: Management of computational tasks for scientific applications on heterogeneous distributed systems, including local, batch farms, opportunistic clusters and Grids.
Solution method: High-level job management interface, including command line, scripting and GUI components.
Restrictions: Access to the distributed resources depends on installed third-party software, such as a batch system client or Grid user interface.

Original language: English
Pages (from-to): 2303-2316
Number of pages: 14
Journal: Computer Physics Communications
Volume: 180
Issue number: 11
Publication status: Published - 1 Nov 2009
Externally published: Yes

Keywords

  • Application configuration
  • Data mining
  • Grid computing
  • Interoperability
  • System integration
  • Task management
  • User interface
