TY - JOUR
T1 - Ganga
T2 - A tool for computational-task management and easy access to Grid resources
AU - Mościcki, J. T.
AU - Brochu, F.
AU - Ebke, J.
AU - Egede, U.
AU - Elmsheuser, J.
AU - Harrison, K.
AU - Jones, R. W.L.
AU - Lee, H. C.
AU - Liko, D.
AU - Maier, A.
AU - Muraru, A.
AU - Patrick, G. N.
AU - Pajchel, K.
AU - Reece, W.
AU - Samset, B. H.
AU - Slater, M. W.
AU - Soroko, A.
AU - Tan, C. L.
AU - van der Ster, D. C.
AU - Williams, M.
PY - 2009/11/1
Y1 - 2009/11/1
N2 - In this paper, we present the computational task-management tool Ganga, which allows for the specification, submission, bookkeeping and post-processing of computational tasks on a wide set of distributed resources. Ganga has been developed to solve a problem increasingly common in scientific projects, which is that researchers must regularly switch between different processing systems, each with its own command set, to complete their computational tasks. Ganga provides a homogeneous environment for processing data on heterogeneous resources. We give examples from High Energy Physics, demonstrating how an analysis can be developed on a local system and then transparently moved to a Grid system for processing of all available data. Ganga has an API that can be used via an interactive interface, in scripts, or through a GUI. Specific knowledge about types of tasks or computational resources is provided at run-time through a plugin system, making new developments easy to integrate. We give an overview of the Ganga architecture, give examples of current use, and demonstrate how Ganga can be used in many different areas of science. Program summary: Program title: Ganga. Catalogue identifier: AEEN_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEN_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GPL. No. of lines in distributed program, including test data, etc.: 224 590. No. of bytes in distributed program, including test data, etc.: 14 365 315. Distribution format: tar.gz. Programming language: Python. Computer: personal computers, laptops. Operating system: Linux/Unix. RAM: 1 MB. Classification: 6.2, 6.5. Nature of problem: Management of computational tasks for scientific applications on heterogeneous distributed systems, including local, batch farms, opportunistic clusters and Grids. Solution method: High-level job management interface, including command line, scripting and GUI components. Restrictions: Access to the distributed resources depends on the installed third-party software, such as a batch system client or Grid user interface.
AB - In this paper, we present the computational task-management tool Ganga, which allows for the specification, submission, bookkeeping and post-processing of computational tasks on a wide set of distributed resources. Ganga has been developed to solve a problem increasingly common in scientific projects, which is that researchers must regularly switch between different processing systems, each with its own command set, to complete their computational tasks. Ganga provides a homogeneous environment for processing data on heterogeneous resources. We give examples from High Energy Physics, demonstrating how an analysis can be developed on a local system and then transparently moved to a Grid system for processing of all available data. Ganga has an API that can be used via an interactive interface, in scripts, or through a GUI. Specific knowledge about types of tasks or computational resources is provided at run-time through a plugin system, making new developments easy to integrate. We give an overview of the Ganga architecture, give examples of current use, and demonstrate how Ganga can be used in many different areas of science. Program summary: Program title: Ganga. Catalogue identifier: AEEN_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEN_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GPL. No. of lines in distributed program, including test data, etc.: 224 590. No. of bytes in distributed program, including test data, etc.: 14 365 315. Distribution format: tar.gz. Programming language: Python. Computer: personal computers, laptops. Operating system: Linux/Unix. RAM: 1 MB. Classification: 6.2, 6.5. Nature of problem: Management of computational tasks for scientific applications on heterogeneous distributed systems, including local, batch farms, opportunistic clusters and Grids. Solution method: High-level job management interface, including command line, scripting and GUI components. Restrictions: Access to the distributed resources depends on the installed third-party software, such as a batch system client or Grid user interface.
KW - Application configuration
KW - Data mining
KW - Grid computing
KW - Interoperability
KW - System integration
KW - Task management
KW - User interface
UR - http://www.scopus.com/inward/record.url?scp=70149092310&partnerID=8YFLogxK
U2 - 10.1016/j.cpc.2009.06.016
DO - 10.1016/j.cpc.2009.06.016
M3 - Article
AN - SCOPUS:70149092310
VL - 180
SP - 2303
EP - 2316
JO - Computer Physics Communications
JF - Computer Physics Communications
SN - 0010-4655
IS - 11
ER -