Grid accounting for computing and storage resources towards standardization

CRISTOFORI, Andrea
2011

Abstract

In recent years, first the scientific community and then commercial vendors have shown growing interest in new technologies such as Grid and Cloud computing. Grid computing in particular was born to meet the enormous computational demands of physics experiments, especially the Large Hadron Collider (LHC) experiments at CERN (Conseil Européen pour la Recherche Nucléaire, the European Laboratory for Particle Physics) in Geneva. Other scientific disciplines that benefit from these technologies include biology, astronomy, earth sciences and life sciences. Grid systems allow heterogeneous computational and storage resources to be shared among geographically distributed institutes, agencies and universities. For this purpose, technologies have been developed to support communication, authentication, and the storage and processing of the required software and scientific data. This gives scientific communities access to computational resources that a single institute could not host for logistical and cost reasons.

Grid systems were not the only answer to the growing resource needs of these communities. Over the same period, Cloud computing has established itself. Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. Both computing paradigms, as well as the use of storage resources, rely on various authentication and authorization tools.

Using these technologies requires systems that account for the resources consumed. Such systems are built on top of the existing infrastructure and collect all the necessary data about the users, groups and resources involved. This information is then gathered in central repositories, where it can be analyzed and aggregated. The Open Grid Forum (OGF) is the international body that develops standards for the Grid environment. The Usage Record Working Group (UR-WG), formed within OGF, aims to standardize the structure and publication of the Usage Record (UR) for different kinds of resources. Up to now, the emphasis has been on accounting for computational resources. Over time, the need emerged to extend these concepts to other areas, and in particular to define and implement a standard UR for storage accounting. This thesis proposes several extensions to the UR definition and describes the developments proposed in this field.

The Distributed Grid Accounting System (DGAS) has been chosen, among the available tools, as the accounting system for the Italian Grid, and it has also been adopted in other countries such as Greece and Germany. Together with HLRmon, it offers a complete accounting system, and it is the tool used for the thesis work carried out at INFN-CNAF.

• In Chapter 1, I focus on the paradigm of distributed computing and introduce the Grid infrastructure, with particular emphasis on the gLite middleware and the EGI-InSPIRE project.
• In Chapter 2, I discuss some Grid accounting systems for computational resources, with particular emphasis on DGAS.
• Chapter 3 presents the cross-check monitoring system used to verify the correctness of the data gathered at the INFN-CNAF Tier1.
• Chapter 4 introduces another important aspect of accounting, namely accounting for storage resources, and presents the definition of a standard UR for storage accounting.
• Chapter 5 presents an implementation of a new storage accounting system that uses the definitions given in Chapter 4.
• Chapter 6 moves to the performance and reliability tests performed on the latest development release of DGAS, which implements ActiveMQ as a standard transport mechanism.
• Appendix A collects the BASH scripts and SQL code that are part of the cross-check tool described in Chapter 3.
• Appendix B collects the scripts used in the implementation of the accounting system described in Chapter 5.
• Appendix C collects the scripts and configurations used for the tests of the ActiveMQ implementation of DGAS described in Chapter 6.
• Appendix D collects the publications to which I contributed during the thesis work.
LUPPI, Eleonora
RUGGIERO, Valeria
File in this record: 364.pdf (Adobe PDF, 6.7 MB), open access
Type: Doctoral thesis
License: Not specified

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2389368