A Chaos Engineering Approach for Improving the Resiliency of IT Services Configurations

Poltronieri, Filippo (first author); Tortonesi, Mauro (second author); Stefanelli, Cesare (last author)
2022

Abstract

Testing the resiliency of complex IT services deployed in hybrid Cloud scenarios is a challenging task that requires expensive and possibly destructive operations. An interesting approach lies in Chaos Engineering, a set of practices to test the resiliency of software systems running in a production environment. However, Chaos Engineering is an expensive practice: it requires setting up complicated operations that further increase management complexity. To reduce this complexity, Chaos Engineering can benefit from the adoption of non-destructive approaches such as the definition of realistic digital twins. A digital twin is a virtual replica of a real system on which to experiment with management configurations. This paper embraces this research avenue by extending our previous efforts to integrate Chaos Engineering techniques into an IT services management framework called ChaosTwin. ChaosTwin leverages novel methodologies and tools capable of identifying and promptly reacting to unexpected failures. Finally, to implement autonomous fault management, ChaosTwin defines scaling and migration policies that can quickly search for more resilient placements of software components in case of system failures. We believe that ChaosTwin can provide useful guidance to service providers in finding cost-effective service configurations capable of minimizing the negative effects of unpredictable events.
ISBN: 9781665406017
Keywords: Chaos Engineering; Cloud Computing; Digital Twin; Optimization; Service Management
Files in this item:
File: A_Chaos_Engineering_Approach_for_Improving_the_Resiliency_of_IT_Services_Configurations (1).pdf
Access: archive administrators only
Description: Publisher's full text
Type: Full text (publisher's version)
License: NOT PUBLIC - Private/restricted access
Size: 1.06 MB
Format: Adobe PDF

Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2501908
Citations
  • PMC: not available
  • Scopus: 7
  • ISI: 3