From few to many maps: A fast map-level emulator for extreme augmentation of cosmic microwave background systematics datasets
Campeti, P.; Pagano, L.; Lattanzi, M.; Gerbino, M.
2025
Abstract
Context. Generating massive sets of end-to-end simulations of time-ordered data for Monte Carlo analyses in cosmic microwave background (CMB) experiments typically incurs exceedingly high computational costs.
Aims. To address this challenge, we introduce a novel, fast, and efficient generative model built upon scattering covariances, the most recent iteration of the scattering transform statistics. This model is designed to augment by several orders of magnitude the number of map simulations in datasets of computationally expensive CMB instrumental systematics simulations, including their non-Gaussian and inhomogeneous features. Unlike conventional neural network-based algorithms, this generative model requires only a minimal number of training samples, making it highly compatible with the computational constraints of typical CMB simulation campaigns. While our primary focus is on spherical data, the framework is inherently versatile and readily applicable to 1D and 2D planar data, leveraging the localized nature of scattering statistics.
Methods. We validated the method using realistic simulations of CMB systematics, which are particularly challenging to emulate, and performed extensive statistical tests to confirm its ability to produce new statistically independent approximate realizations.
Results. Remarkably, even when trained on as few as ten simulations, the emulator closely reproduces key summary statistics, including the angular power spectrum, scattering coefficients, and Minkowski functionals, and provides pixel covariance estimates with substantially reduced sample noise compared to those obtained without augmentation.
Conclusions. The proposed approach has the potential to shift the paradigm in simulation campaign design. Instead of producing large numbers of low- or medium-accuracy simulations, future pipelines can focus on generating a few high-accuracy simulations that are then efficiently augmented using such a generative model.
This promises significant benefits not only for current and forthcoming cosmological surveys such as Planck, LiteBIRD, Simons Observatory, CMB-S4, Euclid, and Rubin-LSST, but also for diverse fields including oceanography and climate science. We make both the general framework for scattering transform statistics, HealpixML, and the emulator, CMBSCAT, available to the community.

| File | Size | Format |
|---|---|---|
| aa54540-25.pdf (open access). Description: publisher's full text. Type: full text (publisher's version). License: Creative Commons | 6.93 MB | Adobe PDF |
Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.


