The Knights Landing (KNL) is the codename for the latest generation of Intel processors based on Intel Many Integrated Core (MIC) architecture. It relies on massive thread and data parallelism, and fast on-chip memory. This processor operates in standalone mode, booting an off-the-shelf Linux operating system. The KNL peak performance is very high – approximately 3 Tflops in double precision and 6 Tflops in single precision – but sustained performance depends critically on how well all parallel features of the processor are exploited by real-life applications. We assess the performance of this processor for Lattice Boltzmann codes, widely used in computational fluid-dynamics. In our OpenMP code we consider several memory data-layouts that meet the conflicting computing requirements of distinct parts of the application, and sustain a large fraction of peak performance. We make some performance comparisons with other processors and accelerators, and also discuss the impact of the various memory layouts on energy efficiency.
Early experience on using Knights Landing processors for Lattice Boltzmann Applications
Calore, EnricoPrimo
;Gabbana, AlessandroSecondo
;Schifano, Sebastiano Fabio
Penultimo
;Tripiccione, RaffaeleUltimo
2018
Abstract
The Knights Landing (KNL) is the codename for the latest generation of Intel processors based on Intel Many Integrated Core (MIC) architecture. It relies on massive thread and data parallelism, and fast on-chip memory. This processor operates in standalone mode, booting an off-the-shelf Linux operating system. The KNL peak performance is very high – approximately 3 Tflops in double precision and 6 Tflops in single precision – but sustained performance depends critically on how well all parallel features of the processor are exploited by real-life applications. We assess the performance of this processor for Lattice Boltzmann codes, widely used in computational fluid-dynamics. In our OpenMP code we consider several memory data-layouts that meet the conflicting computing requirements of distinct parts of the application, and sustain a large fraction of peak performance. We make some performance comparisons with other processors and accelerators, and also discuss the impact of the various memory layouts on energy efficiency.File | Dimensione | Formato | |
---|---|---|---|
1804.01918.pdf
solo gestori archivio
Descrizione: Pre-print
Tipologia:
Pre-print
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
1.78 MB
Formato
Adobe PDF
|
1.78 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Early experience 2018.pdf
solo gestori archivio
Descrizione: Full text editoriale
Tipologia:
Full text (versione editoriale)
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
1.94 MB
Formato
Adobe PDF
|
1.94 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in SFERA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.