I/O performance of the Santos Dumont supercomputer

 

Program:
Meteorology
First Author:
Jean Luca Bez
Publication Year:
2020
Journal Name:
The International Journal of High Performance Computing Applications
Publication Type:
Article published in a journal
Scope:
International publication
Title: I/O performance of the Santos Dumont supercomputer
Publication Type: Journal Article
Publication Year: 2020
Authors: Bez JL, Carneiro AR, Pavan PJ, Girelli VS, Boito FZ, Fagundes BA, Osthoff C, da Silva Dias PL, Méhaut J-F, Navaux POA
Journal: The International Journal of High Performance Computing Applications
Volume: 34
Issue: 2
Pages: 227-245
Publication Date: 03/2020
ISSN: 1741-2846
Abstract

In this article, we study the I/O performance of the Santos Dumont supercomputer, since the gap between processing and data access speeds causes many applications to spend a large portion of their execution on I/O operations. For a large-scale expensive supercomputer, it is essential to ensure applications achieve the best I/O performance to promote efficient usage. We monitor a week of the machine’s activity and present a detailed study on the obtained metrics, aiming at providing an understanding of its workload. From experiences with one numerical simulation, we identified large I/O performance differences between the MPI implementations available to users. We investigated the phenomenon and narrowed it down to collective I/O operations with small request sizes. For these, we concluded that the customized MPI implementation by the machine’s vendor (used by more than 20% of the jobs) presents the worst performance. By investigating the issue, we provide information to help improve future MPI-IO collective write implementations and practical guidelines to help users and steer future system upgrades. Finally, we discuss the challenge of describing applications’ I/O behavior without depending on information from users. That allows for identifying the application’s I/O bottlenecks and proposing ways of improving its I/O performance. We propose a methodology to do so, and use GROMACS, the application with the largest number of jobs in 2017, as a case study.

URL: https://journals.sagepub.com/doi/10.1177/1094342019868526
DOI: 10.1177/1094342019868526
Short Title: The International Journal of High Performance Computing Applications