BoF: Analyzing Parallel I/O

Parallel application I/O performance often does not meet user expectations. Additionally, slight access pattern modifications may lead to significant changes in performance due to complex interactions between hardware and software. These challenges call for sophisticated tools to capture, analyze, understand, and tune application I/O.

In this BoF, we will highlight recent advances in monitoring tools to help address this problem. We will also encourage community discussion to compare best practices, identify gaps in measurement and analysis, and find ways to translate parallel I/O analysis into actionable outcomes for users, facility operators, and researchers.

The BoF is held in conjunction with the Supercomputing conference. The official schedule is listed here.

Date: Tuesday, November 13th, 2018
Time: 5:15pm - 6:45pm
Venue: Room C155/156, Dallas, USA

The BoF is powered by the Virtual Institute for I/O and ESiWACE 1).

The BoF is organized by

We have a series of talks followed by a longer discussion:

  • Introduction – Phil Carns (Argonne National Laboratory)
    Slides
  • Using Benchmarks to Understand Performance Behavior – Julian Kunkel (University of Reading)
    Slides
  • A Peek into Workflow I/O – Jakob Lüttgau (DKRZ)
    Slides
  • I/O profiling in the field with Ellexus – Rosemary Francis (Ellexus)
    Slides
  • Monitoring of I/O with Lustre – Andreas Dilger (Whamcloud)
    Slides
  • IO Workload Throttling on Supercomputers – Si Liu (TACC)
    Slides
  • Panel and discussion

Phil Carns – Phil Carns is a principal software development specialist in the Mathematics and Computer Science Division of Argonne National Laboratory. He is also an adjunct associate professor of electrical and computer engineering at Clemson University and a fellow of the Northwestern-Argonne Institute for Science and Engineering. He received his Ph.D. in computer engineering from Clemson University in 2005. Phil's research interests include characterization of I/O patterns, simulation of large-scale storage systems, and design of high-performance system software.

Julian Kunkel – Dr. Kunkel is a Lecturer in the Computer Science Department at the University of Reading. He manages several research projects revolving around high-performance computing, particularly high-performance storage. Besides his main goal of providing efficient and performance-portable I/O, his HPC-related interests include data reduction techniques, performance analysis of parallel applications and parallel I/O, management of cluster systems, cost-efficiency considerations, and software engineering of scientific software.

Jakob Lüttgau – Jakob Lüttgau is a researcher at the German Climate Computing Center (DKRZ) and a Ph.D. student at the University of Hamburg, with a focus on the modeling, analysis, and architecture of I/O systems for HPC.

Rosemary Francis – Dr Rosemary Francis is the founder and CEO of Ellexus, the I/O profiling company. Rosemary obtained her PhD in Computer Architecture from the Cambridge University Computer Lab. After working in the chip design industry, Rosemary founded Ellexus to help HPC administrators and users take control of the way they access data.

Andreas Dilger – Andreas has been involved in Lustre since its inception. From early prototypes in 2000, through several companies over the next nineteen years, Andreas has been one of the lead Lustre developers. After joining Intel in 2012, he became Lustre Principal Architect, and is now Lustre CTO at Whamcloud/DDN.

Si Liu – Dr. Si Liu works in the High-Performance Computing group at the Texas Advanced Computing Center, where he manages the HPC Applications group. He conducts research on HPC applications and tools, and provides the science and engineering community with a superior user experience. He also collaborates with academic and industrial institutions all over the world on various data-intensive research projects that demand advanced HPC technology and cyberinfrastructure.

1) ESiWACE has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No 675191.