HPC-IODC: HPC I/O in the Data Center Workshop

Managing scientific data at a large scale is challenging for both scientists and the host data centre.

The storage and file systems deployed within a data centre are expected to meet users' requirements for data integrity and high performance across heterogeneous and concurrently running applications.

With new storage technologies and layers in the memory hierarchy, the picture is becoming even murkier. To effectively manage the data load within a data centre, I/O experts must understand how users expect to use the storage and what services they should provide to enhance user productivity.

In this workshop, we bring together I/O experts from data centres and application workflows to share current practices for scientific workflows, discuss issues and obstacles in both hardware and the software stack, and present R&D to overcome these issues. We seek to ensure that a systems-level perspective is included in these discussions.

The workshop is built on the following tracks, each with a call for papers/talks:

  • Research paper track – Requesting submissions regarding state-of-the-practice and research about I/O in the data centre (see our topic list).
  • Talks from I/O experts – Requesting submissions of talk abstracts.
  • Student Mentoring Sessions

We are excited to announce that research papers will be published open access in Springer's LNCS, and extended manuscripts can additionally appear in the Journal of High-Performance Storage (JHPS). Contributions to both tracks are peer-reviewed and require submission of the respective research paper or presentation idea via EasyChair (see the complete description in Track: Research Papers).

The workshop is held in conjunction with ISC-HPC during the ISC workshop day. Note that attendance at ISC workshops requires a workshop pass. See also last year's workshop web page.

Date 2024-05-16
Venue Hamburg, Germany
Contact Dr. Julian Kunkel

This workshop is powered by the Virtual Institute for I/O and the Journal of High-Performance Storage.

The workshop is organised by Julian Kunkel (GWDG/University of Göttingen), Jay Lofstead (Sandia National Laboratories), and Jean-Thomas Acquaviva (DDN).

Agenda

You must register to attend the workshop.

  • 09:00 Welcome - Julian Kunkel (GWDG/Uni Göttingen), Jay Lofstead (Sandia National Laboratories), Jean-Thomas Acquaviva (DDN)
  • 09:15 Session about storage semantics
    • Exploring Data Paths in HPC Systems using the IO500 - Sarah Neuwirth
      Parallel storage systems are sophisticated infrastructures designed to handle immense volumes of data concurrently. Their design is advancing beyond the traditional two-tiered distributed file system and archive model by introducing new tiers of temporary, fast storage close to the computing resources with distinctly different performance characteristics. Major challenges arise in maintaining coherence and minimizing latency in this intricate network, demanding advanced storage architectures, efficient data management, and synchronization techniques. Therefore, it is critical to bridge performance gaps through systematic monitoring and profiling, modeling, optimization, and analysis of the I/O and data paths in HPC systems to meet the colossal data demands of large-scale workloads. This talk will explore the challenges of understanding and analyzing I/O and data paths in modern parallel storage architectures by leveraging the IO500 benchmark and list. In addition, the challenges and problems of modern infrastructures for performance monitoring and profiling will be discussed in order to better understand and optimize extreme-scale I/O in the future.
    • Uncovering Hidden Performance Metrics from IO500 Benchmark Analysis - Julian Kunkel
    • Introducing I/O-Verbs, a Semantics-Aware API as a POSIX Alternative - Sebastian Oeste
  • 10:45 Scalable Storage Competition 2425 - Julian Kunkel
  • 11:00 Break
  • 11:30
    • Debunking the I/O Myths of Large Language Model Training - Subramanian Kartik and Glenn Lockwood
    • Introducing the Metric Proxy for Holistic I/O Measurements - Ahmad Tarraf, Jean-Baptiste Besnard, Alberto Cascajo and Sameer Shende
      High-performance computing (HPC) systems face a broad spectrum of I/O patterns from various sources, including workflows, in-situ data operations, and ad hoc file storage. However, accurately monitoring these workloads at scale is challenging because multiple layers of the stack interfere with system performance metrics. The metric proxy, developed in the EuroHPC project ADMIRE, addresses this by providing real-time insights into system states while reducing overhead and storage constraints. By utilizing a Tree-Based Overlay Network (TBON) topology, it efficiently collects metrics across nodes in HPC systems. In this talk, we explore the conceptual foundation of the metric proxy and its architectural design. Furthermore, we demonstrate how it can be used with Extra-P for continuous I/O performance modelling and with FTIO for detecting periodic I/O workload patterns, ultimately aiding in more informed system optimization strategies. (An illustrative sketch of such frequency-based period detection follows the agenda.)
    • Secure Elasticsearch Clusters on HPC Systems for Sensitive Data - Hendrik Nolte, Lars Quentin and Julian Kunkel
  • 13:00 Lunch break
  • 14:00
    • The Use of Object Storage for Advanced, Complex Scientific Research Data Holdings and Workflows at The Pawsey Supercomputing Centre - Chris Schlipalius
      This presentation covers the novel use of object storage deployed on scalable commodity hardware for complex and advanced scientific data workflows involving large amounts of research data. This is a new development in data storage: historically, at the Pawsey Supercomputing Centre and other HPC centres, only (fast) parallel POSIX filesystems, either standalone or as part of an HSM, were deployed for these types of data holdings and workflows. Pawsey has deployed, at scale, fast, accessible object storage for use with a number of large data projects, historical data archives, new radio astronomy observations and other large scientific data projects. This talk will cover how these object storage services were implemented (based on the design goals and how they were assessed), some examples of what object storage is used for, including storage for HPC workflows, and how this complements the continuing use and sharing of large scientific datasets and data products. Lastly, the talk will outline what the future may hold for object storage at the Pawsey Supercomputing Centre and other national HPC centres, drawing on the speaker's more than twenty years of experience in large-scale data storage. (A minimal object-store access sketch follows the agenda.)
    • Correlating File Access Patterns and I/O Metrics of DL Applications Run on HPC Systems - Sandra Mendez
      A critical and often challenging aspect of running DL applications in HPC systems is efficiently managing file I/O operations, which are essential for loading data and storing results. Optimal handling of I/O is crucial, particularly when dealing with datasets containing thousands of samples (e.g., thousands of small images) processed by the parallel filesystems of HPC systems. DL applications exhibit I/O patterns that can become bottlenecks, limiting their overall performance. Additionally, their scalability on HPC systems may be hindered by prevalent file formats in DL. To understand the I/O impact on DL application performance, we need to understand their I/O behavior in HPC systems. However, due to the complexity of the I/O software stack, it is necessary to provide a method for processing and depicting the performance metrics and patterns in a structured way that can guide users in analyzing the I/O performance. In this context, we will present a method that focuses on analyzing temporal and spatial I/O patterns for different file formats and their distribution on the parallel filesystem, in order to correlate them with the obtained I/O metrics. This allows us to understand the I/O behavior on the HPC system, identifying and selecting data access and distribution patterns that minimize the impact of these patterns on the application's I/O performance. (A small pattern-classification sketch follows the agenda.)
    • Update on DAOS - Adrian Jackson
    • Dynamic Provisioning of Parallel File System Instances - A Path Forward for HPC? - Paul Nowoczynski (Niova)
      Intentionally or not, Cloud Service Providers (CSPs) have shown a compelling way forward which could alleviate long-standing problems faced by users and maintainers of large parallel file systems. Today, systems such as Amazon FSx for Lustre and Azure Managed Lustre, both of which are dynamically provisioned, provide solutions which have seemingly been outside the reach of, or ignored by, HPC centers. Perhaps the most obvious benefits are workload isolation / performance predictability and a reduction of the "blast radius", which limits the reach of system outages. However, the CSP deployment methods have other important advantages, such as bounding namespace growth, the ability to charge and track usage, enforceable rate limiting, and many others. How are CSPs able to do this? CSP file services such as FSx for Lustre colocate specialized block devices with the PFS service processes on a VM / hypervisor. From there, a management layer is introduced which handles the high availability of the PFS service processes; this layer is somewhat simplified because the specialized block devices are fault-tolerant, highly available, and network-addressable. While "block" has been a dirty word in storage circles for many years, we should be honest about the current pain points and recognize that CSP block services are providing means, today, for dealing with many long-standing issues. Innovation beyond the CSPs? If such a block layer were available to HPC providers, how could it promote further research and innovation? One compelling area is adaptable and scalable data movement which places hot data nearest to the application and cold data furthest away, with locality agnostic to the upper file-system layer. Another would be enabling new parallel I/O systems to come into production more quickly, because they would not be required to implement their own fault-tolerance and recovery schemes; these would be handled by the distributed block layer. Abstracting fault tolerance and recovery from the parallel I/O / file system implementation would greatly reduce the complexity required to implement reliable systems.
  • 16:00 Break
  • 16:30
    • Analyzing ML workflows using Vampir - Zoya Masih
    • TBD - Sebastian Krey
  • 17:30 Discussion
  • 18:00 End
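
The metric proxy talk above (11:30 session) mentions using FTIO to detect periodic I/O workload patterns. As a rough illustration of that kind of frequency-based analysis, the following Python sketch finds the dominant period in a sampled bandwidth trace with a discrete Fourier transform. It is a simplified stand-in, not the FTIO or metric proxy implementation, and the trace is synthetic.

```python
"""Illustrative sketch only: detect a periodic I/O pattern in a sampled
bandwidth trace via a discrete Fourier transform.  This is a simplified
stand-in for the frequency-based analysis performed by tools such as FTIO,
not the FTIO or metric proxy implementation."""
import numpy as np

def dominant_period(bandwidth, sample_interval_s):
    """Return the strongest period (in seconds) of a sampled I/O bandwidth trace."""
    signal = bandwidth - bandwidth.mean()        # remove the constant (DC) component
    spectrum = np.abs(np.fft.rfft(signal))       # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=sample_interval_s)
    peak = np.argmax(spectrum[1:]) + 1           # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Synthetic trace: checkpoint-like write bursts roughly every 60 s,
# sampled once per second for ten minutes.
t = np.arange(0, 600, 1.0)
trace = 5.0 + 40.0 * (np.sin(2 * np.pi * t / 60.0) > 0.95)
print(f"Dominant I/O period: {dominant_period(trace, 1.0):.1f} s")
```

For this synthetic checkpoint-like workload, the reported dominant period is about 60 s.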
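
The Pawsey talk above (14:00 session) discusses object storage for HPC workflows. Purely as an illustration, the sketch below shows how a workflow might stage results to an S3-compatible object store using boto3; the endpoint URL, bucket name, and credential variables are placeholders and do not describe Pawsey's actual service.

```python
"""Hypothetical sketch: staging HPC workflow output to an S3-compatible
object store with boto3.  Endpoint, bucket, keys, and file names are
placeholders, not a real data-centre configuration."""
import os
import boto3

# Credentials come from the environment; the endpoint points at an
# S3-compatible service (for example a Ceph Object Gateway), not AWS itself.
s3 = boto3.client(
    "s3",
    endpoint_url="https://object-store.example.org",
    aws_access_key_id=os.environ["S3_ACCESS_KEY"],
    aws_secret_access_key=os.environ["S3_SECRET_KEY"],
)

bucket = "radio-astronomy-project"  # placeholder bucket name

# Upload one output file produced by a compute job.
s3.upload_file("results/image_cube_0001.fits", bucket,
               "observation-2024-05/image_cube_0001.fits")

# List what has been staged so far under this observation's prefix.
response = s3.list_objects_v2(Bucket=bucket, Prefix="observation-2024-05/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```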
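
The talk on DL file access patterns above (14:00 session) describes correlating temporal and spatial access patterns with I/O metrics. The following sketch illustrates one small building block of such an analysis: classifying per-file accesses as sequential or random from a simple (file, offset, size) record list. The trace format and the 80% threshold are assumptions for illustration; a real analysis would typically start from tool output such as Darshan logs.

```python
"""Illustrative sketch only: classify per-file access patterns from a
hypothetical I/O trace of (filename, offset, size) records."""
from collections import defaultdict

def classify(trace):
    """Label each file's accesses as 'sequential' or 'random'."""
    per_file = defaultdict(list)
    for name, offset, size in trace:
        per_file[name].append((offset, size))

    labels = {}
    for name, accesses in per_file.items():
        # Count accesses that start exactly where the previous one ended.
        sequential = sum(
            1 for (o1, s1), (o2, _) in zip(accesses, accesses[1:]) if o2 == o1 + s1
        )
        ratio = sequential / max(len(accesses) - 1, 1)
        labels[name] = "sequential" if ratio > 0.8 else "random"
    return labels

# Hypothetical trace: many single-shot reads of small images plus one
# large HDF5 file read in contiguous 1 MiB chunks.
trace = [("img_%04d.png" % i, 0, 64 * 1024) for i in range(4)]
trace += [("dataset.h5", off, 1 << 20) for off in range(0, 8 << 20, 1 << 20)]
print(classify(trace))
```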

Bios

Sandra Mendez holds a PhD in High-Performance Computing (HPC) from the Universitat Autònoma de Barcelona (UAB, Spain), with research interests in high-performance I/O systems. Since 2019, she has been a Research Associate at the Barcelona Supercomputing Center. Her areas of expertise include performance analysis, the evaluation and assessment of monitored I/O patterns, diagnosing bottlenecks in HPC infrastructures, and the development of strategies for optimizing applications. Furthermore, she is an external researcher in the HPC4EAS research group and collaborates as an adviser in the PhD Programme in Computer Science in the area of HPC at the UAB.

Dr. Ahmad Tarraf is a senior researcher in the field of computational science, currently working as a postdoc at the Technical University of Darmstadt. Specializing in performance prediction and optimization of High-Performance Computing (HPC) applications within the Laboratory for Parallel Programming, Dr. Tarraf's academic journey began with a B.Sc. in Mechatronics Engineering from RHU Lebanon in 2013, followed by an M.Sc. in Mechatronics Engineering at the Technical University of Darmstadt in 2016. He then worked as a research assistant at the Institute of Computer Science at the University of Frankfurt in 2017, where he explored the formal abstraction and verification of analog mixed-signal circuits. This pursuit culminated in his doctoral degree (Dr. rer. nat.) in Computer Science from the University of Frankfurt in early 2021, awarded with distinction (magna cum laude). Dr. Tarraf's research interests cover a wide range of areas, including high-performance computing, efficient file and storage systems, behavioral modeling, machine learning, formal verification, analog mixed-signal design, cybernetics, and robotics. He is also involved in national research projects and EuroHPC projects such as ADMIRE and DEEP-SEA.

Programme Committee

  • Thomas Bönisch (HLRS)
  • Matthew Curry (Sandia National Laboratories)
  • Sandro Fiore (University of Trento)
  • Javier Garcia Blas (Carlos III University)
  • Adrian Jackson (The University of Edinburgh)
  • George S. Markomanolis (AMD)
  • Sandra Mendez (Barcelona Supercomputing Center (BSC))
  • Feiyi Wang (Oak Ridge National Laboratory)

Participation

The workshop is integrated into ISC-HPC. We welcome everybody to join the workshop, including:

  • I/O experts from data centres and industry.
  • Researchers/Engineers working on high-performance I/O for data centres.
  • Domain scientists and computer scientists interested in discussing I/O issues.
  • Vendors are also welcome, but their presentations must align with data centre topics (e.g., how they manage their own clusters) and not focus on commercial aspects.

The call for papers and talks is already open. We accept early submissions and typically process them within 45 days. We particularly encourage the early submission of abstracts to indicate your interest in submitting.

You may be interested in joining our mailing list at the Virtual Institute for I/O.

We especially welcome participants who are willing to give a presentation about the I/O of their institution's data centre. Note that such presentations should cover the topics mentioned below.

Call for Papers/Contributions (CfP)

Track: Research Papers

The research track accepts papers covering state-of-the-practice and research dedicated to storage in the data centre.

Proceedings will appear in ISC's post-conference workshop proceedings in Springer's LNCS. Extended versions have a chance of acceptance in the first issue of the JHPS journal. We will apply the more restrictive review criteria from JHPS and use the open workflow of the JHPS journal for managing the proceedings. For interaction, we will rely on EasyChair, so please submit the metadata to EasyChair before the deadline.

For the workshop, we accept papers of up to 12 pages (excluding references) in LNCS format. You may already submit an extended version suitable for JHPS in the JHPS format. Upon submission, please indicate potential sections for the extended version (set a light red background colour for them). There are two JHPS templates, a LaTeX and a Word template. The JHPS template can easily be converted to the LNCS Word format, so the effort for authors to obtain both publications is minimal. See Springer's Manuscript Preparation, Layout & Templates guidelines.

For accepted papers, the length of the talk during the workshop depends on the controversiality and novelty of the approach (the length is decided based on the preference provided by the authors and feedback from the reviewers). All relevant work in the area of data centre storage will be published in our joint workshop proceedings; we simply believe the available time is best used to discuss controversial topics.

The relevant topics for papers cover all aspects of data centre I/O, including:

  • Application workflows
  • User productivity and costs
  • Performance monitoring
  • Dealing with heterogeneous storage
  • Data management aspects
  • Archiving and long term data management
  • State-of-the-practice (e.g., using or optimising a storage system for data centre workloads)
  • Research that tackles data centre I/O challenges
  • Cloud/Edge storage aspects
  • Application of AI methods in storage

Deadlines

  • 2024-03-01: Submission deadline: AoE 1)
    • Note: The call for papers and talks is already open.
    • We appreciate early submissions of abstracts and full papers and review them within 24 days.
  • 2024-03-29: Author notification
  • 2024-04-30: Pre-final submission for ISC (Papers to be shared during the workshop. We will also use the JHPS papers, if available.)
  • 2024-05-16: Workshop
  • 2024-06-15: Camera-ready papers for ISC 2) – As they are needed for ISC's post-conference workshop proceedings. We embrace the opportunity for authors to improve their papers based on the feedback received during the workshop.
  • 2024-08-24: Camera-ready papers for the extended JHPS paper (It depends on the author's ability to incorporate feedback into their submission in the incubator.)

The main acceptance criterion is the relevance of the approach to be presented, i.e., the core idea is novel and worth discussing in the community. Considering that the camera-ready version of the papers is due after the workshop, we pursue two rounds of reviews:

  1. Acceptance for the workshop (as a talk).
  2. Acceptance as a paper *after* the workshop, incorporating feedback from the workshop.

After the first review, all papers undergo a shepherding process.

The criteria for The Journal of High-Performance Storage are described on its webpage.

Track: Talks from I/O Experts

The topics of interest in this track include, but are not limited to:

  • A description of the operational aspects of your data centre
  • A particular solution for specific data centre workloads in production

We also accept industry talks, provided that they focus on operational issues in data centres and omit marketing.

We use EasyChair to manage the interaction with the programme committee. If you are interested in participating, please submit a short (half-page) abstract of your intended talk together with a brief bio.

  • Submission deadline: 2024-04-12 AoE
  • Author notification: 2024-04-26

Where possible, the following items should be integrated into a talk covering your data centre. We hope your site's administrators will support you in gathering the information with little effort.

  1. Workload characterisation
    1. Scientific Workflow (give a short introduction)
      1. A typical use-case (if multiple are known, feel free to present more)
      2. Involved number of files/amount of data
    2. Job mix
      1. Node utilisation (relative to peak performance)
  2. System view
    1. Architecture
      1. Schema of the client/server infrastructure
        1. Capacities (Tape, Disk, etc.)
      2. Potential peak performance of the storage
        1. Theoretical
        2. Optional: Performance results of acceptance tests.
      3. Software/Middleware used, e.g. NetCDF 4.X, HDF5, …
    2. Monitoring infrastructure
      1. Tools and systems used to gather and analyse utilisation
    3. Actual observed performance in production
      1. Throughput graphs of the storage (e.g., from Ganglia)
      2. Metadata throughput (Ops/s)
    4. Files on the storage
      1. Number of files (if possible, per file type)
      2. Distribution of file sizes
  3. Issues/Obstacles
    1. Hardware
    2. Software
    3. Pain points (what is seen as the most significant problem(s) and suggested solutions, if known)
  4. Conducted R&D (that aims to mitigate issues)
    1. Future perspective
    2. Known or projected future workload characterisation
    3. Scheduled hardware upgrades and new capabilities we should focus on exploiting as a community
    4. Ideal system characteristics and how they address current problems or challenges
    5. What hardware should be added
    6. What software should be developed to make things work better (capabilities perspective)
    7. Items requiring discussion

Track: Student Mentoring Sessions

To foster the next generation of data-related practitioners and researchers, students are encouraged to submit an abstract following the expert-talk guidelines above, as long as their research aligns with these topics. At the workshop, each student will be given 10 minutes to present what they are working on, followed by 10-15 minutes of conversation with the community about how to further the work, what its impact could be, alternative research directions, and other topics that help the students progress in their studies. We encourage students to work with a shepherd towards a JHPS paper illustrating their research, based on the feedback obtained during the workshop.


1)
Anywhere on Earth
2)
tentative