Workshop on Storage Challenges in the UK

The Special Interest Group of High-Performance I/O in the UK (SIG IO UK – Join our mailing list) is organising this workshop to bring together users of data-intensive workloads, storage vendors, and system and middleware developers. During the workshop, participants should be able to identify common challenges, establish strategies for research and funding, and support the development of products.

The workshop series generally covers all aspects of data access and management, including I/O workflow handling, parallel file systems, middleware, tuning, performance monitoring, novel interfaces, storage technology, and data centre perspectives. The workshop is primarily organised as a series of talks and group discussion slots.

This year's workshop focuses on exchanging information about challenges and the ongoing efforts to overcome them, aiming to advance RD&E and the use of storage systems (in the UK). Users present the I/O challenges they face and the ongoing RD&E to overcome them. Vendor talks focus on technical solutions for specific challenges, accompanied by use cases demonstrating the benefit.

Registration deadline: February 20th, 2020
Date: Thursday, 23rd April 2020
Venue: Virtual meeting using BlackBoard Collaborate
Contact: Dr. Julian Kunkel

This workshop is sponsored by DDN and powered by the Virtual Institute for I/O and ESiWACE 1).

The SIG-IO-UK is a loose consortium of people and institutions that share a common interest in High-Performance I/O. The SIG exchanges information on topics including, but not limited to, I/O challenges, RD&E, and solutions on the market. We develop whitepapers that raise awareness of the importance of I/O and describe the state of the art in the UK, and we aim to derive research and funding opportunities as well as aid product development.

SIG IO UK – Join our mailing list

The workshop is held as a virtual event and we will record the presentations.

The videos are available in the YouTube playlist.

  • 10:00 Welcome – Julian Kunkel
    Slides
  • 10:05 Toward Next Generation Interfaces for Exploiting Workflows – Julian Kunkel
    The efficient, convenient, and robust execution of data-driven workflows is key for productivity in scientific computing. However, managing IO workflows efficiently in data centers is challenging for users. To achieve the best IO performance, a user must consider the characteristics of the various available storage systems and file systems and manually map the IO of their workflows to those storage systems. This talk introduces a wider data center vision for the handling of workflows and then relates the proposed abstraction to the typical IO stack in the domain of climate and weather. Within ESiWACE, we move toward this vision by integrating capabilities into the software stack that increase the abstraction level of IO within workflows and enable the software stack to optimize data placement and handle the information lifecycle automatically. We then show our design for the ESiWACE prototype, which utilizes software components such as Cylc, SLURM, ESDM, and XIOS. Next, we discuss how such an approach could be integrated into existing workflows. Finally, the Next-Generation Interfaces effort is introduced.
    Slides
  • 10:30 Persistent memory for I/O – Adrian Jackson
    This talk presents the work done during the EU-funded NEXTGenIO project, looking at enabling exascale I/O for computational simulation, data analytics, and machine learning. We focus on non-volatile, in-node memory, how it can be used, and the performance benefits and usage pitfalls that come with it.
    Slides
  • 11:00 Sage2: Addressing Exascale Storage Challenges – Sai Narasimhamurthy
    Slides
  • 11:30 Interoperability as the new frontier – Jean-Thomas Acquaviva
    As illustrated by the IO500 benchmark, HPC storage systems have considerably pushed the boundaries of performance in recent years, not only from a quantitative standpoint, with unprecedented throughput and capacity numbers, but also in qualitative terms, with the ability to deal with a much larger spectrum of workloads. Such progress, driven by new workloads such as AI, is only part of the answer. Nowadays, the performance equation has to integrate the complexity of the environment: storage systems are no longer used only as output for huge numerical simulations but are integrated within complex work and data flows. Therefore, the new frontier is interoperability and the ability of a storage solution to be seamlessly integrated with other data hubs such as clouds, either private or public. Our talk will discuss the evolution of raw performance and will open up on the ongoing efforts for better integration within modern data flows. This is the price of success and the way HPC technologies will carve their way into mainstream environments.
    Slides
  • 12:00 The Benefits and Challenges of Elastic In Situ Analysis and Visualization – Matthieu Dorier
    Slides - Paper
  • 12:30 Virtual lunch break
  • 13:00 Storage at DLS - One Size fits no one – Frederik Ferner
    Slides
  • 13:30 ECMWF’s Exascale IO challenges — From inside the HPC to a whole Data archive migration – Tiago Quintino
    Slides
  • 14:00 Semantic storage of climate data on object store – Neil Massey
    Slides
  • 14:20 Exploiting Heterogeneous Resource Utilization for Scientific Workflows – Erdem Yilmaz
    Slides
  • 14:40 Discussion: Supporting the I/O needs for applications (specifically in the UK)
  • 15:00 Virtual coffee break / discussion
  • 15:30 Memory vs. Storage Software and Hardware: The Shifting Landscape – Jay Lofstead
    New memory technologies, such as the persistent memory modules supported by the latest Intel chips, offer a persistent storage device accessible on the memory bus. NVMe and other high-performance storage devices offer extreme performance at increasingly affordable prices. With these technology shifts, the software we use for different tasks and what we can afford to do are changing. This talk will explore how things are changing and where new opportunities to enable science exist.
    Slides
  • 16:00 Application IO analysis with Lustre Monitoring using LASSi for ARCHER – Karthee Sivalingam
    Slides
  • 16:30 Classifying Temporal Characteristics of Job I/O Patterns Using Machine Learning Techniques – Eugen Betke
    Slides
  • 17:00 Discussion: Hot topics
    Results
  • 17:30 Farewell

Send an email to Julian Kunkel for registration. Check that you get a confirmation.

Speakers must check the University of Reading - External Speaker Code of Conduct and comply with these rules.

1) ESiWACE2 is funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 823988.