BoF: Data-Centric Computing for the Next Generation

The efficient, convenient, and robust execution of data-driven workflows and enhanced data management are key to productivity in scientific computing and computer-aided RD&E. Big data tools integrate compute and storage capabilities into a holistic solution, demonstrating the benefit of tight integration, while the HPC community still optimizes the compute and storage components independently of each other and, moreover, independently of the needs of the end-to-end user workflows that ultimately lead to insight. Even within a single data center, using homogeneous storage and compute infrastructure efficiently is a complex task for experts. The efficient management of data and compute capabilities in a heterogeneous environment, however, remains an unresolved question, as the execution of individual tasks from workflows may benefit from alternative hardware architectures and infrastructures.

In this BoF, we bring the community together to discuss visions for a data-centric compute environment of the future that gives the fastest time to insight by applying concepts such as smart scheduling and compiler technology, which, e.g., minimize data movement for the entire workflow and exploit the capabilities of heterogeneous environments that stretch beyond a single data center. As this has implications for data-center planning and for the hardware/software infrastructure, from a higher-level workflow formulation down to smarter hardware and software layers, it affects the wider HPC community. We have gathered a range of stakeholders from industry and academia interested in this approach, with the ultimate goal of establishing a forum that addresses the need for Next Generation Interfaces and that defines and realizes a vision that will impact the next generations of scientists.

Date: Wednesday, June 19th, 2019, 9:30 - 10:30
Venue: Room Konstant, Frankfurt, Germany
Contact: Dr. Julian Kunkel

This BoF is powered by the Virtual Institute for I/O and ESiWACE 1).

Please see the official announcement.

The BoF is organized by

Description

First, several speakers from industry and academia prime the audience with visions for the future of HPC and a proposal to establish the new forum. Each short presentation is followed by a brief Q&A and by surveys that assess agreement or disagreement with individual aspects and rank the importance of certain features and requirements. We will explicitly invite the attendees to share their views on these matters and the approaches they chose to overcome some of the issues in their working environment.

The BoF topic is very inclusive, addressing vendors, solution architects, and scientists dealing with computing or storage in workflows, applications, or administrative domains such as data centers. In particular, we are interested in an audience that critically reflects on the current state of practice and is willing to share its thoughts.

Agenda

  • Welcome - Julian Kunkel (University of Reading)
  • High-Level Workflows – Potential for Innovation? Peeking at the current IO stack. - Julian Kunkel (University of Reading)
  • Changing Your Archive From a Black Hole to a Gold Mine - Jay Lofstead (Sandia National Laboratories)
    Archives have been a crucial part of long-term data storage, ensuring that future users can refer back to previous work to check the integrity of new results or to deal with compliance and legal requirements. However, making these archives more useful than write-once, read-never systems has been challenging. Tape, when compared to other storage media, has high latency, making any interactive exploration painful. POSIX-style attributes and extended attributes can only offer so much additional information. We are exploring another layer on top of the archive that can make archive item selection more efficient and effective, turning your data black hole into a gold mine.
  • Approaches to Programming Extremely Heterogeneous Memory Systems - Jeffrey Vetter (ORNL)
  • The goldilocks node: getting the RAM just right - Julian Kunkel (on behalf of the collaboration)
  • NGI initiative: toward a bridge in the semantic gap - Jean-Thomas Acquaviva (DDN)
    Most of the recent innovations in the storage area have been driven by the emergence of the cloud. This has led to a race toward genericity and ultimately to the now ubiquitous object interface. While this has been highly beneficial for the community, the absence of semantics is a drawback for the HPC ecosystem. The NGI initiative is an effort coming from the weather forecast community that aims to bring *more* semantics into the data format. We advocate bringing applications closer to the data; to pave the way for in situ computing, data formats should be richer and more specific.
  • The community can make the difference - Julian Kunkel (University of Reading)
  • Discussion
1) ESiWACE has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No 675191.