===== High-Performance Storage =====
  
High-performance storage provides the hardware and software technology to query and store the largest data volumes at a high velocity of input/output while ensuring data consistency.
On Exascale systems -- i.e., systems processing 10^18 floating-point operations per second -- workflows will harness 100,000 processors with millions of threads while producing or reading Petabytes to Exabytes of data.
  
Traditionally, {{ :research:layers.png?200|Typical HPC I/O Stack}}
a parallel application uses an I/O middleware such as NetCDF or HDF5 to access data. The middleware provides a data access and manipulation API for higher-level objects such as variables. Naturally, these interfaces provide operations for accessing and manipulating data that are tailored to the needs of users. Historically, such middleware is also responsible for the conversion of the user data into the file, which is just a byte-array. Therefore, it uses a widely available file system interface like POSIX
or MPI-IO. In the last decade, data centers realized that existing I/O middleware is unable to exploit the parallel file systems deployed in HPC systems for various reasons. As a consequence, data centers started to develop new middleware such as PNetCDF, SIONlib, GLEAN, ADIOS, PLFS, and dataClay. Several of these interfaces are now used in applications that run on large-scale machines.
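
To make this stack concrete, the following is a minimal sketch of how a parallel application might write a single variable through HDF5, which in turn drives MPI-IO against the parallel file system. It assumes a parallel HDF5 build (e.g., compiled with ''h5pcc''); the file name, dataset name, and extents are illustrative only.

<code c>
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Tell HDF5 to use the MPI-IO virtual file driver underneath. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* A 2D variable: one row of 1024 doubles per process. */
    hsize_t dims[2] = { (hsize_t)size, 1024 };
    hid_t filespace = H5Screate_simple(2, dims, NULL);
    hid_t dset = H5Dcreate2(file, "temperature", H5T_NATIVE_DOUBLE,
                            filespace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Each rank selects its own row of the dataset. */
    hsize_t start[2] = { (hsize_t)rank, 0 }, count[2] = { 1, 1024 };
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t memspace = H5Screate_simple(2, count, NULL);

    double data[1024];
    for (int i = 0; i < 1024; i++)
        data[i] = rank + i * 0.001;

    /* Collective write: the middleware maps the variable onto the
       byte-array of the file and issues MPI-IO calls underneath. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, data);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
    MPI_Finalize();
    return 0;
}
</code>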
  
Recent advances in new storage technologies, like in-memory storage and non-volatile memory, promise to bring high-capacity, non-volatile storage with performance characteristics (latency/bandwidth/energy consumption) that bridge the gap between DDR memory and SSD/HDD.
However, they require careful integration into the existing I/O stack or demand the development of next-generation storage systems.
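
As one illustration of how such devices can be integrated into the existing I/O stack, persistent memory exposed through a DAX-mounted file system can be mapped into the address space and accessed with plain loads and stores rather than block-based I/O. The sketch below uses only standard POSIX calls; the mount point ''/mnt/pmem'' is an assumed example.

<code c>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE (1 << 20)  /* 1 MiB region */

int main(void) {
    /* Assumed path on a DAX-capable persistent-memory file system. */
    int fd = open("/mnt/pmem/region", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, REGION_SIZE) != 0) { perror("ftruncate"); return 1; }

    /* Map the region so that stores reach the medium directly,
       bypassing the kernel's block-based read()/write() path. */
    char *buf = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(buf, "checkpoint record");   /* a plain store, no system call */
    msync(buf, REGION_SIZE, MS_SYNC);   /* force the data to be durable */

    munmap(buf, REGION_SIZE);
    close(fd);
    return 0;
}
</code>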
  
  
**Future high-performance storage systems** will need to have internal management systems and interfaces which provide capabilities far beyond those currently possible. In particular, they need to support fine-grained data access prioritization and adaptively improved performance using internal replication and revised data layouts, all with acceptable resiliency. This must all be achieved in the presence of millions of simultaneous threads, not all under a single application’s control, and all doing I/O. Where multiple tiers are present, data replication and migration should optimally adapt on the fly to the requirements of
individual workflows and the overall system load. All this must be achieved with system interoperability and standardized application programming interfaces (APIs). Additionally, data centers are seeing challenges supporting mixed workflows of HPC and data analytics; the general consensus is that this needs to change and requires new methods and thinking about how to access storage, describe data, and manage workflows.
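
The closest widely deployed mechanism for conveying such requirements today is the hint interface of MPI-IO, sketched below. The hint keys follow ROMIO conventions for Lustre-like file systems, and an implementation is free to ignore them, so this is an illustration of the idea rather than a prescription; future systems would need far richer, standardized interfaces.

<code c>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Hints describe the intended access pattern to the storage layer. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "8");     /* stripe across 8 servers */
    MPI_Info_set(info, "romio_cb_write", "enable"); /* collective buffering */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "data.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Each rank writes one double at its own offset. */
    double value = (double)rank;
    MPI_File_write_at(fh, (MPI_Offset)(rank * sizeof(double)),
                      &value, 1, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
</code>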
  
The efficient, convenient, and robust data management and execution of data-driven workflows are key for productivity in computer-aided RD&E, particularly for data-intensive research such as climate/weather with complex processing workflows. Still, the storage stack is based on low-level I/O that requires complex manual tuning.
One key benefit of these systems is the exploitation of heterogeneous storage and compute infrastructures by scheduling user workloads efficiently across a system topology -- a concept called Liquid Computing.
These systems can improve the data handling over time without user intervention and lead towards an era with smart system infrastructure.
They bear the opportunity to become the core I/O infrastructure in scientific computing but also enable us to host big-data tools like Spark in an efficient manner.
  
**We believe -- intelligent storage systems are the solution.**