  * 13:30 **Fighting the Data Deluge with Data-Centric Middleware** -- //Julian Kunkel// \\ //The Exabyte of storage occupied by computational simulations will be reached long before Exaflop systems are built. Motivated by workflows in climate and weather, the ESiWACE project is developing the Earth System Data Middleware, which focuses on optimizing performance throughout the heterogeneous storage landscape. The talk concludes by discussing the need for community development of standards that will lead to next-generation interfaces enabling data-centric processing.// \\ {{ :research:talks:2019:2019-06-14-fighting_the_data_deluge_with_data_centric_middleware.pdf |Slides}}
  * 14:00 **The CERN Tape Archive: Preparing for the Exabyte Storage Era** -- //Michael Davis// \\ {{ :events:2019:pasc-minisymposium-davis.pdf |Slides}} \\ //The High Energy Physics experiments at CERN generate a deluge of data which must be efficiently archived for later retrieval and analysis. During the first two Runs of the LHC (2009-2018), over 250 PB of physics data was collected and archived to tape. CERN is facing two main challenges for archival storage over the next decade. First, the rate of data taking and the total volume of data will increase dramatically due to improvements in the luminosity and availability of the LHC and upgrades to the detectors and data acquisition system. Data archival is expected to reach 150 PB/year during Run 3 (2021-2023), increasing to 400 PB/year during Run 4 (2025-). The integrated total data on tape will exceed one Exabyte within a few years from now. Second, constraints in available computing power and disk capacity will change the way in which archival storage is used by the experiments. This presentation will describe these challenges and outline the preparations the CERN IT Storage Group is making for the Exabyte storage era.// \\ (A rough projection of these archive volumes is sketched below the programme.)
  * 14:30 **The Met Office Cold Storage Future: Tape or Cloud?** -- //Richard Lawrence// \\ {{ :events:2019:pasc-exabyte-lawrence.pdf |Slides}} \\ //The Met Office hosts one of the largest environmental science archives in the world, using tape as the primary storage mechanism. The archive holds over 275 petabytes of data today and is expected to grow beyond 5 exabytes during the next decade. Combined with data ingress and egress rates that each exceed 200 terabytes a day, the Met Office needs to ensure that the archive does not become the bottleneck for the production of our operational weather forecasts and for our research needs. We will examine the current archive system, look at its current pain points, and consider how the Met Office expects the needs of the archive to change in the short term. We then outline the UK government principle of ‘cloud first’ for digital designs and ask whether it can be applied to large-scale science IT. The talk will then consider how our current approach measures up to public cloud offerings, looking at the capability, risk, benefits and costs of a cloud-based archive.// \\ (The growth rate these figures imply is sketched below the programme.)
  * 15:00 **ECMWF's Extreme Data Challenges Towards an Exascale Weather Forecasting System** -- //Tiago Quintino, Simon Smart, James Hawkes, Baudouin Raoult// \\ //ECMWF's operational weather forecast generates massive I/O in short bursts, currently approaching 100 TiB per day written in two hour-long windows. From this output, millions of user-defined daily products are generated and disseminated to member states and commercial clients all over the world. As ECMWF aims to achieve Exascale NWP by 2025, we expect to handle around 1 PiB of model data per day and to generate hundreds of millions of daily products. This poses a strong challenge to a complex workflow that is already facing I/O bottlenecks. To help tackle this challenge, ECMWF is developing multiple solutions and changes to its workflows and incrementally bringing them into operations. For example, it has developed a high-performance distributed object store that manages the model output for its NWP and climate simulations, making data available via scientifically meaningful requests that integrate seamlessly with the rest of the operational workflow. We will present how ECMWF is leveraging this and other technologies to address current performance issues in our operations, while at the same time preparing for technology changes in the hardware and system landscape and the convergence between HPC and Cloud provisioning.// \\ (The burst bandwidth these figures imply is sketched below the programme.)
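
The sketches below are back-of-the-envelope illustrations of the figures quoted in the abstracts above; they are not taken from the talks themselves. The first one reads the CERN numbers: starting from the 250 PB already on tape and adding the quoted 150 PB/year (Run 3) and 400 PB/year (Run 4) rates shows how the archive passes one Exabyte. The year-by-year schedule and the assumption of constant rates within each run are simplifications of ours, not CERN's plan.

<code python>
# Illustrative projection of the CERN tape archive, using only the rates
# quoted in the abstract. The per-year schedule and the constant-rate
# assumption are simplifications for illustration.
already_on_tape_pb = 250                  # archived during Runs 1-2 (2009-2018)
rates_pb_per_year = {
    2021: 150, 2022: 150, 2023: 150,      # Run 3: ~150 PB/year
    2025: 400, 2026: 400,                 # Run 4: ~400 PB/year
}

total_pb = already_on_tape_pb
for year in sorted(rates_pb_per_year):
    total_pb += rates_pb_per_year[year]
    print(f"end of {year}: ~{total_pb} PB ({total_pb / 1000:.2f} EB)")
# Under these assumptions the cumulative archive crosses 1 EB (1000 PB) in the
# first year of Run 4, i.e. "within a few years", as the abstract states.
</code>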
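The Met Office numbers likewise imply a steep ramp in ingest. The sketch below assumes a ten-year horizon and simple linear averaging (both our assumptions) and compares the average net growth needed to reach 5 EB with the quoted 200 TB/day ingress of today.

<code python>
# Rough arithmetic behind the Met Office figures in the abstract; the
# ten-year horizon and linear averaging are assumptions made for illustration.
current_pb = 275        # archive holdings today (PB)
target_pb  = 5000       # 5 EB expected within the next decade
years      = 10         # assumed planning horizon

avg_growth_pb_per_year = (target_pb - current_pb) / years
avg_growth_tb_per_day  = avg_growth_pb_per_year * 1000 / 365

print(f"average net growth required: ~{avg_growth_pb_per_year:.0f} PB/year "
      f"(~{avg_growth_tb_per_day:.0f} TB/day)")
# ~470 PB/year, i.e. roughly 1.3 PB/day of net growth on average -- far above
# today's ~200 TB/day ingress, so ingest and archive rates must grow substantially.
</code>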
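Finally, the ECMWF burst pattern translates into a demanding sustained write bandwidth. The sketch below assumes the entire daily output is written within the two hour-long windows mentioned in the abstract; the helper function and the 2-hour total window are illustrative assumptions.

<code python>
# Sustained write bandwidth implied by the ECMWF abstract, assuming the whole
# daily output is written within the two hour-long bursts (2 h in total).
def burst_bandwidth_gib_per_s(daily_output_tib, window_hours=2):
    """Average bandwidth needed to absorb the daily output in the burst windows."""
    return daily_output_tib * 1024 / (window_hours * 3600)

print(f"today    (~100 TiB/day): ~{burst_bandwidth_gib_per_s(100):.0f} GiB/s")
print(f"exascale (~1 PiB/day):   ~{burst_bandwidth_gib_per_s(1024):.0f} GiB/s")
# Roughly 14 GiB/s today, rising towards ~145 GiB/s at the projected exascale
# output -- the scale of the I/O bottleneck the talk addresses.
</code>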