  
The agenda is currently in preparation. We have a series of (10-minute) talks followed by a longer discussion:
  
  * **Introduction** -- //Shane Snyder// \\ {{ :events:2019:sc19-analyzing-snyder.pdf |Slides}}
  * **What's new with Darshan** -- //Shane Snyder//
  * **HPC Storage as a Blank Canvas in Google Cloud** -- //Dean Hildebrand (Google)// \\ {{ :events:2019:sc19-analyzing-hildebrand.pdf |Slides}}
  * **Timeline-based I/O Behavior Assessment of Parallel Jobs** -- //Eugen Betke (DKRZ)// \\ {{ :events:2019:sc19-analyzing-betke.pdf |Slides}}
  * **Measuring I/O with TAU** -- //Kevin Huck (University of Oregon)// \\ {{ :events:2019:sc19-analyzing-tau.pdf |Slides}}
  * **State of IO profiling in Forge** -- //Florent Lebeau (ARM)// \\ {{ :events:2019:sc19-analyzing-forge.pdf |Slides}}
  * **Research community I/O patterns** -- //Gordon Gibb (EPCC)// \\ [[https://epcced.github.io/sc19-analyzing_io_bof-archer_io/SC19-VI4IO_SAFELASSi_Nov2019/|Slides]] \\ //We used a combination of Cray LASSi and EPCC SAFE to analyse the I/O profiles of different research communities, based on the I/O of all jobs run on the UK National Supercomputing Service, ARCHER, over a period of six months. The patterns reveal the different I/O requirements of the communities and will allow us to design better HPC services in the future.//
  * **Tracking User-Perceived I/O Slowdown via Probing** -- //Julian Kunkel// \\ {{ :events:2019:sc-bof19-analyzing-probing.pdf |Slides}}
  * **Panel and discussion** -- (moderated by Julian Kunkel)
  
===== Speakers (sorted by last name) =====
  
**Eugen Betke** completed his studies in computer science in 2015, specializing in machine learning and I/O performance; in his master's thesis he applied machine learning methods to predict I/O performance. At the beginning of 2016 he joined the German Climate Computing Center (DKRZ) as a researcher. His key areas are the analysis and optimization of HPC I/O; for example, he developed a cluster-wide monitoring system for Lustre on Mistral.
  
**Dr Gordon Gibb** is an Applications Consultant at EPCC, the University of Edinburgh. He obtained an MPhys in Astrophysics at the University of St Andrews, where he then went on to receive a PhD in Solar Physics, followed by several positions as a research software engineer. At EPCC, he is a member of the computer science and engineering team for ARCHER, the UK's national supercomputing service. His work has included general technical support for ARCHER users as well as optimisation and porting of codes to ARCHER, and he is the point of contact for one of the UK's high-end computing consortia.
**Dean Hildebrand** is a Technical Director of HPC and enterprise storage in the Office of the CTO (OCTO) at Google Cloud. He has authored over 100 scientific publications and patents and is currently focused on making HPC and enterprise storage first-class citizens in the cloud. He received a B.Sc. degree in computer science from the University of British Columbia in 1998, and M.S. and PhD degrees in computer science from the University of Michigan in 2003 and 2007, respectively.
  
**Kevin Huck** is Research Faculty and a Computer Scientist in the Oregon Advanced Computing Institute for Science and Society at the University of Oregon. He was awarded his PhD (2009) in Computer and Information Science from the University of Oregon, and previously worked in private industry and as a postdoc at the Barcelona Supercomputing Center (2009-2011). He works primarily in the area of large-scale parallel performance measurement, analysis, and visualization. He is the creator and primary developer of APEX (Autonomic Performance Environment for eXascale), a measurement and feedback-control infrastructure for asynchronous, user-level threading runtimes like OpenMP and HPX, and of PerfExplorer, a data mining framework for large-scale parallel performance analysis. His other research interests include application and workflow performance measurement, analysis, aggregation, and visualization, as well as lightweight measurement, dynamic runtime optimization, and feedback/control systems for asynchronous multitasking runtimes.

**Julian Kunkel** is a Lecturer at the Computer Science Department at the University of Reading. He manages several research projects revolving around High-Performance Computing and particularly high-performance storage. Besides his main goal of providing efficient and performance-portable I/O, his HPC-related interests are data reduction techniques, performance analysis of parallel applications and parallel I/O, management of cluster systems, cost-efficiency considerations, and software engineering of scientific software.

**Florent Lebeau** is a solution architect at Arm, providing customer training across the company's broad range of debugging, profiling, and optimization tools. Having worked in HPC for many years, he brings practical knowledge and experience of parallel programming and development tools; he joined the Arm HPC Tools team after working as an engineer for Allinea Software and at CAPS enterprise, where he developed profiling tools for HMPP Workbench and provided training on parallel technologies. Florent graduated from the University of Dundee with an MSc in Applied Computing.

**Shane Snyder** is a software engineer in the Mathematics and Computer Science Division of Argonne National Laboratory. He received his master's degree in computer engineering from Clemson University in 2013. His research interests primarily include the design of high-performance distributed storage systems and the characterization and analysis of I/O workloads on production HPC systems.