====== BoF: Analyzing Parallel I/O ======

===== Abstract =====

Parallel I/O performance can be a critical bottleneck for applications, yet users are often ill-equipped to identify and diagnose I/O performance issues. The increasingly complex hierarchies of storage hardware and software deployed on many systems only compound this problem. Tools that can effectively capture, analyze, and tune I/O behavior on these systems empower users to realize performance gains for many applications.

In this BoF, we form a community around best practices in analyzing parallel I/O and cover recent advances that help address the problem described above, drawing on the expertise of the users, I/O researchers, and administrators in attendance.

The primary objectives of this BoF are to 1) highlight recent advances in tools and techniques for monitoring I/O activity in data centers, 2) discuss experiences with and limitations of current approaches, and 3) discuss and derive a roadmap for future I/O tools with the goal of capturing, assessing, predicting, and optimizing I/O.

The BoF is held in conjunction with the [[http://sc22.supercomputing.org/|Supercomputing conference]].
The official announcement is listed [[https://sc22.supercomputing.org/?post_type=page&p=3479&id=bof110&sess=sess369|here]].

|| Date || 17 November 2022 ||
|| Time || 12:15pm - 1:15pm CST ||
|| Venue || [[https://sc22.supercomputing.org/?post_type=page&p=3479&id=bof110&sess=sess369|C146, see the SC schedule for details]] ||

The BoF is powered by the [[https://www.vi4io.org|Virtual Institute for I/O]] and [[http://www.esiwace.eu|ESiWACE]] ((ESiWACE has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No 823988)).
{{:events:2017:vi4io.png?200&nolink|}} \w {{:research:projects:esiwace-logo.png?300&nolink|}}

===== Organization =====

The BoF is organized by

  * Shane Snyder (ANL, USA), [[ssnyder@mcs.anl.gov]]
  * [[about:people:julian_kunkel|Julian Kunkel]] (Georg-August-Universität Göttingen/GWDG), [[julian.kunkel@gwdg.de]]

===== Agenda =====

The session consists of a series of 8-minute talks followed by a longer discussion:

  * **Welcome** -- //Shane Snyder, Julian Kunkel// \\ {{ :events:2022:sc22-analyzing-intro.pdf |Slides}}
  * **Detecting data races on relaxed systems using Recorder** -- //Chen Wang (LLNL)// \\ {{ :events:2022:sc22-analyzing-detecting_data_races_on_storage_systems_using_recorder.pdf |Slides}}
  * **Non-Intrusive Monitoring and I/O Classification with IOFS** -- //Christian Boehme (GWDG)// \\ {{ :events:2022:sc22-analyzing-iofs.pdf |Slides}}
  * **Monitoring with Vast** -- //Rob Mallory (VAST)// \\ [[https://youtu.be/TG1cNllzPxA|Video]]
  * **Visualizing I/O bottlenecks with DXT Explorer 2.0** -- //Jean-Luca Bez (LBL)// \\ {{ :events:2022:sc22-analyzing-dxt_explorer_2.0.pdf |Slides}} \\ DXT Explorer is an interactive web-based log analysis tool that visualizes Darshan DXT logs and aids in understanding the I/O behavior of scientific applications. In recent work, we have enriched DXT Explorer with novel visualizations aimed at detecting the root causes of performance bottlenecks. By detecting and highlighting I/O phases, stragglers, and unbalanced workloads, the tool can guide users in resolving I/O slowdowns when transferring data. It is open source and available at [[https://github.com/hpc-io/dxt-explorer|https://github.com/hpc-io/dxt-explorer]].
  * **Darshan I/O Runtime Monitoring** -- //Ann Gentile (Sandia National Laboratories)// \\ Slides
  * **Panel and discussion**

===== Speakers (sorted by last name) =====