Using a collection of benchmark suites (UEABS, ARM-HPC, UoB-HPC), this project identifies HPC code patterns that are suitable for code optimization with the help of various profilers (perf, likwid, oprofile). The aim is to conduct a performance analysis of these benchmarks and to derive a classification of HPC code characteristics. In addition to the objectives and functionality of the benchmarks, their configurations, and their applications, the interpretation of the resulting data will be discussed. Furthermore, the benchmarks will be built with different compilers and run on different hardware, and their performance will be compared. Key topics and notes: working on high-performance computing infrastructure in a Linux environment; configuration and installation of the benchmarks; analysis of the benchmarks with various profilers and compilers on different hardware architectures; comparison of benchmark results across hardware.
This student project focuses on the arch project and its various mini-applications from the field of physics. In addition to a systematic comparison of the mini-apps, their performance is compared. Further details can be found in the section on key topics. Content focus and notes: configuration and installation of the arch benchmark; detailed description of arch and its implementation using one of the mini-apps (hot, flow, neutral, hal3d); running a mini-app (hot, flow, neutral, hal3d) on the GWDG cluster; detailed overview of the selected mini-apps of the arch project and a performance comparison of the mini-apps; analysis of the benchmarks/mini-apps with various profilers and compilers on different hardware architectures; performance comparison of the benchmark results across different hardware.
This student project focuses on the Jacobi benchmark and investigates its theoretical foundations as well as its performance characteristics on modern computing architectures. The introductory seminar is based on the paper “Performance Evaluation of Jacobi Iterative Solution for Sparse Linear Equation System on Multicore and Manycore Architectures.” It includes a detailed elaboration of the mathematical theory behind the Jacobi iterative method and a systematic comparison between multicore and manycore architectures. Furthermore, a performance analysis of the Jacobi implementation is conducted on the GWDG cluster. The advanced seminar shifts the focus to Sparse Matrix-Vector Multiplication (SpMV), following the paper “Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs).” In this context, suitable performance metrics are selected and justified, and the performance of SpMV on CPUs and GPUs is compared, highlighting the superior performance of GPUs as discussed in the literature. Additionally, different sparse matrix formats are implemented and evaluated on the GWDG cluster. The assignment paper is based on “Performance Analysis and Optimization of Sparse Matrix-Vector Multiplication on Modern Multi- and Many-Core Processors.” Here, the optimization tuning methodology is applied to improve SpMV performance on the GWDG cluster. Finally, the CSE research project builds on “Vectorizing Sparse Matrix Computations with Partially-Strided Codelets.” This part includes a detailed code pattern analysis of SpMV and the implementation of partially-strided codelets to enhance vectorization and overall performance on the GWDG cluster. Overall, the project combines theoretical analysis with practical performance evaluation and optimization of iterative solvers and sparse matrix computations on modern high-performance computing systems.
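The link between the Jacobi method and SpMV can be illustrated with a minimal sketch. It assumes the matrix is stored in the CSR (compressed sparse row) format, which is only one of the sparse formats the project may evaluate. The function names `csr_spmv` and `jacobi_csr` and the small test system are hypothetical and are not taken from the cited papers or from the benchmark code. Each Jacobi sweep computes x_i <- x_i + (b_i - (A x)_i) / a_ii, so its cost is dominated by one sparse matrix-vector product.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def jacobi_csr(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with Jacobi iteration; returns (x, iterations used)."""
    n = len(b)
    # Extract the diagonal once; Jacobi requires nonzero diagonal entries.
    diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    if any(d == 0.0 for d in diag):
        raise ValueError("Jacobi needs a nonzero entry on every diagonal")

    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        ax = csr_spmv(values, col_idx, row_ptr, x)
        # Equivalent to x_i = (b_i - sum_{j != i} a_ij x_j) / a_ii.
        x_new = [x[i] + (b[i] - ax[i]) / diag[i] for i in range(n)]
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new, iteration
        x = x_new
    return x, max_iter


# Hypothetical 3x3 diagonally dominant system with exact solution [1, 1, 1]:
#   [ 4 -1  0 ]       [ 3 ]
#   [-1  4 -1 ] x  =  [ 2 ]
#   [ 0 -1  4 ]       [ 3 ]
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [3.0, 2.0, 3.0]

solution, iterations = jacobi_csr(values, col_idx, row_ptr, b)
print(solution, iterations)
```

The inner loop of `csr_spmv` reads `x` through `col_idx`, an indirect and irregular access pattern. This is a common reason why SpMV tends to be memory-bound and hard to vectorize, which is the kind of behaviour the later parts of the project examine.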
This student project focuses on the SWE (Shallow Water Equations) benchmark and combines theoretical analysis with comprehensive performance evaluation. In the introductory seminar, the SWE benchmark code is executed on the GWDG cluster. A detailed description of the Shallow Water code is provided, including an in-depth analysis of its implementation details and a theoretical interpretation of the obtained performance results. The advanced seminar is based on the paper “Mesh Orientation and Cell Size Sensitivity in 2D SWE Solvers.” In this context, different 2D SWE solvers are executed and analyzed on the GWDG cluster, with particular attention to how mesh orientation and cell size influence numerical accuracy and computational performance. The assignment paper extends this work by systematically comparing the performance of different SWE approaches on the GWDG cluster. This includes comparisons between 1D and 2D implementations, iterative and parallel versions, and various solver strategies. Additional literature on massively parallel solver generation and finite element methods provides further theoretical background. Overall, the project integrates theoretical foundations of the shallow water equations with practical benchmarking and performance comparisons of multiple solver implementations in C++ and Python, emphasizing scalability and efficiency on modern high-performance computing systems.
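As a rough illustration of what a shallow water solver computes, the following sketch advances the 1D shallow water equations, h_t + (hu)_x = 0 and (hu)_t + (hu^2 + g h^2 / 2)_x = 0, with a Lax-Friedrichs scheme on a periodic grid. The scheme, the periodic boundary, the function names and the dam-break initial condition are all assumptions chosen for brevity; the SWE benchmark and the solvers in the cited literature may use different discretizations.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def flux(h, hu):
    """Physical flux of the 1D shallow water equations for one cell."""
    u = hu / h
    return hu, hu * u + 0.5 * G * h * h


def stable_dt(h, hu, dx, cfl=0.5):
    """Time step from the CFL condition dt <= cfl * dx / max(|u| + sqrt(g h))."""
    speed = max(abs(q / d) + math.sqrt(G * d) for d, q in zip(h, hu))
    return cfl * dx / speed


def lax_friedrichs_step(h, hu, dx, dt):
    """One Lax-Friedrichs update with periodic boundaries (assumed here)."""
    n = len(h)
    h_new = [0.0] * n
    hu_new = [0.0] * n
    c = dt / (2.0 * dx)
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        f_left = flux(h[left], hu[left])
        f_right = flux(h[right], hu[right])
        # Average of the neighbours minus the centred flux difference.
        h_new[i] = 0.5 * (h[left] + h[right]) - c * (f_right[0] - f_left[0])
        hu_new[i] = 0.5 * (hu[left] + hu[right]) - c * (f_right[1] - f_left[1])
    return h_new, hu_new


def simulate(h, hu, dx, steps):
    """Advance the state by a fixed number of CFL-limited time steps."""
    for _ in range(steps):
        dt = stable_dt(h, hu, dx)
        h, hu = lax_friedrichs_step(h, hu, dx, dt)
    return h, hu


# Hypothetical dam-break setup: deeper water on the left half, fluid at rest.
h0 = [2.0] * 25 + [1.0] * 25
hu0 = [0.0] * 50
h_end, hu_end = simulate(h0, hu0, dx=0.1, steps=20)
print(sum(h0), sum(h_end))
```

With periodic boundaries the flux differences telescope, so the total water volume `sum(h)` should be conserved up to rounding. Such invariants are a convenient sanity check before comparing the performance of different solver implementations.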