Theses

You can find a list of previously published theses here: Theses.

If you are interested in a thesis with us, feel free to browse through this list. For an overview of our groups' current topics, see below.

Unless stated otherwise, the theses offered below are intended as M.Sc. theses but can also be reduced in scope and offered as B.Sc. theses. We are also always open to your own ideas.

Benchmarking and Characterization of Workflow Execution in Heterogeneous HPC Systems

Accurate benchmarking is a prerequisite for meaningful workload mapping and scheduling research. This thesis focuses on designing and executing systematic benchmarks for heterogeneous HPC systems using workflow-based workloads. The student will characterize system properties (compute, memory, I/O, network) and workload behavior (task duration, data transfer, dependencies) using real and synthetic workflows. The outcome will be a reproducible benchmarking methodology and datasets that can be used as ground truth for evaluating optimization and AI-based schedulers.

Hybrid Scheduling: Combining Exact Solvers and Learning-Based Methods for HPC Workflows

Exact solvers provide optimal solutions but scale poorly, while learning-based methods scale well but lack guarantees. This thesis investigates hybrid scheduling strategies that combine MILP or CP-SAT solutions on small subproblems with learning-based generalization for larger workflows. The focus is on feasibility preservation and performance trade-offs.

Quantum-Inspired Optimization for Workflow Mapping in Heterogeneous HPC Systems

This thesis explores quantum-inspired optimization techniques, such as QUBO formulations, for workflow mapping problems in heterogeneous HPC environments. The student will translate classical scheduling constraints into QUBO models and compare solution feasibility and scalability against classical solvers.
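
As a flavor of what such a translation can look like, here is a minimal sketch (task and node names are hypothetical) that encodes the classical constraint "each task is mapped to exactly one node" as a QUBO penalty term P * (1 - sum_n x[t,n])^2, expanded into linear and quadratic coefficients:

  # Build QUBO coefficients for a one-hot assignment constraint.
  def one_hot_assignment_qubo(tasks, nodes, penalty=10.0):
      Q = {}  # maps ((task, node), (task, node)) -> coefficient
      for t in tasks:
          for i, n1 in enumerate(nodes):
              # x**2 == x for binary variables, so -2P*x + P*x**2 gives -P on the diagonal
              Q[((t, n1), (t, n1))] = Q.get(((t, n1), (t, n1)), 0.0) - penalty
              for n2 in nodes[i + 1:]:
                  # cross terms +2P penalize assigning the same task to two nodes
                  Q[((t, n1), (t, n2))] = Q.get(((t, n1), (t, n2)), 0.0) + 2 * penalty
      return Q

  Q = one_hot_assignment_qubo(tasks=["t0", "t1"], nodes=["cpu_node", "gpu_node"])

A real formulation would add further penalty terms for dependencies and capacities and pass Q to a QUBO solver or annealer; the constant offset of P per task is omitted since it does not affect the minimizer.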

Ethical and Responsible AI Considerations in Automated HPC Scheduling Systems

As AI-driven schedulers increasingly influence resource allocation decisions, ethical considerations such as fairness, transparency, and accountability become critical. This thesis examines ethical risks in automated HPC scheduling and proposes evaluation criteria or design guidelines for responsible scheduling systems, grounded in real HPC use cases.

Constraint-Based Workflow Scheduling Using MILP and CP-SAT: A Comparative Study

Constraint programming and mixed-integer linear programming are widely used for exact workflow scheduling but exhibit different scalability and modeling trade-offs. This thesis implements and compares MILP and CP-SAT formulations for workflow mapping and scheduling under heterogeneous resource constraints. The student will evaluate solution quality, feasibility guarantees, and solver performance across increasing problem sizes.
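
For orientation, a minimal CP-SAT sketch with Google OR-Tools is shown below (the two tasks, their durations, and the single shared resource are illustrative assumptions, not the thesis model):

  from ortools.sat.python import cp_model

  durations = {"A": 3, "B": 2}                # hypothetical task durations
  horizon = sum(durations.values())

  model = cp_model.CpModel()
  starts, ends, intervals = {}, {}, []
  for name, d in durations.items():
      starts[name] = model.NewIntVar(0, horizon, f"start_{name}")
      ends[name] = model.NewIntVar(0, horizon, f"end_{name}")
      intervals.append(model.NewIntervalVar(starts[name], d, ends[name], f"iv_{name}"))

  model.AddNoOverlap(intervals)               # both tasks share one resource
  model.Add(starts["B"] >= ends["A"])         # workflow dependency A -> B

  makespan = model.NewIntVar(0, horizon, "makespan")
  model.AddMaxEquality(makespan, list(ends.values()))
  model.Minimize(makespan)

  solver = cp_model.CpSolver()
  if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
      print({n: solver.Value(s) for n, s in starts.items()}, solver.Value(makespan))

An equivalent MILP formulation would replace the interval and no-overlap constructs with big-M ordering constraints, which is exactly the kind of modeling trade-off the comparison targets.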

Modeling System and Workload Characteristics for Workflow Scheduling in the HPC Compute Continuum

This project investigates how heterogeneous system resources and workflow characteristics can be modeled in a structured and extensible manner. The student will design data models for nodes, tasks, features, and performance properties, aligned with real HPC schedulers and workflow managers. The work emphasizes practical modeling choices that balance expressiveness and solvability, and results in machine-readable system and workload descriptions usable by optimization solvers.
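
A minimal sketch of what such machine-readable descriptions could look like (the field names are illustrative, not the final schema):

  from dataclasses import dataclass, field, asdict
  import json

  @dataclass
  class Node:
      name: str
      cpu_cores: int
      memory_gb: float
      accelerators: list[str] = field(default_factory=list)

  @dataclass
  class Task:
      name: str
      runtime_s: float                      # estimated duration
      memory_gb: float
      depends_on: list[str] = field(default_factory=list)

  system = [Node("gpu01", cpu_cores=64, memory_gb=512, accelerators=["A100"])]
  workflow = [Task("preprocess", 120, 8),
              Task("train", 3600, 64, depends_on=["preprocess"])]

  # Serialize so an optimization solver or scheduler can consume the description.
  print(json.dumps({"nodes": [asdict(n) for n in system],
                    "tasks": [asdict(t) for t in workflow]}, indent=2))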

Designing an Environmental Sustainability Labeling System for AI Services Based on Resource Usage

This thesis addresses the need for transparency in the energy and resource consumption of AI services by proposing a standardized environmental sustainability labeling system. The project analyzes the energy consumption and computational load of different AI tasks—such as classification, generation, and scheduling—and translates them into simple, interpretable labels similar to those used for home appliances (e.g., A++ to E). The student will collect runtime and resource usage data for AI models, evaluate their environmental impact, and propose a standardized method to present this information to users and developers.
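
To illustrate the basic idea, a tiny sketch of the label mapping is given below; the energy metric and the band thresholds are purely placeholder assumptions, since defining defensible bands is part of the thesis:

  LABEL_BANDS = [                 # (upper bound in kWh per 1000 requests, label)
      (0.1, "A++"), (0.5, "A"), (1.0, "B"), (5.0, "C"), (20.0, "D"),
  ]

  def energy_label(kwh_per_1000_requests: float) -> str:
      for upper, label in LABEL_BANDS:
          if kwh_per_1000_requests <= upper:
              return label
      return "E"

  print(energy_label(0.8))        # -> "B"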

Meta Machine Intelligence (MMI) for Error Detection in High-Performance Computing Systems

This project investigates how context-sensitive AI models can improve early fault detection in high-performance computing (HPC) environments. The objective is to implement an adaptive mechanism that selects among pre-trained machine learning models based on system state, workload behavior, and observed error patterns. The research includes defining relevant system contexts, integrating multiple detection models, and evaluating their effectiveness under different runtime conditions using real or simulated log datasets. The expected outcome is improved fault detection reliability while maintaining scalability across heterogeneous HPC architectures.

Multi-Model Job Scheduling for Mixed Computing Environments

This thesis focuses on designing a context-aware job scheduling system powered by multiple AI models for heterogeneous computing environments, including cloud, edge, and HPC systems. The scheduler dynamically selects the most suitable scheduling model based on job characteristics, resource availability, and historical performance data. The study involves developing an adaptive AI-based scheduler that responds to varying resource types and infrastructure constraints, with the goal of improving overall scheduling efficiency in complex, multi-layered computing systems.

Lightweight AI for Detecting Irregular Behavior in Device Logs

This thesis aims to develop a minimal and efficient AI-based anomaly detection system for identifying irregular behavior in log files generated by small-scale devices or sensors. The system is optimized for environments with limited memory and computational resources, such as embedded systems and low-power devices. Contextual indicators—such as temperature readings, timestamp frequency, and error patterns—are incorporated to improve detection accuracy and relevance. The resulting solution targets real-world monitoring scenarios in edge computing and IoT deployments.
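
As a starting point, a lightweight detector could look like the following sketch (the feature columns are illustrative assumptions):

  import numpy as np
  from sklearn.ensemble import IsolationForest

  # Each row: [log messages per minute, error count, mean temperature in deg C]
  X_train = np.array([[60, 0, 41.0], [58, 1, 40.5], [62, 0, 41.2], [59, 0, 40.8]])
  X_new   = np.array([[61, 0, 41.1], [5, 14, 78.0]])   # second window looks irregular

  model = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
  model.fit(X_train)
  print(model.predict(X_new))    # 1 = normal, -1 = anomaly

On very constrained devices, the thesis might instead evaluate even simpler statistical baselines or quantized models; the sketch only shows the feature-plus-detector pattern.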

Interactive Dashboard for Monitoring AI Performance in System Maintenance

This project involves the design and implementation of a web-based interactive dashboard for monitoring and visualizing AI behavior in system maintenance tasks. The dashboard presents information such as tool selection, confidence levels, warning predictions, and input variations over time in an interpretable and user-friendly manner. It supports AI models that can be configured with different data sources and toolsets, aiming to enhance transparency, trust, and usability in predictive maintenance systems operating in dynamic, multi-model environments.

How do students use AI in their studies?

Students, and especially computer science students, are among both the early adopters of generative artificial intelligence and its critics. Knowing which AI tools are used and how they are used is important for improving learning and teaching. The goal of this bachelor thesis is to collect and evaluate data on this topic for the computer and data science courses.

AI-assisted programming learning

AI is transforming education. AI chatbots are everywhere, but more useful patterns are only slowly emerging. In our CS Bachelor, we use the programming learning environment SmartBeans, which provides students with tasks and automatic feedback based on unit testing. This feedback, however, is limited in scope and usefulness. The goal of this thesis is to improve the learning experience by adding state-of-the-art AI methods that go beyond chat, addressing well-known factors in efficient learning, e.g. cognitive load. The focus can be either on the AI methods or on improving learning.

Exploring Quantum Computing Use Cases

In the quantum computing test center QUICS (Quantum Innovation and Computing for SME) we explore the potential of quantum computing for real-world use cases. The goal of this thesis is to characterize the (future) applicability of quantum computing for a problem defined by an industry partner or internally by the QUICS project itself. This involves characterizing the state-of-the-art approach to the problem in classical computing, defining a suitable benchmark for comparisons, selecting a quantum computing approach, and defining and implementing a simplified proof of concept (PoC) that can be used on current-day quantum computers. Ideally, the thesis will use the PoC to discuss the requirements for future quantum advantage. The thesis will be conducted within the QUICS team and, when applicable, in collaboration with an industry partner.

Comparison of Distributed Computing Frameworks

While the data analytics tool Apache Spark has been available on GWDG systems for several years, Dask is an emerging topic. Spark is primarily used with Scala (and supports Python as well), whereas Dask is part of the Python ecosystem. The project proposal is to compare the deployment methods on an HPC system (via Slurm in our case) and the available monitoring possibilities and tooling, and to develop, run, and evaluate a concrete application example on both platforms.
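
For Dask, the deployment side could start from a sketch like the following (partition name and resources are placeholder assumptions for a Slurm system):

  from dask_jobqueue import SLURMCluster
  from dask.distributed import Client
  import dask.array as da

  cluster = SLURMCluster(queue="medium", cores=16, memory="32GB", walltime="01:00:00")
  cluster.scale(jobs=2)            # submit two Slurm worker jobs
  client = Client(cluster)

  x = da.random.random((20000, 20000), chunks=(2000, 2000))
  print(x.mean().compute())        # executed across the Slurm workers

Spark on Slurm would typically be deployed as a standalone cluster inside a job allocation, which is one of the deployment differences the comparison should document.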

Implementation of a preCICE adapter for the particle transport simulator LIGGGHTS

preCICE, as already presented at the GöHPCoffee, is a multiphysics framework that allows various simulation codes to be combined to perform coupled simulations; these can include coupled thermal problems as well as fluid-structure interaction. So far, there is no way to perform a coupled particle simulation using preCICE, since the only particle solver is not publicly available. The aim of this thesis is to mitigate this limitation by implementing a preCICE adapter for the particle solver LIGGGHTS-PFM. One possibility could be the modification of an existing OpenFOAM adapter in preCICE. In addition, the thesis will compare the achievable performance with other coupling libraries using LIGGGHTS and its derivatives. General programming experience is required. Knowledge of simulation technology and particle transport, especially of LIGGGHTS, is beneficial but not mandatory.

Integrated Analysis of High Performance Computing Training Materials: A Fusion of Web Scraping, Machine Learning, and Statistical Insights

This thesis focuses on the compilation and analysis of training materials from various scientific institutions in the High Performance Computing (HPC) domain. The initial phase involves utilizing scraping techniques to gather diverse training resources from different sources. Subsequently, the study employs methods derived from Machine Learning and Statistics to conduct a comprehensive analysis of the collected materials. The research aims to provide insights into the existing landscape of HPC training materials, identify commonalities, and offer recommendations for optimizing content delivery in this crucial field.

Advancing Education in High Performance Computing: Exploring Personalized Teaching Strategies and Adaptive Learning Technologies

The present thesis delves into the exciting research field of personalized teaching in High Performance Computing (HPC). The objective is to identify innovative methods and technologies that enable tailoring educational content in the field of high-performance computing to the individual needs of students. By examining adaptive learning platforms, machine learning, and personalized teaching strategies, the thesis will contribute to the efficient transfer of knowledge in HPC courses. The insights from this research aim not only to enhance teaching in high-performance computing but also to provide new perspectives for the advancement of personalized teaching approaches in other technology-intensive disciplines.

Evaluating Pedagogical Strategies in High Performance Computing Training: A Machine Learning-driven Investigation into Effective Didactic Approaches

This thesis delves into the realm of computer science education with a particular focus on High Performance Computing (HPC). Rather than implementing new tools, the research centers on the field of didactics, aiming to explore and assess various pedagogical concepts applied to existing HPC training materials. Leveraging Machine Learning tools, this study seeks to identify prevalent didactic approaches, analyze their effectiveness, and ascertain which strategies prove most promising. This work is tailored for those with an interest in computer science education, emphasizing the importance of refining instructional methods in the dynamic and evolving landscape of High Performance Computing.

Reimagining and Porting a Prototype for High Performance Computing Certification: Enhancing Knowledge and Skills Validation

This thesis focuses on the evolution of the certification processes within the High Performance Computing (HPC) domain, specifically addressing the adaptation and porting of an existing prototype from the HPC Certification Forum. The objective is to redefine, optimize and automate the certification procedures, emphasizing the validation of knowledge and skills in HPC. The study involves the redevelopment of the prototype to align with current industry standards and technological advancements. By undertaking this project, the research aims to contribute to the establishment of robust and up-to-date certification mechanisms and standards that effectively assess and endorse competencies in the dynamic field of High Performance Computing.

Processing of experimental videos for object tracking

In flow loop experiments, I studied the damping of oscillations in a pipe subject to flow and particle transport. I recorded the movement with two GoPro Hero 9 cameras to obtain valuable absolute position data in addition to the accelerations recorded by the Inertial Measurement Units (IMUs) placed inside the inner pipe. However, at the frame rate of 200 fps, the contrast changes between frames due to the 50 Hz frequency of the electricity powering the lamps in the laboratory. Manual processing of all videos is unfeasible due to the large number of frames to be analyzed (approx. 100k). Therefore, automation, either using image processing libraries, e.g. OpenCV, or an AI-based approach on GPUs, e.g. with the Facebook library detectron2, is required to process this large amount of video data. The processing of the video data will be performed on the HPC clusters of the GWDG, so you will have the chance to gain first experience on supercomputers. Preferable knowledge for the project is experience with AI/ML and/or image processing, since the successful applicant will evaluate different approaches with the goal of segmenting the videos and providing the positions of both the inner pipe and the particle bed. Further information on the experiment can be found at https://doi.org/10.21268/20241022-0. Upon successful completion of the project, the applicant will have gained hands-on experience with a real-world problem in the area of AI processing of video data. The results are also planned to be submitted for scientific publication, so this is your chance to get your first paper published.
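
As one possible entry point using OpenCV, the sketch below (file name and parameters are assumptions) normalizes per-frame contrast with CLAHE to reduce the 50 Hz lighting flicker before any segmentation step:

  import cv2

  cap = cv2.VideoCapture("experiment_cam1.mp4")
  clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

  while True:
      ok, frame = cap.read()
      if not ok:
          break
      gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
      stabilized = clahe.apply(gray)   # contrast roughly constant across frames
      # ... downstream: thresholding, detectron2 inference, pipe/particle-bed tracking
  cap.release()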

Enabling particle simulations with a deformable boundary

In flow loop experiments, I studied the damping of oscillations in a pipe subject to flow and particle transport. I recorded the movement with two GoPro Hero 9 cameras to obtain valuable absolute position data in addition to the accelerations recorded by the Inertial Measurement Units (IMUs) placed inside the inner pipe. The numerical analysis in my PhD thesis used the OpenFOAM fork foam-extend as a framework for the fluid and the solid (via solids4foam), and CFDEM®coupling-PUBLIC for coupling to the particle solver LIGGGHTS®-PUBLIC. Since then, most code development has happened in the academic fork from the Department of Particulate Flow Modelling at Johannes Kepler University in Linz, Austria. Upon successful completion of the project, the applicant will gain hands-on experience with a real-world problem in the area of numerical modeling using different C++ frameworks. The developed code is planned to be upstreamed, enabling simulations that are currently not possible even with many commercial simulation programs.

Framework for automated ML and empirical model generation

Although drilling technology traditionally originates from the field of oil and gas, it still plays a crucial role in the emerging fields of Carbon Capture and Storage, geothermal energy, and hydrogen storage. To reach wide adoption of these new fields, it is crucial to optimize wellbore construction costs. In my research, I used mathematical models, both statistical and empirical, to replicate scenarios generated from previous drilling projects. In my previous paper "Framework for automated generation of real-time rate of penetration models" (doi:10.1016/j.petrol.2022.110369), I created a framework for the automatic parametrization of models for a single variable based on preprocessed measurement data. These models include both empirical models from the literature and models trained using machine learning algorithms from sklearn. In a recent Master's thesis, a new simulation framework was developed in Python that can use the parametrized models for research and education in the drilling industry. Compared to the implementation in the paper, the new version will integrate several models from the literature to enable a more comprehensive simulation experience for both researchers and students. Upon successful completion of the project, the applicant will gain hands-on experience with a real-world problem in the area of mathematical and ML modeling. The results are also planned to be submitted for scientific publication, so this is your chance to get your first paper published.
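
To give a flavor of the automated parametrization step, here is a minimal sketch; the power-law model form, variable names, and numbers are illustrative placeholders, not the models integrated in the framework:

  import numpy as np
  from scipy.optimize import curve_fit

  def rop_model(X, a, b, c):
      wob, rpm = X
      return a * wob**b * rpm**c     # hypothetical empirical rate-of-penetration model

  wob = np.array([5.0, 8.0, 10.0, 12.0])      # weight on bit
  rpm = np.array([80.0, 100.0, 120.0, 140.0])
  rop = np.array([4.1, 7.3, 9.8, 12.5])       # measured rate of penetration

  params, _ = curve_fit(rop_model, (wob, rpm), rop, p0=[1.0, 1.0, 1.0])
  print(dict(zip("abc", params)))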

An Agentic Retrieval-Augmented Generation Framework for Modular Knowledge Pipelines

This research investigates an agentic Retrieval-Augmented Generation (RAG) framework in which retrieval, indexing, data preprocessing, and knowledge extraction are performed by explicitly defined and composable agents. Each agent is responsible for a specific phase of the RAG pipeline and can be dynamically selected, replaced, or orchestrated based on task requirements and domain constraints. The proposed system enables users to inject domain-specific logic into retrieval and indexing processes, moving beyond static, monolithic RAG architectures. The work evaluates the effectiveness, flexibility, and performance trade-offs of agent-based pipelines compared to conventional RAG systems.
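
The composability idea can be illustrated with a minimal sketch (interfaces and agent names are assumptions, not the proposed framework): every pipeline stage is an agent with a uniform interface, so stages can be swapped or reordered per domain:

  from typing import Any, Protocol

  class Agent(Protocol):
      def run(self, state: dict[str, Any]) -> dict[str, Any]: ...

  class Retriever:
      def run(self, state):
          state["documents"] = ["doc about " + state["query"]]   # stand-in retrieval
          return state

  class Generator:
      def run(self, state):
          state["answer"] = f"Answer to '{state['query']}' using {len(state['documents'])} documents"
          return state

  def run_pipeline(agents: list[Agent], query: str) -> dict[str, Any]:
      state: dict[str, Any] = {"query": query}
      for agent in agents:                  # order and selection can be task-dependent
          state = agent.run(state)
      return state

  print(run_pipeline([Retriever(), Generator()], "workflow scheduling")["answer"])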

A Collaborative Human–Agent RAG Chat System with Role-Aware Reasoning and Expert-Owned Knowledge

This thesis proposes a collaborative RAG-based chat system that integrates end users, autonomous language model agents, and human domain experts within a single conversational environment. Each participant operates under a distinct role with different priorities, permissions, and perspectives over the shared conversation state. The system supports structured reasoning, planning, evaluation, and response generation, while enabling experts to maintain personalized RAG representations derived from their historical interactions and domain knowledge. Additionally, the system leverages conversational interactions to generate high-quality question–answer pairs as auxiliary training data for improving retrieval and knowledge grounding.

Regulation-Aware AI Supervision: A RAG-Based Evaluation and Policy Enforcement Framework

This research proposes a regulation-aware AI supervision framework that evaluates, filters, and constrains the inputs and outputs of AI systems to ensure compliance with legal, ethical, and operational requirements. Using a Retrieval-Augmented Generation (RAG) approach, the system dynamically retrieves applicable regulations, standards, and policies and incorporates them into the decision-making and validation process. The framework supports modular policy agents that can act as evaluators, validators, or content filters, enabling region- and domain-specific governance of AI behavior.

AgentFlow: A Modular Pipeline for Coordinated Collaboration of AI Agents

AgentFlow is a modular orchestration framework designed to coordinate multiple specialized AI agents for complex, multi-stage tasks across heterogeneous data modalities. The system enables structured collaboration among agents such as planners, generators, evaluators, and tool executors, allowing dynamic transitions between stages including extraction, reasoning, transformation, and output generation. AgentFlow supports hierarchical workflows, dependency management, and adaptive agent selection based on task requirements.

Advanced Retrieval-Augmented Generation: Improving Quality, Latency, and Adaptability

This research explores advanced Retrieval-Augmented Generation (RAG) architectures aimed at improving response quality, reducing end-to-end latency, and enhancing adaptability across domains. The work investigates techniques such as hybrid retrieval, dynamic query rewriting, contextual re-ranking, and distributed indexing strategies. The proposed system adapts retrieval and generation behavior based on domain characteristics and workload constraints, enabling scalable and context-aware AI systems.

Federated Fine-Tuning of Large Language Models in Distributed and Privacy-Preserving Environments

This research investigates federated learning approaches for fine-tuning large language models (LLMs) across distributed environments without centralizing data. The framework enables collaborative model improvement while preserving data privacy and security. Key challenges such as data heterogeneity, communication efficiency, system scalability, and convergence stability are addressed through adaptive aggregation, selective parameter updates, and model distillation techniques.
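
The aggregation step can be illustrated with a minimal FedAvg-style sketch in PyTorch (client models and sample counts are placeholders; non-float buffers and parameter-efficient updates need extra care in practice):

  import torch

  def fedavg(client_states, client_sizes):
      """Average state dicts weighted by each client's number of training samples."""
      total = sum(client_sizes)
      return {key: sum(state[key].float() * (n / total)
                       for state, n in zip(client_states, client_sizes))
              for key in client_states[0]}

  # Hypothetical usage with two clients sharing the same architecture:
  # global_state = fedavg([client_a.state_dict(), client_b.state_dict()], [1200, 800])
  # global_model.load_state_dict(global_state)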

Agentic Retrieval-Augmented Generation System for Modular Knowledge Pipelines

This research proposes an agentic Retrieval-Augmented Generation (RAG) system in which retrieval, indexing, data preprocessing, and knowledge extraction are handled by explicitly defined and composable agents. Each agent encapsulates domain-specific logic and can be dynamically selected or orchestrated within a pipeline. The approach enables flexible, transparent, and reusable knowledge workflows, moving beyond static RAG architectures toward user-extensible retrieval systems.

Segment-Wise Sequential Fine-Tuning of Large Language Models Under Memory Constraints

This research investigates a memory-efficient fine-tuning strategy for large language models (LLMs) by partitioning the model into segments that are trained sequentially. Only a subset of segments is loaded into memory at any given time, enabling training on resource-constrained hardware. The work addresses challenges such as inter-segment dependency management, gradient consistency, and efficient backpropagation across unloaded components.
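
The freezing-and-sequencing logic can be sketched as follows (the toy model, data, and loop are simplified assumptions; the sketch does not yet offload frozen segments or manage activation memory, which is where the actual research lies):

  import torch
  import torch.nn as nn

  segments = nn.ModuleList(nn.Sequential(nn.Linear(256, 256), nn.ReLU()) for _ in range(4))
  head = nn.Linear(256, 10)
  data = torch.randn(32, 256)
  labels = torch.randint(0, 10, (32,))

  for active in range(len(segments)):              # train segments one after another
      for i, seg in enumerate(segments):
          seg.requires_grad_(i == active)          # freeze every other segment
      opt = torch.optim.AdamW(
          list(segments[active].parameters()) + list(head.parameters()), lr=1e-4)
      x = data
      for seg in segments:
          x = seg(x)                               # frozen segments only run forward
      loss = nn.functional.cross_entropy(head(x), labels)
      opt.zero_grad()
      loss.backward()
      opt.step()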

Performance optimization of numerical simulation of condensed matter systems

The naive simulation of interacting condensed matter systems is an ocean-boiling problem because of the exponential growth of the Hilbert space dimension. This offers a great opportunity to apply many analytical approximations and advanced numerical methods in HPC.
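
The quick arithmetic behind the exponential growth: for N spin-1/2 sites the state vector already has 2^N complex amplitudes, as the sketch below shows.

  # Memory needed for a dense complex128 state vector of N spin-1/2 sites.
  for n in (20, 30, 40, 50):
      dim = 2 ** n
      gib = dim * 16 / 2**30          # 16 bytes per complex amplitude
      print(f"N={n}: dim={dim:.3e}, state vector ~ {gib:,.1f} GiB")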

Benchmarking Applications on Cloud vs. HPC Systems

In this day and age, everyone has heard of the Cloud, most people are likely using cloud services, and many know that parallel applications can be deployed on cloud infrastructure. Meanwhile, HPC is still stuck in its narrow niche of a select few power users and experts. Few everyday people even know what HPC means. It is easy to get access to large amounts of computing power by renting time on various cloud platforms. But how do applications deployed on a cloud service like the GWDG cloud compare to their twins deployed on HPC clusters in terms of performance? How well suited are different parallelization schemes to run on both systems? The goal of this project is to gain some insight into these questions and benchmark a few applications to get concrete numbers, compare both approaches, and present the results in an accessible and clear way.

Putting RISC-V eval board Linux and HPC toolchains into operation

While the HPC world is dominated by x86 architectures, RISC-V is a promising, evolving alternative. To prepare for RISC-V-based HPC and get familiar with architecture-specific details, StarFive VisionFive 2 eval boards have been procured [1]. These need to be configured to run Linux according to the documentation, compiler toolchains and libraries need to be set up and tested, and some benchmark or other proof of operability needs to be performed. Familiarity with electronics equipment would be beneficial; knowledge of the Linux command line is a must. Initial testing of deploying Debian Linux, compiling a more recent custom kernel with driver support for the board, and running a Gnome desktop has already been done as a student project. These first steps should be significantly easier now with the release of Debian 13 (Trixie) with official RISC-V support. The setup of a classic HPC toolchain with Slurm, a module system like Lmod, potentially Spack, and running a simple benchmark with an MPI-enabled application remains to be done.

Performance Evaluation of LLM Inference Engines

While vLLM is a widely used inference backend engine for operating LLMs, there are alternative options that have the potential to deliver better performance by replacing or extending vLLM. Notable options are the Modular platform with MAX, ServerlessLLM, and LMCache. Performance improvements may be limited to certain use cases. The overarching goal of this topic is to explore potential performance improvements for the Chat AI platform.
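
A first comparison point could be a simple throughput probe against an OpenAI-compatible endpoint, sketched below (URL, model name, and key are placeholders; counting streamed chunks only approximates tokens):

  import time
  from openai import OpenAI

  client = OpenAI(base_url="https://example.invalid/v1", api_key="PLACEHOLDER")

  start, chunks = time.perf_counter(), 0
  stream = client.chat.completions.create(
      model="placeholder-model",
      messages=[{"role": "user", "content": "Summarize MPI in three sentences."}],
      stream=True,
      max_tokens=256,
  )
  for chunk in stream:
      if chunk.choices and chunk.choices[0].delta.content:
          chunks += 1                   # roughly one token per streamed chunk
  elapsed = time.perf_counter() - start
  print(f"~{chunks / elapsed:.1f} tokens/s (chunk-level approximation)")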

Operating Kubernetes with AI Engineers

Projects such as K8sGPT as well as MCP servers for Kubernetes enable LLMs to interact directly with Kubernetes clusters. This project aims to explore how well LLM-based engineers can maintain a given Kubernetes cluster and complete typical maintenance tasks such as adjusting workloads and migrating between versions.

Prototyping a Geo-Redundancy Engine

As part of our goals for the SAIA platform, which operates Chat AI, we want it to operate with geo-redundancy such that even if a given geo-location experiences an outage, the service stays operational. To achieve this, a geo-redundancy engine should be prototyped, which can itself operate with multiple redundant instances and is able to synchronize service configurations across multiple geo-locations.

Development of a new application for the SpiNNaker-2 neuromorphic computing platform

SpiNNaker is a new kind of computer architecture, initially designed to efficiently perform simulations of spiking neural networks. It consists of a large number of low-powered ARM cores connected by an efficient message-passing network. This architecture, together with the flexibility of the spiking neuron model, also makes it ideal for accelerating other types of algorithms, such as optimization problems, constraint problems, live image and signal processing, AI/ML, cellular automata, finite element simulations, distributed partial differential equations, and embedded, robotics, and low-powered applications in general. As part of the Future Technology Platform, the GWDG has acquired a number of SpiNNaker boards that will be available for the thesis. In this thesis, you will develop one (or more) applications for SpiNNaker, either with the high-level Python or low-level C/C++ software stacks, characterize your solution, compare it to a pure CPU/GPU solution (or other hardware in the Future Technology Platform), if possible apply it to a real case study, and study the power consumption of your program.

Development of Text-to-SQL/XML Conversational AI for Planarian Research Database

The Rink Lab at the Max Planck Institute for Multidisciplinary Sciences investigates why some animals can regenerate lost body parts while others cannot, using planarian flatworms as a model system - species that range from being able to regrow an entire organism from tiny fragments to those with limited regenerative ability. We have developed a comprehensive database containing planarian genome, transcriptomes, functional annotations, and gene expression data. This Master’s thesis project will focus on creating a Text-to-SQL/XML conversational AI system that enables natural language queries of the database, including the implementation and fine-tuning of NLP/AI models for query translation and intent recognition, systematic evaluation of different approaches, and the development of a GUI-based conversational interface for intuitive database exploration.
