Homepage ORCID 0000-0002-7384-7304
Jonathan is a research associate and postdoctoral researcher at the Georg-August-University of Göttingen.
He works as a system architect, designing systems that enable novel ways of using Cloud and HPC resources while remaining efficient, secure, and scalable. Most notably, he works on combining HPC with Kubernetes.
In addition to conducting research on these topics, he teaches at the university.
While vLLM is a widely used inference engine for serving LLMs, alternatives exist that could deliver better performance by replacing or extending it. Notable options are the Modular platform with MAX, ServerlessLLM, and LMCache. Any performance gains may be limited to certain use cases. The overarching goal of this topic is to explore potential performance improvements for the Chat AI platform.
Projects such as K8sGPT, as well as MCP servers for Kubernetes, enable LLMs to interact directly with Kubernetes clusters. This project aims to explore how well LLM-based engineers can maintain a given Kubernetes cluster by completing typical maintenance tasks such as adjusting workloads and migrating between versions.
Personal AI assistants such as OpenClaw and Hermes can handle a wide range of use cases, and skills and plugins extend that range even further. However, because they serve as general-purpose assistants, a user might give them several complex tasks over the course of a day, which forces the agent system to evict or compress its memory. As a result, information that the user expects to be present in the agent's memory is lost, and the user has to re-explain tasks and approaches. This thesis aims to enable personal AI assistants to manage tasks as sessions, so that they keep separate memory or notes per task and understand when to load which session data.