Smart Data Analytics for Climate/Weather workflows
Begin | Anytime |
First Supervisor | Dr. Julian Kunkel |
Second Supervisor | Bryan Lawrence |
Collaboration |
This project will benefit from close support by NVIDIA in assisting and supervising the candidate.
If you are interested in this topic or similar topics, contact Dr. Julian Kunkel.
Description
Efficient post-processing of climate and weather data is key to data analysis. At the moment, scientists use Python toolkits like Pangeo and command-line tools like CDO. The command-line tools often suffer from limited parallelism, the Python tools are not suitable for online data processing, and the integration of data analytics via artificial intelligence is lacking and inefficient.
The goal of this thesis is to develop and realize concepts and improved tool(s) that enable efficient nearline post-processing of huge volumes of climate/weather data.
This encompasses:
- the node-local efficient processing via GPUs,
- concepts for scalable processing of massive data volumes in a cluster that can ideally run concurrently with applications (in-transit processing),
- the integration of AI analytics into the workflow.
The work will be embedded in the ACES research group and conducted in close collaboration with NVIDIA as part of the research project ESiWACE2. It will be integrated into a bigger vision for future storage and compute interfaces that support climate and weather scientists as well as scientists from other domains.
Methods
The research tasks and methods will cover:
- Domain-specific middleware for storage and compute
- Smart scheduling using machine learning to distribute the workloads efficiently across a heterogeneous landscape
- Modelling of system and application performance behavior
- Evaluation methods
Training Opportunities
We integrate you into an excellent network of storage researchers, machine learning experts, and domain scientists (meteorology). Training opportunities cover:
- Training and support opportunities from NVIDIA
- The principles and application of High-Performance Computing
- Understanding of the domain of climate/weather
- The University of Reading development program for PhD students
- Various academically relevant skills
- Soft skills
Expected Knowledge
The knowledge expected from a successful applicant is:
- Basic knowledge of machine learning methods
- Programming languages: Python and C (intermediate level)
- Linux (intermediate)
Generally, we expect all PhD candidates to be eager to learn new skills and improve existing ones. A PhD candidate should bring a good foundation of soft skills (in decreasing order of importance):
- Teamwork
- Problem solving
- Work ethic
- Communication
- Time management
- Interpersonal skills