====== Efficient Data Compression ======

| Begin | Anytime |
| Supervisor | [[about:people:julian kunkel]] |
| Collaboration | ECMWF |

This work will be embedded in the [[https://aces.cs.reading.ac.uk/|ACES research group]] and conducted in close collaboration with ECMWF. If you are interested in this or a similar topic, contact [[about:people:julian kunkel]].

===== Description =====

Efficient data compression is essential for coping with the data deluge in data-intensive sciences such as climate and weather research. In this work, we investigate the state of practice at ECMWF, explore whether and how existing compression schemes could improve on it, and aim to develop a model that approximates compression behavior as a function of data characteristics.

===== Methods =====

First, the current practice at ECMWF is investigated, and a test environment is set up with the compression tools in use and example data files. Next, the performance (speed) and compression ratio of the compression schemes are measured and analyzed. Finally, a (machine-learning) model is developed to estimate performance and compression ratio from the data characteristics.

===== Beneficial Knowledge =====

While all skills needed to complete this project can be acquired during the project, some prior knowledge is beneficial:

  * Basic knowledge of machine learning methods
  * Programming languages: C (basic level)
  * Linux (intermediate)

An MSc candidate is expected to bring soft skills (in decreasing order of importance):

  * Communication
  * Problem-solving
  * Time management
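
To illustrate the measurement step outlined under Methods, the following is a minimal sketch of how compression ratio and compression throughput could be compared across several schemes. It is an assumption-laden example rather than the project's actual setup: the codecs are those shipped with the Python standard library (''zlib'', ''bz2'', ''lzma''), not necessarily the tools used at ECMWF, and the names ''CODECS'' and ''benchmark'' are hypothetical. The sample data is synthetic; real measurements would use the example data files from the test environment.

```python
import bz2
import lzma
import time
import zlib

# Hypothetical candidate set: standard-library codecs used as stand-ins.
# They are NOT claimed to be the compression tools in use at ECMWF.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def benchmark(data: bytes) -> dict:
    """Return compression ratio and compression throughput (MB/s) per codec."""
    if not data:
        raise ValueError("data must not be empty")
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Verify the round trip so a broken codec is never reported as fast.
        if decompress(packed) != data:
            raise RuntimeError(f"{name} failed the round-trip check")
        results[name] = {
            # Ratio > 1 means the compressed form is smaller than the input.
            "ratio": len(data) / len(packed),
            "mb_per_s": len(data) / 1e6 / elapsed if elapsed > 0 else float("inf"),
        }
    return results


if __name__ == "__main__":
    # Synthetic, highly regular sample (assumption); replace with real data files.
    sample = bytes(range(256)) * 4096
    for name, stats in benchmark(sample).items():
        print(f"{name}: ratio={stats['ratio']:.1f}, "
              f"throughput={stats['mb_per_s']:.1f} MB/s")
```

A single timed run like this is noisy. A more careful study would probably repeat each measurement several times and also time decompression. The per-file results, together with features describing each file, could then serve as training data for the model mentioned under Methods.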