Efficient Data Compression

Begin: Anytime
Supervisor: Dr. Julian Kunkel
Collaboration: ECMWF

This work will be embedded in the ACES research group and conducted in close collaboration with ECMWF.

If you are interested in this topic or similar topics, contact Dr. Julian Kunkel.

Efficient data compression is essential for coping with the data deluge of data-intensive sciences such as climate and weather research. In this work, we investigate the state of the practice at ECMWF, explore whether and how existing compression schemes could improve on it, and aim to develop a model that approximates compression behavior as a function of data characteristics.

First, the current practice at ECMWF is researched, and a test environment containing the compression tools in use and example data files is set up. Next, the performance and compression ratio of the compression schemes are measured and analyzed. Finally, a (machine-learning) model is developed to estimate performance and compression ratio from the data characteristics.
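The measurement step could look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the tooling used at ECMWF: the general-purpose compressors from the Python standard library (zlib, bz2, lzma) stand in for whichever schemes are actually evaluated, the input is a synthetic signal rather than a real data file, and the helper names (byte_entropy, measure, benchmark) are hypothetical. Byte entropy is included only as one example of a data characteristic that a model might take as input.

```python
import bz2
import lzma
import math
import time
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def measure(name, compress, data: bytes) -> dict:
    """Compress the data once and report compression ratio and throughput."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    mib = len(data) / (1024 * 1024)
    return {
        "scheme": name,
        "ratio": len(data) / len(compressed),
        "mib_per_s": mib / elapsed if elapsed > 0 else float("inf"),
    }


# Assumption: stdlib compressors as stand-ins for the schemes under study.
SCHEMES = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def benchmark(data: bytes) -> list:
    """Run every scheme in SCHEMES on the same input."""
    return [measure(name, fn, data) for name, fn in SCHEMES.items()]


if __name__ == "__main__":
    # Synthetic stand-in for a real data file: a smooth, repetitive signal.
    sample = bytes(int(127 + 100 * math.sin(i / 50)) for i in range(200_000))
    print(f"entropy: {byte_entropy(sample):.2f} bits/byte")
    for row in benchmark(sample):
        print(f"{row['scheme']:5s} ratio={row['ratio']:.1f} "
              f"throughput={row['mib_per_s']:.1f} MiB/s")
```

A single timed run is noisy; a real study would repeat each measurement, also time decompression, and collect the (characteristics, ratio, throughput) tuples over many files as training data for the model.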

All skills needed to complete this project can be acquired during the project, but some prior knowledge is beneficial:

  • Basic knowledge of machine learning methods
  • Programming languages: C (basic level)
  • Linux (intermediate)

An MSc candidate is expected to bring soft skills (in decreasing order of importance):

  • Communication
  • Problem solving
  • Time management

Last modified: 2019-02-24 12:20