Hauke Kirchner

Since February 1, 2022, Hauke Kirchner has been working as a Data Scientist in the working group “Computing” (AG C). He supports his colleagues in the “FOREST-CARE” project with remote sensing and works at the interface between HPC and machine learning. After completing his Bachelor’s degree in Biology at the Georg-August-University Göttingen, he specialized in remote sensing during his master’s studies in Forest Information Technology at the Eberswalde University for Sustainable Development and the Warsaw University of Life Sciences (SGGW). In his master’s thesis at the Helmholtz Centre for Environmental Research in Leipzig, he analyzed airborne lidar data to detect individual trees and classify their species over large areas.

  • Data Science
  • Remote Sensing
  • Data Management
  • High-Performance Computing

How to efficiently access free Earth observation data for data analysis on HPC systems?

In recent years, the amount of freely available Earth observation data has increased. Besides ESA's Sentinel mission [1] and NASA's Landsat mission [2], various open data initiatives have arisen. For example, several federal states in Germany publish geographical and Earth observation data, such as orthophotos or lidar data, free of charge [3,4]. However, accessing this data is currently a bottleneck: before any analysis, researchers need to put a substantial amount of work into downloading and pre-processing it. Big platforms such as Google [5] and Amazon [6] offer these data sets, which makes working in their environments significantly more comfortable. To promote and simplify Earth observation data analysis on HPC systems, approaches for convenient data access need to be developed. In the best case, the resulting data is analysis-ready so that researchers can jump directly into their research. The goal of this project is to explore the current state of available services and technologies (data cubes [7], INSPIRE [8], STAC [9]) and to implement a workflow that provides a selected data set to users of our HPC system.

[1] https://sentinels.copernicus.eu/
[2] https://landsat.gsfc.nasa.gov/
[3] https://www.geoportal-th.de/de-de/
[4] https://www.geodaten.niedersachsen.de/startseite/
[5] https://developers.google.com/earth-engine/datasets
[6] https://aws.amazon.com/de/earth/
[7] https://datacube.remote-sensing.org/
[8] https://inspire.ec.europa.eu/
[9] https://stacspec.org/
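As a possible starting point for the STAC [9] part of such a workflow, the following minimal Python sketch builds the JSON body of an item search as described by the STAC API specification (sent as a POST request to a catalog's /search endpoint) and extracts download links from a response. The collection name, the asset key, the bounding box around Göttingen, and the example response are assumptions for illustration only; no real catalog is queried, and the helper names are hypothetical.

```python
import json
from datetime import date


def build_stac_search(collections, bbox, start, end, limit=10):
    """Build the JSON body for a STAC API item search (POST <api-root>/search)."""
    if len(bbox) != 4:
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "collections": list(collections),
        "bbox": list(bbox),
        # STAC expects an RFC 3339 interval written as "start/end".
        "datetime": f"{start.isoformat()}T00:00:00Z/{end.isoformat()}T23:59:59Z",
        "limit": limit,
    }


def asset_hrefs(item_collection, asset_key):
    """Collect the links of one asset type from a STAC ItemCollection (GeoJSON)."""
    return [
        item["assets"][asset_key]["href"]
        for item in item_collection.get("features", [])
        if asset_key in item.get("assets", {})
    ]


# Hypothetical request: one month of data for a small area around Göttingen.
request_body = build_stac_search(
    collections=["sentinel-2-l2a"],   # assumed collection name
    bbox=(9.8, 51.5, 10.0, 51.6),     # west, south, east, north in degrees
    start=date(2022, 6, 1),
    end=date(2022, 6, 30),
    limit=5,
)
print(json.dumps(request_body, indent=2))

# Made-up response fragment, only to show how links would be extracted.
example_response = {
    "type": "FeatureCollection",
    "features": [
        {"id": "item-1", "assets": {"red": {"href": "https://example.org/item-1/red.tif"}}},
        {"id": "item-2", "assets": {}},
    ],
}
print(asset_hrefs(example_response, "red"))
```

In a real workflow, the request body would be sent to a catalog endpoint and the returned links downloaded or streamed onto the HPC file system. Those steps are omitted here because they depend on the chosen service.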

Performance optimization of deep learning model training and inference

Recent advances in deep learning, such as image (Rombach et al. 2022) and text generation (OpenAI 2023), have led to a worldwide increase in the number of AI publications (Zhang et al. 2022). These breakthroughs are only possible because evolving hardware and software allow big data sets to be processed efficiently. Further, most of the accuracy gains of these models result from increasingly complex models (Schwartz et al. 2019). From 2013 to 2019, the computing power required for training deep learning models increased by a factor of 300,000 (Schwartz et al. 2019). Therefore, performance optimization of deep learning model training and inference is highly relevant. Profiling with tools such as DeepSpeed [1] and the built-in PyTorch Profiler [2] helps identify an existing model's bottlenecks. Depending on the profiling results, different optimization strategies, such as data and model parallelism, can be applied. Further, tools such as PyTorch Lightning's trainer [3] and Horovod [4] can be tested to use the cluster's resources efficiently.

[1] https://github.com/microsoft/DeepSpeed
[2] https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html
[3] https://lightning.ai/docs/pytorch/latest/accelerators/gpu_intermediate.html
[4] https://github.com/horovod/horovod

Dodge, Jesse et al. (2022). Measuring the Carbon Intensity of AI in Cloud Instances. doi: 10.48550/ARXIV.2206.05229. url: https://arxiv.org/abs/2206.05229.
OpenAI (2023). GPT-4 Technical Report. arXiv: 2303.08774 [cs.CL].
Rombach, Robin et al. (2022). “High-Resolution Image Synthesis with Latent Diffusion Models”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). url: https://github.com/CompVis/latent-diffusion, https://arxiv.org/abs/2112.10752.
Schwartz, Roy et al. (2019). “Green AI”. In: CoRR abs/1907.10597. arXiv: 1907.10597. url: http://arxiv.org/abs/1907.10597.
Zhang, Daniel et al. (2022). The AI Index 2022 Annual Report. arXiv: 2205.03468 [cs.AI].
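To illustrate the idea behind data parallelism mentioned in the project description, here is a minimal, framework-free Python sketch. It assumes a toy linear model with a mean-squared-error loss and equally sized shards; the function names are hypothetical and do not correspond to the API of PyTorch, Horovod, or DeepSpeed. Each simulated worker computes a gradient on its own shard of the batch, and averaging these local gradients (the role an all-reduce step plays on a cluster) reproduces the gradient of the full batch.

```python
def gradient(w, b, xs, ys):
    """Gradient of the mean squared error of y = w * x + b over one batch."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def shard(seq, num_workers):
    """Split a batch into equally sized contiguous shards, one per worker."""
    if len(seq) % num_workers != 0:
        raise ValueError("batch size must be divisible by the number of workers")
    size = len(seq) // num_workers
    return [seq[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_gradient(w, b, xs, ys, num_workers):
    """Compute one gradient per shard, then average them across workers."""
    local = [
        gradient(w, b, shard_x, shard_y)
        for shard_x, shard_y in zip(shard(xs, num_workers), shard(ys, num_workers))
    ]
    grad_w = sum(g[0] for g in local) / num_workers
    grad_b = sum(g[1] for g in local) / num_workers
    return grad_w, grad_b


# Toy data following y = 2x + 1, evaluated at the initial parameters w = b = 0.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]

full_batch = gradient(0.0, 0.0, xs, ys)
four_workers = data_parallel_gradient(0.0, 0.0, xs, ys, num_workers=4)
print("full batch:  ", full_batch)
print("four workers:", four_workers)
```

On real hardware, the shards are processed at the same time on different GPUs or nodes. The communication cost of the averaging step is one of the things profiling would need to quantify.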

  • Tree species classification from airborne LiDAR using individual crown delineation and machine learning, Hauke Kirchner (Master's Thesis), Advisors: Dr. N. Knapp
  • Taxonomical read assignment with filtered spaced word matches at varying taxonomic levels, Hauke Kirchner (Bachelor's Thesis), Advisors: Prof. Morgenstern
