====== Seminar with Practical: Scalable Computing Systems and Applications in AI, Big Data and HPC ======
===== Key information =====
|| Contact || [[about:people:julian_kunkel|Julian Kunkel]], [[about:people:jonathan_decker|Jonathan Decker]] ||
|| Location || [[https://meet.gwdg.de/b/jul-cal-vop-2bi|Virtual]] ||
|| Time || Thursday 14:15-15:45 ||
|| Language || English or German (individual presentation) ||
|| Module || M.Inf.1238: Scalable Computing Systems and Applications in AI, BigData and HPC ||
|| SWS || 3 ||
|| Credits || 5 ||
|| Contact time || 42 hours ||
|| Independent study || 108 hours ||
As part of this seminar, you will create a presentation, work on a small-scale practical project, and write a report on a research topic in German or English (your choice!).
Throughout the term, you will meet regularly with an assigned supervisor and work towards the presentation, practical project and report.
You will first select a topic and a use case related to the overall topic of the course.
Then, during the term you will prepare a presentation to introduce the topic and the state of the art.
Next, you will realize a small-scale project by practically working on your topic.
This includes evaluating performance and scalability, as well as analyzing and quantifying the contribution of your topic or tool.
Finally, you will present your results in a second presentation.
The presentation time is 25 minutes (plus discussion) for each presentation.
A short report describing your work in the practical project is expected (max 15 pages).
Please note that we plan to record sessions (lectures and seminar talks) in order to provide the recordings
via BBB to other students and to publish and link them on YouTube for future terms.
If you appear in any of the recordings via voice, camera or screen share, we need your consent to publish the recordings.
See also this {{ :teaching:templates:dataprivacy_student_notice_slide.pdf |Slide}}.
==== Required Prior Knowledge ====
* No prior skills or knowledge are required
* A basic understanding of Linux, including prior experience using a Bash shell, is beneficial
* We will provide a short crash course at the beginning of the course and link supplementary training material
===== Learning Objectives =====
* Describe approaches for the development of scalable systems and applications
* Sketch efficient algorithms and concepts
* Analyze and summarize state-of-the-art concepts, tools and research papers
* Deliver a technical presentation for a professional audience
* Explore and apply concepts or tools to improve scalability for a selected use case
* Quantify efficiency and scalability of selected use cases
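
To illustrate the last objective, the following is a minimal sketch of how strong-scaling speedup and parallel efficiency might be computed from measured runtimes. The function names and the runtimes are hypothetical and chosen for illustration only; they are not taken from any project in this seminar.

```python
# Minimal sketch: strong-scaling speedup and parallel efficiency.
# All names and numbers here are illustrative assumptions.

def speedup(t_base, t_parallel):
    """Speedup relative to the baseline runtime (same problem size)."""
    return t_base / t_parallel


def efficiency(t_base, t_parallel, workers):
    """Parallel efficiency: speedup divided by the number of workers."""
    return speedup(t_base, t_parallel) / workers


def scaling_table(runtimes):
    """Return (workers, speedup, efficiency) tuples, sorted by worker count.

    `runtimes` maps a worker count to a measured runtime in seconds and
    must contain an entry for 1 worker, which serves as the baseline.
    """
    t_base = runtimes[1]
    return [
        (n, speedup(t_base, t), efficiency(t_base, t, n))
        for n, t in sorted(runtimes.items())
    ]


if __name__ == "__main__":
    # Hypothetical runtimes in seconds for 1, 2, 4 and 8 workers.
    measured = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
    for n, s, e in scaling_table(measured):
        print(f"{n:2d} workers: speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency close to 1 indicates that additional workers are used well; a value that drops as workers are added suggests growing overhead, for example from communication or load imbalance.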
===== Topics =====
This is the list of topics that we will assign to students during the first meeting.
You will have some room to develop the topic in the direction of your choice.
Feel free to propose your own great topic.
* Performance Analysis using Scalasca and Vampir
* Data Streaming and Workflows using Apache Airflow
* Scalable data lakes
* Evaluating the ARM Architecture for HPC
* Understanding GPU performance e.g. using MLCommons ML Benchmarks
* Usage of data lakes and/or data warehouses
* Scalable quantum computer simulation on HPC systems
* TBD
===== Examination =====
The exam is conducted as part of the final presentation (30% of the mark) and the report (70%).
===== Agenda =====
* 26.10.23 **Preliminary discussion / Vorbesprechung** -- //Julian Kunkel, Jonathan Decker// \\ If you cannot attend, contact us as soon as possible!
* Short introduction to the topics of the seminar.
* Short introduction of the overall timeline for seminar and practical part.
* Organizational matters: How to get good marks.
* Assignment of topics to the participants on a first-come-first-served basis.
* Talk: Professional presentation {{ :teaching:autumn_term_2023:scap:scap-welcome.pdf |Slides}}
* 02.11.23 **How to create professional presentations and reports?** -- //Julian Kunkel, Jonathan Decker//
* Introducing our report template and usage (very quick intro to LaTeX) {{ :teaching:autumn_term_2023:scap:latex-intro.pdf |Slides}}
* Discussion of existing reports and presentations individually and in the group
* 09.11.23
* 16.11.23
* 23.11.23 **Project topic presentations**
* Mohd Uwaish - Understanding GPU performance e.g. using MLCommons ML Benchmarks
* 30.11.23 **Project topic presentations**
* Jule Anger - Kubernetes for HPC
* Lukas Steinegger - Load Balancing or Authorization
* 07.12.23 **Project topic presentations**
* Robin Lösekrug - Usage of data lakes and/or data warehouses
* Claas Kochanke - Performance Analysis using Scalasca and Vampir
* 14.12.23 **Project topic presentations**
* Laura Plodek - Scalable Deep Learning Models
* Esther Hagenkort - Machine learning performance and behavior of HPC storage systems
* 21.12.23 **Project topic presentations**
* David Alexandre Silva - Quantum Neural Networks: Libraries and Applications
* 11.01.24 **Project result presentations**
* Mohd Uwaish - Understanding GPU performance e.g. using MLCommons ML Benchmarks
* 18.01.24 **Project result presentations**
* Jule Anger - Kubernetes for HPC
* Lukas Steinegger - Load Balancing or Authorization
* 25.01.24 **Project result presentations**
* Sonal Lakhotia - Usage of data lakes and/or data warehouses/Development in data lakes and data warehousing
* Robin Lösekrug - Usage of data lakes and/or data warehouses
* 01.02.24 **Project result presentations**
* Laura Plodek - Scalable Deep Learning Models
* Esther Hagenkort - Machine learning performance and behavior of HPC storage systems
* Claas Kochanke - Performance Analysis using Scalasca and Vampir
* 08.02.24 **Project result presentations**
* David Alexandre Silva - Quantum Neural Networks: Libraries and Applications
* Sonal Lakhotia - Usage of data lakes and/or data warehouses/Development in data lakes and data warehousing
* 31.03.24 **Deadline for the submission of the report**
===== Topic Distribution =====
|| **Student** || **Supervisor** || **Topic** || **Submissions** ||
|| Mohd Uwaish || Patrick Höhn || Understanding GPU performance e.g. using MLCommons ML Benchmarks ||
|| Claas Kochanke || Jack Ogaja || Performance Analysis using Scalasca and Vampir || {{ :teaching:autumn_term_2023:stud:scap_claas_kochanke.pdf |Report}} ||
|| Jule Anger || Jonathan Decker || Kubernetes for HPC || {{ :teaching:autumn_term_2023:stud:scap_jule_anger.pdf |Report}} {{ :teaching:autumn_term_2023:stud:scap_jule_anger_slides.pdf |Slides}} ||
|| Lukas Steinegger || Christian Köhler || Load Balancing or Authorization ||
|| Sonal Lakhotia || Aasish Kumar Sharma || Usage of data lakes and/or data warehouses/Development in data lakes and data warehousing ||
|| Robin Lösekrug || Giorgi Mamulashvili || Usage of data lakes and/or data warehouses ||
|| Laura Plodek || Chirag Mandal || Scalable Deep Learning Models ||
|| Esther Hagenkort || Patrick Höhn || Machine learning performance and behavior of HPC storage systems || {{ :teaching:autumn_term_2023:stud:scap_esther_hagenkort.pdf |Report}} {{ :teaching:autumn_term_2023:stud:scap_esther_hagenkort_slides.pdf |Slides}} ||
|| David Alexandre Silva || Christian Boehme || Quantum Neural Networks: Libraries and Applications || {{ :teaching:autumn_term_2023:stud:scap_david_alexandre_silva.pdf |Report}} ||