====== Seminar: Newest Trends in High-Performance Data Analytics ======

High-Performance Data Analytics is a vehicle to extract findings from large data sets. It is an indispensable tool in science and business, but also a rapidly changing field. As part of this seminar, you will create a presentation and report revolving around a selected hot topic in German or English. You will learn to research literature and may conduct small experiments to provide a holistic view of the selected topic. You will meet regularly with an assigned supervisor and work towards the presentation and report.

===== Key information =====

|| Contact || [[about:people:julian_kunkel|Julian Kunkel]], [[about:people:jonathan_decker|Jonathan Decker]] ||
|| Location || [[https://meet.gwdg.de/b/jul-gpr-4ao-ndv|Virtual in BBB]] ||
|| Time || Thursday 16-18, First meeting: 2022-04-21 ||
|| Language || English or German (individual presentation) ||
|| Module || M.Inf.1237: Seminar Neueste Trends in High-Performance Data Analytics ||
|| SWS || 2 ||
|| Credits || 5 ||
|| Contact time || 28 hours ||
|| Independent study || 122 hours ||

{{ :teaching:summer_term_2022:seminar_neueste_trends_in_high-performance_data_analytics.pdf |Module description}}

As part of this seminar, you will create a presentation (and report) revolving around a research topic in German or English (your choice!). You will meet regularly with an assigned supervisor and work towards the presentation and report.

This seminar is also available as a pro-seminar. In the pro-seminar, the focus is on learning presentation techniques, while in the seminar your focus must be on presenting scientific facts and leading a scientific discussion. There are also two additional sessions that are mandatory for pro-seminar attendees (optional for seminar attendees).

The presentation time is 35 minutes (plus discussion). A short report accompanying the slides is expected (max. 15 pages).
===== Learning Objectives =====

The students will be able to

  * Appraise research in the area of high-performance data analytics
  * Compose a presentation covering their selected topic in depth
  * Evaluate findings (tools or theory) of other researchers
  * Explain theory and application covering their topic

===== Topics =====

This is the list of topics that we will assign to students during the first meeting. You will have some room for developing the topic in the direction of your choice. Feel free to propose your own great topic.

  * GPU Computing with Triton
    * https://openai.com/blog/triton/
  * Seagate CORTX storage system
  * FPGA Computing with SciEngine
  * RISC-V: State of the union
  * Regression Testing for HPC
  * Global Optimization (of Clusters) with Genetic Algorithms
  * Julia Programming Language
  * Rust Programming for HPC Applications
  * KPI4DC - Key performance indicators for data centers
    * https://www.umweltbundesamt.de/sites/default/files/medien/5750/publikationen/2021-06-17_texte_94-2021_green-cloud-computing.pdf
  * The HPC Community (for pro-seminar)
  * Benchmarking of HPC Systems
  * History and Development of System Architectures
  * Security in Cloud and HPC
  * DevOps strategies in HPC
  * InfiniBand DPU
  * OneAPI for heterogeneous computing (CPU, GPU, FPGA)
  * Convergence of HPC and High-Performance Data Analytics
  * Using Data Analytics in HPC Applications
  * GPU Computing with Python
  * What's new with Spark 3
  * What's new with TensorFlow
  * Development in data lakes and data warehousing
  * Trends in edge computing
  * Key-value stores for HPDA
  * Object storage systems
  * HPDA Benchmarks
  * Using R for HPDA

===== Examination =====

The mark is based on the presentation (50%) and the report (50%). For pro-seminars, the focus lies on effective presentation, while for seminars it lies on the scientific depth of the topic (the marking schemes differ slightly).
===== Agenda =====

  * 2022-04-21 **Preliminary discussion / Vorbesprechung** -- Julian Kunkel \\ {{ :teaching:summer_term_2022:hpda22-welcome.pdf |Slides}} \\ If you cannot attend, contact us as soon as possible!
    * Short introduction to the topics of the seminar.
    * Organizational matters: How to get good marks.
    * Assignment of topics to the participants on a first-come-first-served basis.
    * Talk: Professional presentation
  * 2022-05-05 **How to create professional presentations and reports?** -- //Julian Kunkel, Jonathan Decker// \\ This session is mandatory for pro-seminar attendees.
    * 45 min - In smaller groups, we will assess previous talks and reports - 2 rounds
      * https://hps.vi4io.org/teaching/hamburg/wintersemester_2017_2018/neueste_trends_in_big_data_analytics
        * Online Machine Learning
        * Text analysis and natural language processing
      * https://hps.vi4io.org/teaching/hamburg/sommersemester_2016/softwareentwicklung_in_der_wissenschaft
        * Testen
    * 15 min - You will have to prepare and give a short 3-slide presentation (4-5 min) about a familiar topic, e.g.,
      * Describe the concept of a computer to adults
      * Describe the life cycle of a plant (to adults)
      * Describe how to peel a banana
    * 20 min - Discuss your 3-slide talks in a reflection: your rationales and logic behind the structure and content
    * 10 min - Introduction to our templates for presentations and the report using ShareLaTeX -- //Jonathan Decker//
  * 2022-05-12
  * 2022-05-19
    * Real-Time data analysis in education -- Lorenz Glißmann ((Supervisor: Julian Kunkel))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_real-time_data_analysis_in_education_by_lorenz_glissmann.pdf |Download}}
    * GPU Computing with Triton -- Dimitris Oikonomou ((Supervisor: Jonathan Decker))
  * 2022-06-02
  * 2022-06-09
    * RISC-V: State of the union -- Ilia Kurin ((Supervisor: Christian Köhler))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_risc_v_by_ilia_kurin.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_risc_v_by_ilia_kurin.pdf |Download}}
  * 2022-06-16
    * GPU Computing with Python -- Sören Metje ((Supervisor: Tino Meiselt))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_gpu_computing_with_python_by_soeren_metje.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_gpu_computing_with_python_by_soeren_metje.pdf |Download}}
    * OneAPI for heterogeneous computing -- Vincenz Dumann ((Supervisor: Vanessa End))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_oneapi_by_vincenz_dumann.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_oneapi_by_vincenz_dumann.pdf |Download}}
  * 2022-06-23
    * Using R for HPDA -- Celine Thorns ((Supervisor: Hauke Gronenberg))
    * Julia Programming Language -- Anna Kahle ((Supervisor: Marcus Merz))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_julia_by_anna_kahle.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_julia_by_anna_kahle.pdf |Download}}
  * 2022-06-30
    * Using Data Analytics in HPC Applications -- Theint Hay Thi Maung ((Supervisor: Nils Kanning))
    * Key performance indicators for data centers (KPI4DC) -- Tim van den Berg ((Supervisor: Laura Endter))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_kpi4dc_by_tim_vandenberg.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_kpi4dc_by_tim_vandenberg.pdf |Download}}
  * 2022-07-07
    * Rust Programming for HPC Applications -- Yuvraj Singh ((Supervisor: Christian Boehme))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_rust_for_hpc_by_yuvraj_singh.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_rust_for_hpc_by_yuvraj_singh.pdf |Download}}
  * 2022-07-14
    * What's new with Spark 3 -- Abdul Rafay ((Supervisor: Patrick Michaelis))
      * Slides: {{ :teaching:summer_term_2022:nthpda_presentation_spark_3_by_abdul_rafay.pdf |Download}}
      * Report: {{ :teaching:summer_term_2022:nthpda_report_spark_3_by_abdul_rafay.pdf |Download}}
  * 2022-07-21
  * 2022-07-28
    * Security in Cloud and HPC -- Nicolas Alqas-Alyas ((Supervisor: Tim Ehlers))
  * 2022-09-30 Deadline for the submission of the report