Machine Learning
KIB-MLRN
P221-0085
ki2
2V+2U (2 hours of lectures and 2 hours of exercises per week)
5
6
no
English
Written exam
KI575
Kommunikationsinformatik
6
Mandatory elective
KIB-MLRN
Kommunikationsinformatik
6
Mandatory elective
KIB-MLRN
Kommunikationsinformatik
6
Mandatory elective
PIBWI19
Praktische Informatik
6
Mandatory elective
PIB-MLRN
Praktische Informatik
6
Mandatory elective
Over 15 semester weeks, the attendance time for this module amounts to 60 class hours (= 45 clock hours). At 5 credit points, the total workload of the module is 150 hours (30 hours per ECTS point). This leaves 105 hours for preparing and reviewing the classes, including exam preparation.
Prof. Dr. Klaus Berberich
kbe
Prof. Dr. Klaus Berberich
kbe
Students know about fundamental supervised and unsupervised methods from machine learning. This includes methods for regression, classification, and clustering. Students understand how these methods work and know how to use existing implementations (e.g., in libraries such as scikit-learn). Given a practical problem setting, students can choose a suitable method, apply it to the dataset at hand, and assess the quality of the determined model. Students are aware of typical data-quality issues and know how to resolve them.
Machine learning plays an increasingly important role, with applications ranging from recognizing handwritten digits, via filtering out unwanted spam e-mails, to ranking results in modern search engines. This course covers fundamental supervised and unsupervised methods from machine learning. We will look into how these methods are defined formally, including the mathematics behind them. Moreover, we will apply all methods to concrete datasets to solve practical problems. For this, we will rely on existing libraries (e.g., scikit-learn) that provide efficient implementations of the methods. The course is accompanied by theoretical exercises and project assignments. The former help students to deepen their understanding of the methods; the latter encourage students to solve practical problems by applying what they learnt in the course to real-world datasets.
1. Introduction
- What is Machine Learning?
- Applications
- Libraries
- Literature
2. Working with Data
- Typical data formats (e.g., CSV, spreadsheets, databases)
- Data quality issues (e.g., outliers, duplicates)
- Scales of measurement (i.e., nominal, ordinal, numerical)
- Data pre-processing (in Python and using UNIX command-line tools)
3. Regression
- Ordinary least squares
- Multiple linear regression
- Non-linear regression
- Evaluation
4. Classification
- Logistic regression
- k-Nearest Neighbors
- Naive Bayes
- Decision Trees
- Neural Networks
- Evaluation
5. Clustering
- k-Means and k-Medoids
- Hierarchical agglomerative/divisive clustering
- Evaluation
6. Outlook
- Ongoing research
- Competitions (e.g., Kaggle and KDD Cup)
- Other resources (e.g., KDnuggets)
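To give a flavour of topic 3 (ordinary least squares and its evaluation), the following is a minimal sketch in plain Python, not taken from the course materials. It fits a line to a single feature using the closed-form least-squares solution and reports the coefficient of determination R². The function names `fit_ols` and `r_squared` and the toy data are assumptions made purely for illustration.

```python
from statistics import mean


def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up toy data (hypothetical, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_ols(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r2:.4f}")
```

In practice, the course relies on existing library implementations (e.g., scikit-learn) rather than hand-written code like this; the sketch is only meant to show the arithmetic behind the method.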
A. Burkov: The Hundred-Page Machine Learning Book,
self-published, 2019
http://themlbook.com
G. James, D. Witten, T. Hastie, R. Tibshirani: An Introduction to Statistical Learning - with Applications in R,
Springer, 2015
S. Raschka and V. Mirjalili: Python Machine Learning,
Packt Publishing, 2019
M. J. Zaki and W. Meira Jr.: Data Mining and Analysis: Fundamental Concepts and Algorithms,
Cambridge University Press, 2014
Summer semester 2024
Summer semester 2023
Summer semester 2022
Summer semester 2021
Summer semester 2020