Data Science

Module name (EN):
Name of module in study programme. It should be precise and clear.
Data Science
Degree programme:
Study programme containing this module, with the effective date of the corresponding study regulations.
Applied Informatics, Master, ASPO 01.10.2017
Module code: PIM-DS
SAP-Submodule-No.:
The exam administration creates an SAP-Submodule-No. for every exam type in every module. The SAP-Submodule-No. is the same for a given module across different study programmes.
P221-0051
Hours per semester week / Teaching method:
The number of hours per week is a combination of lecture (V, for German Vorlesung), exercise (U, for Übung), practical (P) or project (PA). For example, a course of the form 2V+2U has 2 hours of lecture and 2 hours of exercise per week.
3V+1U (4 hours per week)
ECTS credits:
European Credit Transfer System. Points for successful completion of a course. Each ECTS point represents a workload of 30 hours.
6
Semester: 1
Mandatory course: yes
Language of instruction:
German
Assessment:
Written exam

[updated 24.02.2018]
Applicability / Curricular relevance:
All study programs (with year of the version of study regulations) containing the course.

DFI-DS (P610-0280) Computer Science, Master, ASPO 01.10.2018, semester 1, mandatory course
KIM-DS (P221-0051) Computer Science and Communication Systems, Master, ASPO 01.10.2017, optional course, informatics specific
PIM-DS (P221-0051) Applied Informatics, Master, ASPO 01.10.2017, semester 1, mandatory course
Workload:
Student workload required to complete the course successfully. Each ECTS credit represents 30 working hours. These combine face-to-face time, follow-up work on the lecture material, exercises and exam preparation.

The total workload is distributed over the semester (01.04.-30.09. during the summer term, 01.10.-31.03. during the winter term).
60 class hours (= 45 clock hours) over a 15-week period.
The total student study time is 180 hours (equivalent to 6 ECTS credits).
This leaves 135 hours for class preparation, follow-up work and exam preparation.
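The workload figures above can be reproduced with a short calculation. The following is a minimal sketch in Python; the variable names are illustrative, and the 45-minute class hour is an assumption inferred from "60 class hours (= 45 clock hours)".

```python
# Figures taken from the module description above.
ECTS_CREDITS = 6
HOURS_PER_ECTS = 30           # each ECTS credit represents 30 working hours
WEEKLY_CLASS_HOURS = 4        # 3V+1U: 3 lecture hours + 1 exercise hour
WEEKS = 15
MINUTES_PER_CLASS_HOUR = 45   # assumption inferred from 60 class hours = 45 clock hours

class_hours = WEEKLY_CLASS_HOURS * WEEKS                     # class hours per semester
contact_hours = class_hours * MINUTES_PER_CLASS_HOUR / 60    # converted to clock hours
total_hours = ECTS_CREDITS * HOURS_PER_ECTS                  # total student study time
self_study_hours = total_hours - contact_hours               # preparation and follow-up

print(class_hours, contact_hours, total_hours, self_study_hours)
# prints: 60 45.0 180 135.0
```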
Recommended prerequisites (modules):
None.
Recommended as prerequisite for:
PIM-DL Deep Learning


[updated 15.11.2021]
Module coordinator:
Prof. Dr. Klaus Berberich
Lecturer:
Prof. Dr. Klaus Berberich


[updated 27.10.2016]
Learning outcomes:
After successfully completing this module, students will be able to use suitable data analysis methods to gain knowledge that supports decision-making on practical questions. Students will be familiar with important data analysis methods and with different types of features (e.g. nominal, ordinal, metric), and will be able to preprocess data appropriately (e.g. by normalization or standardization). They will be able to select an appropriate method (e.g. regression or classification) for a specific problem, and to implement the methods they have learned in a suitable programming language (e.g. Python or R) or use an available implementation. Students will be able to systematically determine the parameters of the applied methods on the basis of available data and critically assess the quality of their results. They will be able to present the knowledge gained from the data appropriately (e.g. as a visualization) so that it is understandable to both technical and non-technical audiences (e.g. decision-makers in a company).
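As an illustration of the preprocessing step mentioned in the learning outcomes, the sketch below shows one common reading of the two terms: min-max normalization to the range [0, 1] and z-score standardization. The module description does not say which definitions the course uses, so both functions and the sample data are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: no spread to scale
    return [(v - mu) / sigma for v in values]


# Hypothetical metric feature, e.g. body heights in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

print(min_max_normalize(heights_cm))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardize(heights_cm))
```

Both transformations only make sense for metric features; nominal and ordinal features would need a different encoding first.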

[updated 24.02.2018]
Module content:
1. Introduction
 
2. Regression
2.1 Linear regression
2.2 Feature transformation
2.3 Regularization
 
3. Classification
3.1 Logistic regression
3.2 Decision trees
3.3 Naive Bayes
3.4 Support vector machines
 
4. Cluster analysis
4.1 Representative-based methods (k-Means and k-Medoids)
4.2 Hierarchical methods
4.3 Density-based methods
 
5. Neural networks
5.1 Perceptron
5.2 Multi-layer neural networks (MLPs)
5.3 Convolutional neural networks (CNNs)
5.4 Recurrent neural networks (RNNs)
 
6. Association rule learning
6.1 Finding frequent item sets (Apriori and FP-Growth)
6.2 Determining association rules
6.3 Finding frequent sequences (GSP and PrefixSpan)
6.4 Finding frequent strings
6.5 Finding frequent subgraphs
 
7. Data visualization


[updated 24.02.2018]
Teaching methods/Media:
Slides, practical and theoretical exercises

[updated 24.02.2018]
Recommended or required reading:
Aggarwal C.: Data Mining - The Textbook, Springer, 2015
 
Harrington P.: Machine Learning in Action, Manning, 2012
 
Kelleher J., Mac Namee B. and D'Arcy A.: Fundamentals of Machine Learning for Predictive Data Analytics, MIT Press, 2015
 
Provost F. and Fawcett T.: Data Science for Business, O'Reilly, 2013
 
Raschka S.: Machine Learning mit Python, mitp, 2017
 
Zaki M. J. and Meira W. Jr.: Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014

[updated 24.02.2018]
Module offered in:
WS 2023/24, WS 2022/23, WS 2021/22, WS 2020/21, WS 2019/20, ...