NWI-IBI008
Data Mining
Course module: NWI-IBI008
Credits (ECTS): 6
Category: BA (Bachelor)
Language of instruction: English
Offered by: Radboud University; Faculty of Science; Informatica en Informatiekunde
Lecturer(s)
Lecturer
dr. ir. T. Claassen
Examiner
prof. dr. T.M. Heskes
Lecturer
prof. dr. T.M. Heskes
Coordinator
prof. dr. T.M. Heskes
Contact person for the course
prof. dr. T.M. Heskes
Academic year: 2018
Period
KW1-KW2  (03/09/2018 to 27/01/2019)
Starting block
KW1
Course mode
full-time
Remarks: -
Registration using OSIRIS: Yes
Course open to students from other faculties: Yes
Pre-registration: No
Waiting list: No
Placement procedure: -
Aims
At the end of the course you will be able to
  • reason and argue which data mining algorithm is applicable to which task;
  • apply, analyze, and implement various data mining algorithms;
  • evaluate the quality of data mining solutions.
Content

How can we build systems that can learn? More specifically: how can we extract relevant, interesting information from big data? You will learn that various algorithms exist and that the right choice depends on the task at hand and the properties of the available data. In the project, you will implement and/or test such algorithms on existing data.

Topics
We will treat various tasks with corresponding algorithms:

• explorative data analysis (histograms, boxplots, principal component analysis);
• descriptive models (clustering, association analysis, probabilistic models);
• classification (decision trees, naive Bayes classifiers, nearest neighbor classifiers, neural networks).

Along the way, we will discuss various basic principles such as distance measures, (Bayesian) probability theory, cross-validation, and bootstrapping.

Test information
Grading is based on a midterm exam (35%), an endterm exam (35%), and a project (30%). Homework assignments are mandatory, and a sufficient grade is needed to pass the course. A single resit exam replaces both the midterm and endterm exams and then counts for 70%.
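As an illustration of the weighting described above, the following minimal sketch computes a weighted grade from the stated percentages. The function name `course_grade` and its parameters are hypothetical, and the sketch makes assumptions the course description does not confirm: it applies no rounding, and it does not model the homework requirement or any bonus.

```python
def course_grade(project, midterm=None, endterm=None, resit=None):
    """Weighted grade under the stated 35% / 35% / 30% split.

    Assumption: no rounding, no homework check, no bonus; the course
    description does not specify how these are handled.
    """
    if resit is not None:
        # A single resit replaces both exams and counts for 70%.
        exam_part = 0.70 * resit
    else:
        if midterm is None or endterm is None:
            raise ValueError("need both midterm and endterm, or a resit grade")
        exam_part = 0.35 * midterm + 0.35 * endterm
    # The project always counts for the remaining 30%.
    return exam_part + 0.30 * project


if __name__ == "__main__":
    # Hypothetical grades, chosen only for illustration.
    print(round(course_grade(project=8.0, midterm=6.0, endterm=7.0), 2))
    print(round(course_grade(project=8.0, resit=6.0), 2))
```

With a resit, the midterm and endterm grades are ignored entirely and the project weight is unchanged.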

Prerequisites
You

• are up-to-date with elementary concepts from probability theory such as probabilities, probability distributions, and expectations;
• can apply these concepts for basic calculations;
• know and understand vectors and matrices;
• can add and multiply those.

This prior knowledge is treated in the courses Calculus en Kansrekening (Calculus and Probability Theory) and Matrixrekenen (Linear Algebra).

Recommended materials
Book
"Introduction to data mining", Tan, Steinbach and Kumar (highly recommended)

Instructional modes
Course occurrence

Lecture

Practical computer training

Self-study

General
Classroom lectures cover the theoretical concepts. In the exercise sessions you can work on the homework exercises, which mainly involve programming and applying data mining algorithms in Python. In the project, you will gain further practical experience in applying data mining algorithms to actual data.

Tests
Midterm
Test weight: 0
Test type: Exam
Opportunities: Block KW1

Final exam
Test weight: 0
Test type: Exam
Opportunities: Block KW2, Block KW3

Project
Test weight: 0
Test type: Project
Opportunities: Block KW2, Block KW3

Homework
Test weight: 0
Test type: Assignment
Opportunities: Block KW2, Block KW3

Course grade including bonus
Test weight: 1
Opportunities: Block KW2, Block KW3