NWI-IBI008
Data Mining
Course module: NWI-IBI008
Credits (ECTS): 6
Category: BA (Bachelor)
Language of instruction: English
Offered by: Radboud University; Faculty of Science; Informatica en Informatiekunde
Lecturer(s)
Lecturer: dr. ir. T. Claassen
Examiner: prof. dr. T.M. Heskes
Lecturer: prof. dr. T.M. Heskes
Coordinator: prof. dr. T.M. Heskes
Contact person for the course: prof. dr. T.M. Heskes
Academic year: 2019
Period: KW1-KW2 (02/09/2019 to 02/02/2020)
Starting block: KW1
Course mode: full-time
Remarks: -
Registration using OSIRIS: Yes
Course open to students from other faculties: Yes
Pre-registration: No
Waiting list: No
Placement procedure: -
Aims
At the end of the course you will be able to
  • reason and argue which data mining algorithm is applicable to which task;
  • apply, analyze, and implement various data mining algorithms;
  • evaluate the quality of data mining solutions.
Content
How can we build systems that can learn? More specifically: how can we extract relevant, interesting information from (big) data? You learn that there are various algorithms, depending on the task at hand and properties of the available data. In the project, you will implement and/or test such algorithms on existing data.
Level

Presumed foreknowledge
You
  • are familiar with elementary concepts from probability theory such as probabilities, probability distributions, and expected values;
  • can perform simple calculations with them;
  • know what vectors and matrices are;
  • can add and multiply them.
This prior knowledge is covered in the courses Calculus and Probability Theory and Matrix Calculation.
Test information
The assessment is based on a midterm exam (35%), an endterm exam (35%), and a project (30%). Homework assignments are mandatory and must be completed with a sufficient grade. A resit exam replaces both the midterm and the endterm exam and then counts for 70%.
    Specifics

    Topics
    We will treat various tasks with corresponding algorithms:

    • explorative data analysis (histograms, boxplots, principal component analysis);
    • descriptive models (clustering, association analysis, probabilistic models);
    • classification (decision trees, naive Bayes classifiers, nearest neighbor classifiers, neural networks).

    Along the way, we will discuss various basic principles such as distance measures, (Bayesian) probability theory, cross-validation, and bootstrapping.

    Test information
    Grading is based on a midterm exam (35%), an endterm exam (35%), and a project (30%). Homework assignments are mandatory and a sufficient grade is needed to pass the course. A single resit exam replaces both the midterm and endterm exams and then counts for 70%.
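
    As an illustration of the weighting above, a minimal Python sketch follows. The function name `final_grade`, its parameters, and the example grades are hypothetical and not part of the course guide. Rounding rules and the homework requirement are not modeled, since the guide does not specify them.

```python
def final_grade(project, midterm=None, endterm=None, resit=None):
    """Weighted grade using the stated weights (illustrative sketch only)."""
    if resit is not None:
        # The resit replaces both exams and counts for 70%.
        exam_part = 0.70 * resit
    else:
        if midterm is None or endterm is None:
            raise ValueError("need both midterm and endterm, or a resit")
        # Midterm and endterm each count for 35%.
        exam_part = 0.35 * midterm + 0.35 * endterm
    # The project counts for the remaining 30%.
    return exam_part + 0.30 * project


# Hypothetical example grades, chosen only to show the arithmetic.
print(round(final_grade(project=8.0, midterm=6.0, endterm=7.0), 2))
print(round(final_grade(project=8.0, resit=6.5), 2))
```

    With a resit, the single exam grade simply takes over the combined 70% share of the two regular exams; the project share is unchanged.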

    Prerequisites
    You

    • are familiar with elementary concepts from probability theory such as probabilities, probability distributions, and expectations;
    • can apply these concepts for basic calculations;
    • know and understand vectors and matrices;
    • can add and multiply those.

    This prior knowledge is covered in the courses Calculus and Probability Theory and Matrix Calculation.

    Recommended materials
    Book
    The course was originally based on the first edition of the book (which can also be found online as a PDF), but the course is moving more and more in the direction of the second edition. It is not yet clear whether the second edition will be available in time.
    Title: Introduction to Data Mining
    Author: Tan, Steinbach, (Karpatne,) and Kumar
    Publisher: Pearson
    Edition: 2

    Instructional modes
    Course occurrence

    Lecture

    Practical computer training

    Self-study

    General
    We will have classroom lectures for the theoretical concepts. In the exercise sessions you can work on the homework exercises, mainly on programming and applying data mining algorithms in Python. In the project, you will gain further practical experience in applying data mining algorithms to actual data.

    Tests
    Final grade
    Test weight: 1
    Opportunities: Block KW2, Block KW3

    Digital Midterm
    Test weight: 0
    Test type: Digital exam with CIRRUS
    Opportunities: Block KW1

    Resit
    Test weight: 0
    Test type: Exam
    Opportunities: Block KW2, Block KW3