RSS03.18 Introduction to Data Science with R and RStudio for the Social Sciences
Whether you are a social scientist, a business analyst or a data journalist, analysing data is key to a better understanding of the world around us. Be it political discussions online, the diffusion of news events, or the predictive power of certain indicators, systematic analysis improves business conduct, news analysis and reporting, and our understanding of human behaviour in general.
To understand quantitative data we need high-quality tools that are, if possible, open source and free of charge on Windows, macOS and Linux. R and RStudio provide software for descriptive, predictive and causal analysis, and are increasingly adopted as Data Science tools in academia and business. R also provides numerous tools for publication-ready visualization and for publishing research results. As such, R and RStudio are very versatile.
The main topics discussed in the course are:
- Importing and reading data: how to read data from different sources, such as from disk, the web or online repositories (e.g., Eurostat), as well as how to import data from other statistical software packages (e.g., SAS, SPSS).
- Cleaning data: raw data always need to be checked for consistency, errors, and missing or out-of-range values. The course will provide you with techniques to clean the data.
- The course will show you how to perform basic descriptive analyses, such as measures of central tendency and dispersion (means, standard deviations) and associations between variables (e.g., correlations), aided by visualizations such as bar charts, histograms, scatterplots and line graphs.
- In sociology and psychology, scale analyses are very common. The course will show you how to conduct Likert and Mokken scale analyses, as well as multivariate analyses for data reduction and for uncovering dimensional structures (e.g., factor analysis, component analysis, correspondence analysis, Mokken analysis).
- For predictive and explanatory analyses, the course focuses on how to perform regression analysis, how to assess and compare the fit of models, and how to test the assumptions of regression analysis.
- Some data science projects involve many scripts and data files. The course will show you how to manage your projects easily and make them portable, either for collaboration or for archiving purposes (for instance, on GitHub).
Dates | 3 July 2023 - 7 July 2023 |
Course Fee | Regular: €600; Early Bird: €540 (application deadline* April 1st) |
Scholarships and discounts | Find more information here |
Application deadline | May 1st *Your application is only completed when the course fee has been paid |
Course leader | Maurice Vergeer |
Level of participant | |
Admission requirements | Participants need to have a basic understanding of statistics and how data generally are organized. |
Admission documents | |
Mode of Study | On Campus |
ECTS | 2 |
Location | To be determined |