Participants are expected to have knowledge of basic statistics, e.g. hypothesis testing, correlation and linear regression, and experience using R and RStudio.
Each day consists of lectures and practicals using R. Where possible, datasets submitted in advance by course participants will be used as examples in the lectures and practicals.
Day 1: Data pre-treatment, PCA, short review of multiple linear regression and PCR.
Discussion of different data pre-treatment methods (e.g. centering, autoscaling, Pareto scaling, range scaling and log transformation), data exploration using Principal Component Analysis (PCA), and regression on the principal components from PCA in Principal Component Regression (PCR).
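The pre-treatment methods listed above can be illustrated with a minimal sketch. The course practicals use R; the example below uses plain Python (standard library only) purely to make the arithmetic explicit. The function names and the example column are hypothetical, and the definitions follow common conventions (sample standard deviation, mean-centering before scaling), which may differ in detail from those used in the course.

```python
import math
import statistics


def center(x):
    """Mean-centering: subtract the column mean from every value."""
    m = statistics.mean(x)
    return [v - m for v in x]


def autoscale(x):
    """Autoscaling: center, then divide by the sample standard deviation."""
    s = statistics.stdev(x)
    return [v / s for v in center(x)]


def pareto_scale(x):
    """Pareto scaling: center, then divide by the square root of the standard deviation."""
    s = statistics.stdev(x)
    return [v / math.sqrt(s) for v in center(x)]


def range_scale(x):
    """Range scaling: center, then divide by (max - min)."""
    r = max(x) - min(x)
    return [v / r for v in center(x)]


def log_transform(x):
    """Base-10 log transformation; assumes all values are strictly positive."""
    return [math.log10(v) for v in x]


# Hypothetical example column (one variable measured on five samples).
column = [1.0, 2.0, 3.0, 4.0, 5.0]
centered = center(column)
autoscaled = autoscale(column)
pareto = pareto_scale(column)
ranged = range_scale(column)
logged = log_transform(column)
```

After autoscaling every variable has unit standard deviation, whereas Pareto scaling only partly reduces the differences in spread between variables.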
Day 2: Advanced regression techniques and model validation.
Discussion of Partial Least Squares (PLS), a technique similar to PCR that also uses the response variable when constructing its components, and of regularized regression (e.g. ridge and lasso), together with ways of assessing model accuracy.
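As a rough illustration of regularization and of one way to assess model accuracy, the sketch below fits a ridge regression with a single predictor and estimates its prediction error by leave-one-out cross-validation. It is written in plain Python for transparency (the practicals use R); the function names, the penalty values and the data are hypothetical, and the intercept is left unpenalized, which is a common convention but an assumption here.

```python
import math
import statistics


def ridge_slope(x, y, lam):
    """Ridge slope for one predictor: sum(xc * yc) / (sum(xc^2) + lam),
    where xc and yc are the centered x and y. lam = 0 gives ordinary least squares."""
    xm, ym = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    sxx = sum((a - xm) ** 2 for a in x)
    return sxy / (sxx + lam)


def loo_rmse(x, y, lam):
    """Leave-one-out cross-validated root mean squared error of prediction."""
    squared_errors = []
    for i in range(len(x)):
        # Fit on all samples except sample i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope = ridge_slope(x_train, y_train, lam)
        intercept = statistics.mean(y_train) - slope * statistics.mean(x_train)
        # Predict the held-out sample and record the squared error.
        squared_errors.append((y[i] - (intercept + slope * x[i])) ** 2)
    return math.sqrt(statistics.mean(squared_errors))


# Hypothetical data: roughly y = 2x with some noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols_slope = ridge_slope(x, y, 0.0)      # no penalty
shrunk_slope = ridge_slope(x, y, 10.0)  # penalty shrinks the slope toward zero
cv_error = loo_rmse(x, y, 0.0)
```

In practice the penalty would be chosen by comparing the cross-validated error over a grid of values rather than fixed in advance.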
Day 3: Clustering and classification: k-means, hierarchical clustering.
Discussion of cluster analysis: choice of similarity measure, agglomerative and divisive methods, k-means and hierarchical clustering.
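To make the k-means idea concrete, here is a minimal sketch of the standard iterative procedure (assign each point to its nearest centroid, then recompute the centroids) using Euclidean distance as the dissimilarity measure. It is written in plain Python for illustration only (the practicals use R). The function names and the data are hypothetical, and initializing the centroids with the first k points is a simplifying assumption; practical implementations typically use several random starts.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, n_iter=100):
    """Basic k-means. Returns (labels, centroids)."""
    # Simplifying assumption: the first k points serve as starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, k=2)
```

Replacing `euclidean` with another distance function changes which points are considered similar, which is why the choice of similarity measure matters.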
Date & duration:
10, 11 and 12 June 2020
The study load of this course is 1.5 ECTS credits.
Details will be announced later.
Costs will be announced in due time.
Registration will be open in due time.