Statistical learning: from parametric to nonparametric models

Emilie Devijver and Sana Louhichi


This course covers mathematical and statistical methods that are widely used in supervised learning.

It consists of two parts.

In the first part, we will focus on parametric modeling. Starting from classical linear regression, we will describe several families of estimators suited to high-dimensional data, where the classical least squares estimator breaks down. Particular attention will be paid to model selection and model assessment.
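As an illustration of why penalized estimators are needed in high dimension (a Python sketch for exposition only; the course's practical sessions use R), consider a setting with more predictors than observations, where ordinary least squares is ill-posed but the Lasso and Ridge estimators remain well-defined:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 50, 200  # high-dimensional: more predictors than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0  # sparse true coefficient vector
y = X @ beta + 0.1 * rng.standard_normal(n)

# Ordinary least squares has no unique solution when p > n;
# the penalized estimators below are still well-defined.
lasso = Lasso(alpha=0.1).fit(X, y)   # l1 penalty: sets most coefficients exactly to zero
ridge = Ridge(alpha=1.0).fit(X, y)   # l2 penalty: shrinks coefficients without sparsity
```

The contrast between the two penalties is visible in the fitted coefficients: the Lasso produces a sparse vector, while Ridge keeps all coordinates nonzero but shrunken.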

In the second part, we shall focus on nonparametric methods. We will present several tools and ingredients for predicting the future value of a variable. We shall focus on methods for nonparametric regression, moving from independent to correlated training datasets. We shall also study methods for avoiding overfitting in supervised learning.
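A minimal sketch of kernel nonparametric regression (in Python for illustration; the course's practical sessions use R) is the Nadaraya-Watson estimator with a Gaussian kernel, where the bandwidth h plays the role of the smoothing parameter discussed in the course:

```python
import numpy as np

def nadaraya_watson(x0, x, y, h):
    """Nadaraya-Watson estimate of the regression function at x0.

    Weights each observation by a Gaussian kernel centered at x0;
    h is the smoothing (bandwidth) parameter.
    """
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
y = np.sin(x) + 0.2 * rng.standard_normal(200)  # noisy observations of sin

# Estimate the regression function at pi/2, where sin(pi/2) = 1
est = nadaraya_watson(np.pi / 2, x, y, h=0.3)
```

Choosing h too small reproduces the noise (overfitting), while choosing it too large oversmooths the trend; data-driven bandwidth selection is one of the topics of the course.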

This course will be followed by practical sessions with the R software.

Course Outline

Introduction. Penalized linear methods for regression and classification. Nonlinear methods for regression. Cross-validation.
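The cross-validation step of the outline can be sketched as follows (a Python illustration under assumed simulated data; the practical sessions use R): K-fold cross-validation chooses the Lasso penalty level from the data itself.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p = 100, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 1.5  # three informative predictors
y = X @ beta + 0.5 * rng.standard_normal(n)

# 5-fold cross-validation: for each candidate penalty, fit on 4 folds,
# evaluate on the held-out fold, and keep the penalty minimizing the
# averaged prediction error.
model = LassoCV(cv=5).fit(X, y)
```

The selected penalty is available as `model.alpha_`, and the corresponding coefficient estimate as `model.coef_`.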


Prerequisites: basic probability, statistical inference, linear models.


Keywords: High dimension, Lasso, Ridge, Information criteria, Mallows' criterion, Cross-validation, Nonparametric trend estimation, Kernel nonparametric models, Smoothing parameter selection, Average squared error, Mean average squared error, Generalized cross-validation, Dependent random variables, Martingale difference sequences, Stochastic volatility, Moment inequalities, Maximal inequalities, Supervised classification.