Advanced Statistical Methods

This course introduces the main notions, approaches, and methods of nonparametric statistics. The main topics include smoothing and regularization, model selection and parameter tuning, structural inference, efficiency and rate efficiency, local and sieve parametric approaches. The study is mainly limited to regression and density models. The topics of this course form an essential basis for working with complex data structures using modern statistical tools.

Instructor: Vladimir Spokoiny

Instructor assistant: Nikita Puchkin

Course page in Canvas: https://skoltech.instructure.com/courses/2337

 

Venue: Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30, bld. 1, Moscow, Russia, 121205

Room B2-3006

Schedule:

Tuesdays (February 11, February 25, March 3, March 24, March 31), 9:00 – 11:45

Thursdays (February 13, February 27, March 5, March 26, April 2), 9:00 – 11:45

 

All lecture materials can be found in the script (upd. 9.03).

If you have any questions about the course, please write to asmcourse2018@gmail.com.

 

Questions for the exam

Questions for the exam are available here (upd. 3.04; the numbering was slightly changed). The exam will be held on April 4 in an online format. You will be asked to prepare some questions from the list. Those of you who did not complete a project, or who decline the project grade, must answer 3 questions from the list (1 question from each group). Students who passed the project must answer only 1 question, from the third group.

Questions to prepare: link; exam schedule: link.

 

Lecture notes

Lecture notes 24.03: link.

Lecture notes 26.03: link; seminar 26.03: link.

Lecture notes 31.03: link; seminar 31.03: link.

Lecture notes 2.04: link; seminar 2.04: link.

Tentative syllabus

Lecture 1
11.02 Tu
09:00 – 10:30 Lecture
MLE in linear models
– MLE approach and quadraticity
– Fisher and Wilks Theorem
– Estimation and prediction, bias-variance decomposition

10:45 – 11:45 Seminar
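The themes of this first lecture can be previewed with a small numerical sketch (a toy simulation with invented numbers, not part of the course materials): in the Gaussian linear model the MLE coincides with the least-squares estimator, and, since that estimator is unbiased, its quadratic risk reduces to the variance term of the bias-variance decomposition.

```python
import numpy as np

# Toy Gaussian linear model Y = X @ theta + sigma * eps (all values invented).
rng = np.random.default_rng(0)
n, p, sigma = 200, 3, 0.3
X = rng.standard_normal((n, p))
theta = np.array([1.0, -2.0, 0.5])
Y = X @ theta + sigma * rng.standard_normal(n)

# Under Gaussian noise the MLE is the least-squares estimator
# theta_hat = (X^T X)^{-1} X^T Y.
theta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# theta_hat is unbiased, so the bias-variance decomposition of its
# quadratic risk reduces to the variance term sigma^2 * tr((X^T X)^{-1}).
risk = sigma**2 * np.trace(np.linalg.inv(X.T @ X))
```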

Lecture 2
13.02 Th
09:00 – 10:30 Lecture
Penalized MLE for linear models
– penalized MLE for quadratic penalty
– quadraticity, generalized Fisher and Wilks Theorem
– effective dimension and quadratic risk
– impact of the penalty

10:45 – 11:45 Seminar
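A minimal sketch of the penalized MLE with a quadratic penalty, which in the linear model reduces to ridge regression; the penalty matrix G = lam * I and all data below are invented for illustration. The effective dimension p(G) replaces the raw dimension p in the quadratic risk and shrinks as the penalty grows.

```python
import numpy as np

# Penalized MLE with quadratic penalty ||G^{1/2} theta||^2 / 2, here G = lam * I
# (toy data; this is ridge regression in the linear model).
rng = np.random.default_rng(1)
n, p, lam = 100, 10, 5.0
X = rng.standard_normal((n, p))
theta = np.r_[2.0, -1.0, 0.5, np.zeros(p - 3)]
Y = X @ theta + 0.5 * rng.standard_normal(n)

A = X.T @ X + lam * np.eye(p)
theta_pen = np.linalg.solve(A, X.T @ Y)      # penalized MLE

# Effective dimension p(G) = tr( X^T X (X^T X + G)^{-1} ):
# it lies strictly between 0 and p and decreases as the penalty grows.
p_eff = np.trace(np.linalg.solve(A, X.T @ X))
```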

Lecture 3
25.02 Tu
09:00 – 10:30 Lecture
Penalty choice
– oracle choice
– unbiased risk estimation and model selection by SURE
– penalized model selection
– cross-validation

10:45 – 11:45 Seminar
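The idea of unbiased risk estimation can be sketched for a ridge-type linear smoother: for Y_hat = S_lam @ Y with known noise level sigma, SURE gives an unbiased estimate of the risk, and the penalty is chosen to minimize it over a grid. All data and the grid below are invented for the sketch.

```python
import numpy as np

# Penalty choice by unbiased risk estimation (SURE) for a ridge smoother
# Y_hat = S_lam @ Y with known noise level sigma (toy setup).
rng = np.random.default_rng(2)
n, p, sigma = 120, 15, 1.0
X = rng.standard_normal((n, p))
theta = np.r_[np.ones(5), np.zeros(p - 5)]
Y = X @ theta + sigma * rng.standard_normal(n)

def sure(lam):
    # SURE(lam) = ||Y - S Y||^2 + 2 sigma^2 tr(S) - n sigma^2
    S = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = Y - S @ Y
    return resid @ resid + 2 * sigma**2 * np.trace(S) - n * sigma**2

grid = np.logspace(-2, 3, 30)
lam_star = grid[np.argmin([sure(l) for l in grid])]
```

Cross-validation would replace the analytic correction term 2 sigma^2 tr(S) with held-out prediction error, which avoids the need to know sigma.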

Lecture 4
27.02 Th
09:00 – 10:30 Lecture
Bayes approach
– linear models and Gaussian prior
– posterior and penalized MLE
– effective dimension and posterior concentration
– posterior contraction
– nonparametric Bayes, Bayesian credible sets

10:45 – 11:45 Seminar
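The link between the Bayes approach and the penalized MLE can be sketched directly (toy data, invented numbers): with a Gaussian prior theta ~ N(0, tau^2 I) in the Gaussian linear model, the posterior is Gaussian and its mean equals the penalized MLE for the quadratic penalty G = (sigma^2 / tau^2) I.

```python
import numpy as np

# Gaussian linear model with Gaussian prior theta ~ N(0, tau^2 I).
rng = np.random.default_rng(3)
n, p, sigma, tau = 80, 4, 0.5, 2.0
X = rng.standard_normal((n, p))
Y = X @ np.array([1.0, 0.0, -1.0, 0.5]) + sigma * rng.standard_normal(n)

# Posterior is N(post_mean, post_cov); the mean coincides with the
# penalized MLE for the quadratic penalty G = (sigma^2 / tau^2) I.
G = (sigma**2 / tau**2) * np.eye(p)
post_mean = np.linalg.solve(X.T @ X + G, X.T @ Y)
post_cov = sigma**2 * np.linalg.inv(X.T @ X + G)
```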

Lecture 5
03.03 Tu
09:00 – 10:30 Lecture
Bayesian model selection
– empirical and full Bayes approach

10:45 – 11:45 Seminar

Lecture 6
05.03 Th
09:00 – 10:30 Lecture
Non-linear regression with additive noise and calming
– examples: non-linear regression and DNN, inverse problems, error-in-operator, error-in-variables
– calming approach
– properties of the penalized MLE
– Fisher and Wilks Theorem

10:45 – 11:45 Seminar

Lecture 7
24.03 Tu
09:00 – 10:30 Lecture
Generalized linear regression
– Examples: binary classification, Poisson regression, log-density estimation
– penalized MLE: Fisher and Wilks Theorem

10:45 – 11:45 Seminar
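One of the named examples, binary classification via logistic regression, admits a compact numerical sketch: the MLE in this generalized linear model has no closed form, but a few Newton (Fisher scoring) steps compute it. All data below are simulated for illustration.

```python
import numpy as np

# MLE in a generalized linear model: logistic regression for binary
# classification, computed by Newton / Fisher scoring (toy data).
rng = np.random.default_rng(6)
n, p = 300, 2
X = rng.standard_normal((n, p))
theta_true = np.array([1.5, -1.0])
prob = 1.0 / (1.0 + np.exp(-X @ theta_true))
y = (rng.random(n) < prob).astype(float)

theta = np.zeros(p)
for _ in range(20):                      # Newton iterations
    mu = 1.0 / (1.0 + np.exp(-X @ theta))
    W = mu * (1.0 - mu)                  # Fisher weights
    H = X.T @ (W[:, None] * X)           # Fisher information matrix
    grad = X.T @ (y - mu)                # score
    theta = theta + np.linalg.solve(H, grad)
```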

Lecture 8
26.03 Th
09:00 – 10:30 Lecture
Generalized linear regression:
– Bernstein–von Mises Theorem and elliptic credible sets
– Bayesian model selection: empirical and full Bayes approach

10:45 – 11:45 Seminar

Lecture 9
31.03 Tu
09:00 – 10:30 Lecture
Complexity penalization
– Akaike criterion
– subset/feature selection with complexity penalty

10:45 – 11:45 Seminar
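Complexity penalization can be sketched for nested models (a toy simulation, with an AIC-type criterion for known noise variance): each candidate model is scored by its residual sum of squares plus a penalty proportional to its dimension, and the minimizer is selected.

```python
import numpy as np

# Model selection by complexity penalization: AIC-type criterion
# RSS(m) + 2 sigma^2 dim(m) over nested models m = 1..10 (toy data,
# true model dimension is 3).
rng = np.random.default_rng(5)
n, sigma = 150, 1.0
X = rng.standard_normal((n, 10))
Y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + sigma * rng.standard_normal(n)

def penalized_score(m):
    Xm = X[:, :m]
    theta_m = np.linalg.lstsq(Xm, Y, rcond=None)[0]
    rss = np.sum((Y - Xm @ theta_m) ** 2)
    return rss + 2 * sigma**2 * m        # complexity penalty

m_hat = min(range(1, 11), key=penalized_score)
```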

Lecture 10
02.04 Th
09:00 – 10:30 Lecture
Sparse penalty
– sparse penalty and sparse estimation
– LASSO and Dantzig selector
– RIP condition and risk bounds

10:45 – 11:45 Seminar
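The sparse-penalty idea has a clean closed form in the special case of an orthonormal design, where the LASSO solution is coordinate-wise soft thresholding of the least-squares estimate; the design and thresholds below are invented for the sketch.

```python
import numpy as np

# LASSO in the orthonormal-design special case (X^T X = I): the solution
# is coordinate-wise soft thresholding of the OLS estimate (toy data).
def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(4)
n, p = 64, 8
X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # orthonormal columns
theta = np.r_[3.0, -2.0, np.zeros(p - 2)]
Y = X @ theta + 0.1 * rng.standard_normal(n)

lam = 0.5
ols = X.T @ Y                        # OLS, since X^T X = I
theta_lasso = soft_threshold(ols, lam)
# The sparse penalty zeroes out the small coefficients, giving a sparse fit.
```

For a general design the solution is no longer explicit, and the RIP-type conditions mentioned above control when the LASSO or Dantzig selector still recovers the sparse vector.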

 

Projects

You all have the option to complete a project. The detailed project description and assessment criteria are given here (upd. 14.03; Steps 3 and 4 were revised to avoid confusion).

 

Homework 6 (deadline: April 2, Thursday, 9:00)

The list of problems is available here.

Format: upload a Word/LaTeX-based PDF or a PDF scan of a handwritten document to Canvas. Make sure that the scan is readable and can be checked by the TAs.

Homework 5 (deadline: March 5, Thursday, 9:00)

Complete the exercises 1.8.7, 6.4.1, 6.4.2, 6.4.3 from the script.

Format: upload a Word/LaTeX-based PDF or a PDF scan of a handwritten document to Canvas. Make sure that the scan is readable and can be checked by the TAs.

Homework 4 (deadline: March 3, Tuesday, 9:00)

Complete the exercises 5.2.2, 5.2.3, 5.4.1, 5.4.2 from the script.

In exercise 5.2.2, take the interval (0, 1) to check (5.6).

In exercise 5.2.3, take the interval (0, 1) to compute the usual scalar product, and take i from -n+1 to n to compute the scalar product for the equidistant design X_i = i/n.

Format: upload a Word/LaTeX-based PDF or a PDF scan of a handwritten document to Canvas. Make sure that the scan is readable and can be checked by the TAs.

Homework 3 (deadline: February 27, Thursday, 9:00)

Complete the exercises 4.2.1, 4.2.2, 4.3.3 and 4.3.5 from the script.

Format: upload a Word/LaTeX-based PDF or a PDF scan of a handwritten document to Canvas. Make sure that the scan is readable and can be checked by the TAs.

Homework 2 (deadline: February 25, Tuesday, 9:00)

Complete the exercises 1.4.1, 1.4.2, 1.6.1 and 1.6.3 from the script.

Format: upload a Word/LaTeX-based PDF or a PDF scan of a handwritten document to Canvas. Make sure that the scan is readable and can be checked by the TAs.

Homework 1 (deadline: February 13, Thursday, 9:00)

Complete the exercises 1.2.3, 1.3.1, 1.4.4 and 1.4.6 from the script.

Format: upload a Word/LaTeX-based PDF or a PDF scan of a handwritten document to Canvas. Make sure that the scan is readable and can be checked by the TAs.