MA8701 General Statistical Methods

Reading list with key concepts

Part 1: Regularized linear and generalized linear models (25%)

  • Hastie, Tibshirani, Wainwright: "Statistical Learning with Sparsity: The Lasso and Generalizations". The newest version of the ebook can be downloaded from Trevor Hastie's page: https://trevorhastie.github.io/: Chapters 2.1-2.6, 2.9, 3.1-3.2, 3.7, 4.1-4.3, 4.5-4.6, 5.1, 5.4, 6.0, 6.2
  • Dezeure, Bühlmann, Meinshausen (2015). "High-Dimensional Inference: Confidence Intervals, p-Values and R-Software hdi". Statistical Science, Vol. 30, No. 4, 533–558. DOI: 10.1214/15-STS527 (focus on the single/multiple sample splitting).

Key concepts:

  • Intro to lasso – Chapters 2.1-2.6, 5.1, 5.4: Linear regression, Why sparsity?, Least absolute shrinkage and selection operator (lasso) and related approaches, Fitting the model / coordinate descent for lasso
  • GLM with regularization – Chapters 2.9, 3.1-3.2, 3.7, 5.4: Generalized linear models (GLM), Logistic regression with l1 penalty (example), Fitting the model
  • Generalizations of lasso – Chapters 4.1-4.3, 4.5-4.6: Elastic net, Relaxed lasso, Grouped lasso, Fused lasso, Non-convex penalties
  • Inference for lasso – Chapters 6.0, 6.2 and Dezeure et al., 2015: Bootstrap method, Multi sample-splitting
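As a minimal illustration (not part of the reading list), the coordinate descent loop for the lasso can be sketched in NumPy. The update for each coordinate is the soft-thresholding operator applied to a partial-residual inner product, assuming the columns of X are standardized so that (1/n) * sum_i x_ij^2 = 1; the function names here are hypothetical.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for the lasso objective
    (1/(2n)) * ||y - X beta||^2 + lam * ||beta||_1,
    assuming centered y and standardized columns of X."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta  # current residual
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * beta[j]          # add back coordinate j's fit
            z = X[:, j] @ r / n                # univariate least squares coefficient
            beta[j] = soft_threshold(z, lam)   # shrink and possibly set to zero
            r = r - X[:, j] * beta[j]          # restore residual
    return beta
```

Large values of lam set all coefficients exactly to zero, which is the sparsity property the chapters motivate.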

Part 2: Smoothing and splines (25%)

Key concepts:

  • Chapter 5: Linear basis expansion, Natural cubic spline, Smoothing spline, Degrees of freedom
  • Chapter 6: Kernel smoother, Local linear (and polynomial) regression, Kernel density estimation and classification, Radial basis function, Mixture model
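As a small illustration of the Chapter 6 concepts (not part of the reading list), a Nadaraya-Watson kernel smoother and a local linear fit at a point x0 can be sketched with a Gaussian kernel; the function names are hypothetical.

```python
import numpy as np

def nadaraya_watson(x0, x, y, h):
    """Kernel smoother: weighted average of y with Gaussian weights
    K((x0 - x_i)/h), i.e. local constant regression at x0."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def local_linear(x0, x, y, h):
    """Local linear regression: weighted least squares fit of
    an intercept and slope around x0; returns the fit at x0."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    B = np.column_stack([np.ones_like(x), x - x0])
    W = np.diag(w)
    beta = np.linalg.solve(B.T @ W @ B, B.T @ W @ y)
    return beta[0]  # fitted value at x0
```

On exactly linear data the local linear fit is exact, which illustrates why it removes the boundary bias of the local constant (Nadaraya-Watson) smoother.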

Part 3: Experimental design in statistical learning (10%)

(to download you must be on the NTNU VPN)

Key concepts:

  • How to optimize hyperparameters
  • How to compare algorithms on the same dataset
  • How to compare algorithms on several datasets

Keywords:

Screening, Steepest ascent, CCD, BBD, Canonical analysis, Cross-validation, Paired t-test, One-way ANOVA, Wilcoxon signed rank test, Friedman test.
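As a small illustration of one of these keywords (not part of the reading list), the paired t statistic for comparing two algorithms on the same cross-validation folds can be computed directly; the function name is hypothetical.

```python
import numpy as np

def paired_t(a, b):
    """Paired t statistic for per-fold scores a_i, b_i of two algorithms
    on the same folds: t = mean(d) / (sd(d) / sqrt(n)), d_i = a_i - b_i,
    with the sample standard deviation (ddof=1)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    n = len(d)
    return d.mean() / (d.std(ddof=1) / np.sqrt(n))
```

The statistic is compared against a t distribution with n - 1 degrees of freedom; pairing by fold removes fold-to-fold variation, which is why the paired test is preferred over a two-sample test here.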

Part 4: Deep neural nets (30%)

You may choose either the R or the Python version; both build on Keras.

Key concepts:

  • Sequentially layered networks: architecture, activation functions, and a loss function matched to the problem (regression, two-class or multi-class classification)
  • Tensors, inner products and their use in neural nets
  • Backpropagation (the chain rule), minibatch stochastic gradient descent, choice of learning rate and variants
  • Regularization: weight decay, early stopping, drop-out
  • Recurrent neural networks: MORE
  • Convolutional neural networks: MORE
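As a minimal illustration of minibatch SGD and weight decay (not part of the reading list, and in NumPy rather than Keras), the update loop for a linear least squares model can be sketched as follows; the function name is hypothetical.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.1, batch_size=16, epochs=50, weight_decay=0.0, seed=0):
    """Minibatch SGD for linear least squares with optional weight decay (L2).
    Per-batch loss: (1/(2m)) * ||X_b w - y_b||^2 + (weight_decay/2) * ||w||^2."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        idx = rng.permutation(n)               # reshuffle each epoch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]  # current minibatch
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b) + weight_decay * w
            w -= lr * grad                     # gradient step
    return w
```

In a deep network the same loop applies, with the gradient computed by backpropagation through the layers instead of the closed-form expression above; weight decay shrinks the weights toward zero, which is its regularizing effect.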

Part 5: Active learning (10%)

Key concepts:

What is the problem (we do not have much data for which we know the response, i.e. little labelled data), how can this be handled (overview), and in particular what are (some) solutions within the field of active learning (know about some strategies).

1. Strategies to handle little labelled data:

  • transfer learning
  • data augmentation
  • active learning
  • semi-supervised learning
  • multitask learning
  • model constraining
  • one-shot learning

2. Active learning:

  • Pool based active learning
  • Stream based active learning
  • Sample selection strategies
  • Challenges with using an active learning strategy in practice
  • Case studies
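As a small illustration of sample selection strategies in pool-based active learning (not part of the reading list), two common uncertainty-sampling rules can be sketched over a pool of predicted class probabilities; the function names are hypothetical.

```python
import numpy as np

def least_confident_query(probs):
    """Pool-based uncertainty sampling: return the index of the pool point
    whose most probable class has the smallest predicted probability.
    probs: (n_pool, n_classes) array of predicted class probabilities."""
    return int(np.argmin(probs.max(axis=1)))

def margin_query(probs):
    """Margin sampling: return the index with the smallest gap between
    the two most probable classes (closest to the decision boundary)."""
    s = np.sort(probs, axis=1)
    return int(np.argmin(s[:, -1] - s[:, -2]))
```

The selected index is the point whose label the learner would request from the oracle; after labelling, the model is refit and the query step repeats.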

2019-04-09, Mette Langaas