MA8701 General Statistical Methods

Reading list with key concepts

Part 1: Regularized linear and generalized linear models (25%)

  • Hastie, Tibshirani, Wainwright: "Statistical Learning with Sparsity: The Lasso and Generalizations". The newest version of the ebook can be downloaded from Trevor Hastie's page: https://trevorhastie.github.io/. Chapters 2.1-2.6, 2.9, 3.1-3.2, 3.7, 4.1-4.3, 4.5-4.6, 5.1, 5.4, 6.0, 6.2
  • The single/multi sample-splitting part of Dezeure, Bühlmann, Meier, Meinshausen (2015). "High-Dimensional Inference: Confidence Intervals, p-Values and R-Software hdi". Statistical Science, Vol. 30, No. 4, pp. 533-558. DOI: 10.1214/15-STS527

Key concepts:

  • Intro to lasso – Chapters 2.1-2.6, 5.1, 5.4: Linear regression, Why sparsity?, Least absolute shrinkage and selection operator (lasso) and related approaches, Fitting the model / coordinate descent for lasso (see the sketch after this list)
  • GLM with regularization – Chapters 2.9, 3.1-3.2, 3.7, 5.4: Generalized linear models (GLM), Logistic regression with ℓ1-penalty (example), Fitting the model
  • Generalizations of lasso – Chapters 4.1-4.3, 4.5-4.6: Elastic net, Relaxed lasso, Group lasso, Fused lasso, Non-convex penalties
  • Inference for lasso – Chapters 6.0, 6.2 and Dezeure et al. (2015): Bootstrap method, Multi sample-splitting
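
To make the coordinate descent step concrete, here is a minimal Python sketch (not part of the reading list) of cyclic coordinate descent with soft-thresholding for the lasso; the simulated data, the penalty value and the iteration count are illustrative assumptions only.

```python
# A minimal sketch of cyclic coordinate descent for the lasso
# (standardized predictors assumed; not an optimized implementation).
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) ||y - X beta||^2 + lam ||beta||_1 by cycling
    over coordinates; assumes centered y and standardized columns of X."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual leaving out coordinate j
            r_j = y - X @ beta + X[:, j] * beta[j]
            # Univariate least-squares coefficient, then soft-threshold
            z_j = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z_j, lam) / (X[:, j] @ X[:, j] / n)
    return beta

# Toy data: two active predictors out of ten.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize columns
beta_true = np.array([3.0, -2.0] + [0.0] * 8)
y = X @ beta_true + rng.standard_normal(100)
y = y - y.mean()                            # center response (no intercept)
print(lasso_cd(X, y, lam=0.1).round(2))
```

The same coordinate-wise soft-thresholding update, wrapped in iteratively reweighted least squares, also underlies the fitting of ℓ1-penalized GLMs such as the logistic lasso.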

Part 2: Smoothing and splines (25%)

Key concepts:

  • Chapter 5: Linear basis expansion, Natural cubic spline, Smoothing spline, Degrees of freedom
  • Chapter 6: Kernel smoother, Local linear (and polynomial) regression, Kernel density estimation and classification, Radial basis function, Mixture model (see the kernel smoother sketch after this list)
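
As a concrete illustration of kernel smoothing (not part of the reading list), here is a minimal Python sketch of the Nadaraya-Watson kernel smoother with a Gaussian kernel on simulated data; the bandwidth value and the toy data are illustrative assumptions.

```python
# A minimal sketch of a Nadaraya-Watson kernel smoother with a Gaussian
# kernel; the bandwidth is an illustrative choice, not a recommendation.
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2/2) / sqrt(2*pi)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def kernel_smoother(x0, x, y, bandwidth):
    """Nadaraya-Watson estimate: kernel-weighted average of y at each x0."""
    w = gaussian_kernel((x0[:, None] - x[None, :]) / bandwidth)
    return (w @ y) / w.sum(axis=1)

# Toy data: noisy sine curve.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 2 * np.pi, 200))
y = np.sin(x) + 0.3 * rng.standard_normal(200)
x_grid = np.linspace(0, 2 * np.pi, 50)
y_hat = kernel_smoother(x_grid, x, y, bandwidth=0.4)
```

Local linear regression replaces the weighted average at each point by a weighted least-squares line, which reduces bias at the boundaries.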

Part 3: Experimental design in statistical learning (10%)

On reading list:

The following two articles are not strictly on the reading list, but they illustrate the theory well (to download them you must be on the NTNU VPN):

  • Gustavo A. Lujan-Moreno, Phillip R. Howard, Omar G. Rojas, Douglas Montgomery (2018). "Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study". Expert Systems with Applications, Vol. 109. https://doi.org/10.1016/j.eswa.2018.05.024
  • Ozan Irsoy, Olcay Taner Yildiz, Ethem Alpaydin (2012). "Design and Analysis of Classifier Learning Experiments in Bioinformatics: Survey and Case Studies". IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 9, No. 6. https://doi.org/10.1109/TCBB.2012.117

Key concepts

  • How to optimize hyperparameters
  • How to compare algorithms on the same dataset
  • How to compare algorithms on several datasets

Keywords:

Screening, steepest ascent, central composite design (CCD), Box-Behnken design (BBD), canonical analysis, cross-validation, paired t-test, one-way ANOVA, Wilcoxon signed-rank test, Friedman test (see the sketch below).
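
As a concrete illustration of the comparison tests (not part of the reading list), here is a minimal Python sketch assuming per-fold cross-validation accuracies are already available for each algorithm; all accuracy numbers are toy values, not real results.

```python
# Comparing classifiers from cross-validation results: paired t-test and
# Wilcoxon signed-rank test for two algorithms on one dataset, Friedman
# test for several algorithms across several datasets. Toy numbers only.
import numpy as np
from scipy import stats

# Accuracies of two algorithms on the same 10 CV folds (paired by fold).
acc_a = np.array([0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81])
acc_b = np.array([0.78, 0.77, 0.82, 0.79, 0.80, 0.76, 0.81, 0.83, 0.78, 0.79])

t_stat, t_p = stats.ttest_rel(acc_a, acc_b)    # paired t-test
w_stat, w_p = stats.wilcoxon(acc_a, acc_b)     # Wilcoxon signed-rank test
print(f"paired t: p={t_p:.3f}, Wilcoxon: p={w_p:.3f}")

# Three algorithms evaluated on five datasets (rows = datasets).
scores = np.array([[0.80, 0.75, 0.78],
                   [0.90, 0.88, 0.85],
                   [0.70, 0.72, 0.69],
                   [0.85, 0.80, 0.82],
                   [0.88, 0.86, 0.83]])
f_stat, f_p = stats.friedmanchisquare(*scores.T)  # one sample per algorithm
print(f"Friedman: p={f_p:.3f}")
```

The Friedman test is the nonparametric counterpart of one-way repeated-measures ANOVA and is the usual starting point when comparing several algorithms over several datasets.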

Part 4: Deep neural nets (30%)

All chapters except Chapter 8: Generative deep learning.

You can choose whether to read the R version (François Chollet and J. J. Allaire: "Deep Learning with R", Manning) or the Python version (François Chollet: "Deep Learning with Python", Manning); both are built on Keras.

Key concepts:

  • Sequentially layered networks: architecture, activation functions, loss function matched to the problem (regression, two or more classes in classification); see the Keras sketch after this list
  • Tensors, inner products and their use in neural nets
  • Backpropagation (via the chain rule), minibatch stochastic gradient descent and its variants, choice of learning rate
  • Regularization: weight decay, early stopping, dropout
  • Recurrent neural networks: embedding layers, simple RNN layer, LSTM layer, stacking recurrent layers
  • Convolutional neural networks: 2D convolutional layers, 2D max-pooling layer, local pattern property, translation-invariant property
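
As a concrete illustration of a sequentially layered network (not part of the reading list), here is a minimal Keras sketch in Python for two-class classification; the toy data, layer sizes, optimizer and number of epochs are illustrative assumptions only.

```python
# A minimal Keras sketch: a sequential, densely connected network for
# two-class classification on toy data. Hyperparameters are illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data: 1000 samples, 20 features, binary labels.
rng = np.random.default_rng(3)
x_train = rng.standard_normal((1000, 20)).astype("float32")
y_train = (x_train[:, 0] + x_train[:, 1] > 0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(16, activation="relu"),       # hidden layer
    layers.Dense(16, activation="relu"),       # hidden layer
    layers.Dense(1, activation="sigmoid"),     # output for two classes
])
# Loss matched to the problem: binary cross-entropy for two classes
# (categorical cross-entropy for >2 classes, MSE for regression).
model.compile(optimizer="rmsprop", loss="binary_crossentropy",
              metrics=["accuracy"])
history = model.fit(x_train, y_train, epochs=5, batch_size=32,
                    validation_split=0.2)
```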

Part 5: Active learning (10%)

Recommended supporting literature (not strictly on the reading list):

Key concepts:

  • Estimating high capacity models with little labeled data
  • Active learning
  • Uncertainty/query strategies in the context of active learning (see the sketch after this list)
  • Practical considerations for using active learning
  • Knowing which "little labeled data" strategy is suitable for which data scenario
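
As one concrete illustration (not part of the reading list), here is a minimal Python sketch of pool-based active learning with uncertainty sampling; the toy data (where the true labels play the role of the oracle), the model, the query size and the number of rounds are illustrative assumptions only.

```python
# Pool-based active learning with uncertainty sampling: fit a classifier
# on a small labeled set, then query the unlabeled points whose predicted
# class probabilities are closest to 0.5. Toy setup, illustrative choices.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 5))
y = (X[:, 0] - X[:, 1] > 0).astype(int)      # toy labels act as the oracle

labeled = list(rng.choice(500, size=20, replace=False))  # small labeled pool
unlabeled = [i for i in range(500) if i not in labeled]

for _ in range(5):                            # five query rounds
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[unlabeled])[:, 1]
    uncertainty = -np.abs(proba - 0.5)        # high = close to the boundary
    query = np.argsort(uncertainty)[-10:]     # 10 most uncertain points
    for idx in sorted(query, reverse=True):   # "ask the oracle" for labels
        labeled.append(unlabeled.pop(idx))
print(f"labeled set size after querying: {len(labeled)}")
```
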
2019-04-29, Mette Langaas