MA8701 General Statistical Methods
Reading list with key concepts
Part 1: Regularized linear and generalized linear models (25%)
- Hastie, Tibshirani, Wainwright: "Statistical Learning with Sparsity: The Lasso and Generalizations". The newest version of the ebook can be downloaded from Trevor Hastie's page, https://trevorhastie.github.io/. Chapters 2.1-2.6, 2.9, 3.1-3.2, 3.7, 4.1-4.3, 4.5-4.6, 5.1, 5.4, 6.0, 6.2
- Dezeure, Bühlmann, Meier and Meinshausen (2015). "High-Dimensional Inference: Confidence Intervals, p-Values and R-Software hdi". Statistical Science, Vol. 30, No. 4, 533–558. DOI: 10.1214/15-STS527. Focus on the single/multi sample-splitting part.
Key concepts:
- Intro to lasso – Chapters 2.1-2.6, 5.1, 5.4: Linear regression, Why sparsity?, Least absolute shrinkage and selection operator (lasso) and related approaches, Fitting the model / coordinate descent for lasso (see the sketch after this list)
- GLM with regularization – Chapters 2.9, 3.1-3.2, 3.7, 5.4: Generalized linear models (GLM), Logistic regression with an l1-penalty (example), Fitting the model
- Generalizations of lasso – Chapters 4.1-4.3, 4.5-4.6: Elastic net, Relaxed lasso, Grouped lasso, Fused lasso, Non-convex penalties
- Inference for lasso – Chapters 6.0, 6.2 and Dezeure et al. (2015): Bootstrap method, Multi sample-splitting
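To make "coordinate descent for lasso" concrete, here is a minimal numpy sketch (our own illustration, not the book's code; the names soft_threshold and lasso_cd are ours) of cyclic coordinate descent for the objective (1/(2n))||y - Xb||^2 + lam*||b||_1, assuming standardized columns of X and centered y:

```python
import numpy as np

def soft_threshold(rho, lam):
    # S(rho, lam) = sign(rho) * max(|rho| - lam, 0)
    return np.sign(rho) * np.maximum(np.abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    # Cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam*||b||_1,
    # assuming standardized columns of X and centered y.
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n       # (1/n) x_j' x_j for each j
    resid = y - X @ beta                    # full residual, kept in sync below
    for _ in range(n_iter):
        for j in range(p):
            # (1/n) x_j' r_(-j): partial residual with coordinate j added back
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta
```

The result can be checked against sklearn.linear_model.Lasso(alpha=lam), which minimizes the same objective.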
Part 2: Smoothing and splines (25%)
- Hastie, Tibshirani and Friedman (2008): The Elements of Statistical Learning. Chapter 5: 5.1-5.6 and Chapter 6: 6.1-6.8. Book at https://web.stanford.edu/~hastie/ElemStatLearn/
Key concepts:
- Chapter 5: Linear basis expansion, Natural cubic spline, Smoothing spline, Degrees of freedom
- Chapter 6: Kernel smoother (see the sketch after this list), Local linear (and polynomial) regression, Kernel density estimation and classification, Radial basis function, Mixture model
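As a concrete instance of the kernel smoother, a short numpy sketch (our own illustration with synthetic data; nadaraya_watson is our name) of the Nadaraya-Watson kernel-weighted average with a Gaussian kernel:

```python
import numpy as np

def nadaraya_watson(x0, x, y, bandwidth=0.5):
    # Gaussian kernel weights K((x0 - x_i)/h) for every query/training pair
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / bandwidth) ** 2)
    # kernel-weighted average of the responses at each query point
    return (w * y).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 2 * np.pi, 100))
y = np.sin(x) + rng.normal(scale=0.3, size=100)
x_grid = np.linspace(0, 2 * np.pi, 200)
y_hat = nadaraya_watson(x_grid, x, y, bandwidth=0.4)  # smooth estimate of sin
```

Local linear regression replaces this weighted average with a kernel-weighted least squares fit at each x0, which removes the boundary bias discussed in Chapter 6.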
Part 3: Experimental design in statistical learning (10%)
(to download the material below you must be on the NTNU VPN)
On the reading list:
- Note on response surface methods applied to a random forest example (here the BBD and CCD designs are presented)
The following two articles are strictly speaking not on the reading list, but they illustrate the theory well:
- Article: Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study (2018), Gustavo A. Lujan-Moreno, Phillip R. Howard, Omar G. Rojas, Douglas Montgomery, Expert Systems with Applications, Volume 109, https://doi.org/10.1016/j.eswa.2018.05.024
- Article: Design and Analysis of Classifier Learning Experiments in Bioinformatics: Survey and Case Studies (2012), Ozan Irsoy, Olcay Taner Yildiz, Ethem Alpaydin, IEEE/ACM Transactions on Computational Biology and Bioinformatics (Volume 9, Issue 6, Nov.-Dec. 2012), https://doi.org/10.1109/TCBB.2012.117
Key concepts:
- How to optimize hyperparameters
- How to compare algorithms on the same dataset (see the sketch after the keywords below)
- How to compare algorithms on several datasets
Keywords:
Screening, Steepest ascent, CCD, BBD, Canonical analysis, Cross-validation; Paired t-test, One-way ANOVA, Wilcoxon signed-rank test, Friedman test.
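For the "same dataset" comparison, the paired tests are applied to per-fold cross-validation scores of two algorithms evaluated on identical folds. A small scipy sketch (the fold accuracies below are hypothetical numbers for illustration):

```python
import numpy as np
from scipy import stats

# hypothetical per-fold CV accuracies of two classifiers on the same 10 folds
acc_a = np.array([0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.80, 0.85, 0.79])
acc_b = np.array([0.78, 0.77, 0.80, 0.79, 0.80, 0.76, 0.79, 0.78, 0.82, 0.77])

# paired t-test: same folds, so the fold-wise differences are the pairs
t_stat, p_t = stats.ttest_rel(acc_a, acc_b)

# Wilcoxon signed-rank test: nonparametric alternative with the same pairing
w_stat, p_w = stats.wilcoxon(acc_a, acc_b)

print(f"paired t: t = {t_stat:.2f}, p = {p_t:.3f}")
print(f"Wilcoxon: W = {w_stat:.1f}, p = {p_w:.3f}")
```

For comparing algorithms across several datasets, scipy.stats.friedmanchisquare implements the Friedman test in the same style.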
Part 4: Deep neural nets (30%)
All chapters except Chapter 8 (Generative deep learning) of one of the following books:
- François Chollet with J. J. Allaire (2018) Deep learning with R, https://www.manning.com/books/deep-learning-with-r
- François Chollet (2017) Deep learning with Python https://www.manning.com/books/deep-learning-with-python.
You choose whether to read the R or the Python version; both are built on Keras.
Key concepts:
- Sequentially layered networks: architecture, activation functions, and a loss function matched to the problem (regression, two-class or multi-class classification); see the Keras sketch after this list
- Tensors, inner products and their use in neural nets
- Backpropagation (via the chain rule), minibatch stochastic gradient descent, choice of learning rate and variants
- Regularization: weight decay, early stopping, dropout
- Recurrent neural networks: embedding layers, simple RNN layer, LSTM layer, stacking recurrent layers
- Convolutional neural networks: 2D convolutional layers, 2D max-pooling layer, local pattern property, translation-invariant property
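A minimal Keras sketch in Python (the R interface exposes the same API) tying the first concepts together: a sequential architecture with activation functions, dropout regularization and a loss matched to multi-class classification. The layer sizes and learning rate are arbitrary illustrative choices:

```python
from tensorflow import keras
from tensorflow.keras import layers

# small sequential network for 10-class classification of 28x28 inputs
model = keras.Sequential([
    keras.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),     # hidden layer with ReLU
    layers.Dropout(0.3),                      # dropout regularization
    layers.Dense(10, activation="softmax"),   # one probability per class
])

# loss matched to the problem: cross-entropy for multi-class labels
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

For regression the last layer would instead be a single unit with no activation and a loss such as "mse"; for two classes, one sigmoid unit with "binary_crossentropy".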
Part 5: Active learning (10%)
- Main article: Burr Settles, "Active Learning Literature Survey", http://burrsettles.com/pub/settles.activelearning.pdf
- Lecture slides: https://github.com/Froskekongen/MA8701/tree/master/lectures (references are not part of the curriculum)
Recommended supporting literature (strictly speaking not on the reading list):
- Realistic Evaluation of Deep Semi-Supervised Learning Algorithms: https://arxiv.org/abs/1804.09170
- Deep Bayesian Active Learning with Image Data: https://arxiv.org/abs/1703.02910
Key concepts:
- Estimating high-capacity models with little labeled data
- Active learning
- Uncertainty/query strategies in the context of Active Learning (see the sketch after this list)
- Practical considerations for using Active Learning
- Knowing which "little labeled data" strategy is suitable for which data scenario
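As an illustration of an uncertainty-based query strategy, a small sketch (our own, with synthetic data; least_confidence_query is our name) of pool-based least-confidence sampling around a scikit-learn classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def least_confidence_query(model, X_pool, batch_size=10):
    # pick the pool points whose most probable label the model is
    # least confident about (least-confidence uncertainty sampling)
    proba = model.predict_proba(X_pool)
    confidence = proba.max(axis=1)               # P(most probable class | x)
    return np.argsort(confidence)[:batch_size]   # least confident first

# hypothetical pool-based loop: start small, query, label, refit
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # synthetic labels
labeled = list(range(20))                        # tiny initial labeled set
pool = [i for i in range(500) if i not in labeled]

model = LogisticRegression().fit(X[labeled], y[labeled])
query_idx = least_confidence_query(model, X[pool], batch_size=10)
to_label = [pool[i] for i in query_idx]          # send these to the oracle
```

In a full loop, the newly labeled points move from the pool to the labeled set and the model is refit; Settles' survey covers alternatives such as margin and entropy sampling.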