TMA4315 Generalized linear (mixed) models 2021

Messages

Sept. 8: You can get assistance with the projects at Banachrommet, each Friday at 11-12.

Sept. 7: If you're in search of someone to collaborate with on project 1, send me an email and I will if possible suggest someone you can work with.

Sept. 7: If you can't attend a lecture, you can find lecture notes and videos from last year here (a bit out of sync with this year's lectures).

Aug. 26: See updated time for second weekly lecture below.

Sept. 6: Project 1 is out. Deadline is Friday October 1.

Practical information

Lectures: Thursdays 10:15-12:00 in EL6 and Tuesdays 8:15-10:00 in R D4-132 (the second weekly lecture was originally Fridays 12:15-14:00 in S1 and was moved to avoid colliding with TMA4295 Statistical Inference). For the time being, we plan to do physical lectures only.

Guidance with exercises and projects: Fridays 11-12 in Banachrommet (moved from 14-15 in H3 424 Vembi), from week 35, and via the Discourse forum.

We plan to do physical lectures only, but the lecture notes are available here and videos from last year can be found here.

Lecturer: Jarle Tufto

Teaching assistant: Silius M. Vandeskog.

Reference group: NN, NN, NN (send me an email if you want to be in the reference group).

Obligatory projects

There will be three obligatory projects (together counting 30% of the final grade).

Project 1

The problems are posted in the Discourse forum. You may ask questions or post your attempted solution (if you want feedback) by replying to each exercise.

Tentative curriculum

Fahrmeir et al. (2013) (freely available on SpringerLink), ch. 2.1-2.4, B.4, 5.1-5.4, 5.8.2, 6, 7.1-7.3, 7.5, 7.7. We will also use some material from Wood (2015), and some material from Harville 1974, Agresti 2002 and Kristiansen 2016 (see below).

This covers ordinary linear and multiple regression (mostly repetition from TMA4267 Linear statistical models), binary regression, Poisson and gamma regression, the exponential family and generalized linear models in general, categorical regression (includes contingency tables and log-linear models, multinomial and ordinal regression), linear mixed effects models, and generalized linear mixed effects models.

Also see the official NTNU course info.

Lectures

R code from the lectures (markdown) (2019 version)

August 26: Introduction to GLMs (ch. 2.1-2.3), the exponential family (ch. 5.4.1).

August 31: More on the exponential family (ch. 5.4.1). Review of theory of linear models (ch. 3).

September 2: Geometric views of the linear model. Sampling distributions associated with the linear model (the chi-square-, t- and F-distribution).

September 7: Testing and fitting linear hypotheses (via quadratic form for \(C\hat\beta-d\) - Box 3.13 in Fahrmeir) or via F-test based on sums of squares for each model alternative (the restricted model fitted via Lagrange method (pp. 172-173) or using the solution to problem 2 below). Design matrices for interactions between numeric and categorical covariates. Binary regression (ch. 5). Logit and probit links.

September 9: cloglog models. Binary regression continued. Score function of binary regression model. Some general properties of the expected log likelihood (sec. 4.1 in Wood (2015)). Expected and observed Fisher information and iterative computation of MLEs for binary regression (Fisher scoring algorithm).

September 14: A minimal example of divergence of the Fisher scoring algorithm. Asymptotic properties of MLEs. Likelihood ratio, Wald, and score tests. Deviance and testing goodness-of-fit.

September 16: More on the deviance and the saturated model. Deviance residuals. Estimating the overdispersion parameter. We'll also go through a sketch of a proof for the asymptotic distribution of the LR test statistic (section 4.4 in Wood).

September 21: Example (lung cancer rates) illustrating model selection via AIC, model parsimony, Wald and likelihood ratio testing. Theory behind AIC (Wood sec. 4.6).

September 23: Poisson regression. Fisher scoring vs. Newton-Raphson for Poisson regression with non-canonical identity link (see R code for further illustrations). This paper provides an example of non-canonical link functions.
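To make the distinction in the September 23 lecture concrete, here is a minimal Python sketch (not the R code from the lectures) of both algorithms for Poisson regression with identity link, \(\mu_i = x_i^T\beta\). The function name `fit_poisson_identity`, the simulated data and the choice of the ordinary least squares fit as starting value are all assumptions made for illustration. The score is \(X^T(y/\mu - 1)\); Fisher scoring uses the expected information \(X^T \mathrm{diag}(1/\mu) X\), while Newton-Raphson uses the observed information \(X^T \mathrm{diag}(y/\mu^2) X\).

```python
import numpy as np


def fit_poisson_identity(X, y, method="fisher", tol=1e-10, max_iter=100):
    """Fit a Poisson regression with identity link, mu = X @ beta.

    method="fisher" uses the expected Fisher information,
    method="newton" uses the observed information (minus the Hessian).
    Returns the estimate and the number of iterations used.
    """
    # Assumed starting value: the ordinary least squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for iteration in range(1, max_iter + 1):
        mu = X @ beta
        if np.any(mu <= 0):
            # The identity link does not keep the mean positive by itself.
            raise ValueError("non-positive fitted mean")
        score = X.T @ (y / mu - 1.0)
        if method == "fisher":
            weights = 1.0 / mu        # expected information weights
        else:
            weights = y / mu**2       # observed information weights
        info = X.T @ (weights[:, None] * X)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            return beta, iteration
    raise RuntimeError("no convergence within max_iter iterations")


# Simulated data (an assumption for illustration): mu = 2 + 3 x.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(2.0 + 3.0 * x).astype(float)

beta_fisher, iters_fisher = fit_poisson_identity(X, y, method="fisher")
beta_newton, iters_newton = fit_poisson_identity(X, y, method="newton")
print(beta_fisher, iters_fisher)
print(beta_newton, iters_newton)
```

Both variants solve the same score equation, so they should agree on the estimate when they converge; only the iteration path differs. With a canonical link the two information matrices coincide and the algorithms are identical.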

Exam

Tuesday December 7, 15:00-19:00

Previous exams are available under previous exams; those from 2013-2016 are not the most relevant.

Any questions on the exponential family in this year's exam will be based on the notation in Fahrmeir. The solutions to the exams in 2008 and 2009 are based on a slightly more general definition of the exponential family and different notation (Dobson & Barnett 2008), writing the pdf or pmf as \(\exp(b(\theta)a(y) + c(\theta) + d(y))\) instead of \(\exp((y\theta - b(\theta))w/\phi - c(y,\phi,w))\) in Fahrmeir. Thus

  • \(b(\theta)\) is the canonical parameter corresponding to \(\theta\) in Fahrmeir,
  • \(a(y)\) corresponds to \(y\),
  • \(c(\theta)\) (implicitly a function of the canonical parameter \(b(\theta)\)) corresponds to \(b(\theta)\). Implicit differentiation then leads to a different formula for the mean, \(\operatorname{E}a(Y)=-c'(\theta)/b'(\theta)\) (which is \(EY\) when \(a(y)=y\)),
  • \(d(y)\) corresponds to \(c(y,\phi,w)\).

Note that if a random variable \(Y\) satisfies the definition of Dobson & Barnett, then \(a(Y)\) satisfies the definition in Fahrmeir.
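As a sketch of the implicit differentiation step in the Dobson & Barnett notation (assuming that differentiation under the integral sign is permitted; replace the integral by a sum for a pmf), the density integrates to one for every \(\theta\), so

\[
0 = \frac{d}{d\theta}\int \exp\bigl(a(y)b(\theta) + c(\theta) + d(y)\bigr)\,dy
  = \int \bigl(a(y)b'(\theta) + c'(\theta)\bigr)\exp\bigl(a(y)b(\theta) + c(\theta) + d(y)\bigr)\,dy
  = b'(\theta)\operatorname{E}a(Y) + c'(\theta),
\]

which gives \(\operatorname{E}a(Y) = -c'(\theta)/b'(\theta)\). As a check, take the Poisson distribution with mean \(\lambda\) and pmf \(\exp(y\log\lambda - \lambda - \log y!)\). With \(\theta=\lambda\) in the Dobson & Barnett notation, \(a(y)=y\), \(b(\theta)=\log\theta\), \(c(\theta)=-\theta\) and \(d(y)=-\log y!\), so

\[
EY = -\frac{c'(\theta)}{b'(\theta)} = -\frac{-1}{1/\theta} = \theta = \lambda .
\]

In the Fahrmeir notation the canonical parameter is \(\log\lambda\) and \(b(\theta)=e^{\theta}\), so \(EY = b'(\log\lambda) = \lambda\), the same mean.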

2021-09-16, Jarle Tufto