ST2304 Statistical Modelling for Biologists and Biotechnologists, spring 2011


  • 29. June. I have submitted the grade list, so the grades will be available shortly. Many good results. A suggested solution to the exam is available here.
  • 5. June. Yellow sheet outside the department office on the 7th floor, Sentralbygg II. Permitted aids as for the trial exam.
  • 30. May. Question time before the exam: I am mostly available in my office (room 1232, Sentralbygg II) in the coming days.
  • 29. March. You will probably want to subscribe to the discussion group below even if you don't have any questions yourself. You can receive every posting by email, as daily digests, etc. (see "Edit my membership").
  • 22. March. Some additional hints have been added to assignment 10, problems 1 and 2.
  • 16. March. No lecture on Friday, March 18.
  • 24. Feb. Enter data about yourself in the spreadsheet for assignment 7 as soon as possible. No lecture tomorrow, Friday, Feb 25.
  • 15. Feb. Since there are too few observations from "nordnorge", use the variable region2 (which should have the categories sørlandet, østlandet, vestlandet, midtnorge, nordnorge) instead of region.
  • 25. Jan. Despite last Thursday's mishap, the Thursday lectures will be held in H1 as planned, except in weeks 9 and 10.
  • 20. Jan. Updated information about how to hand in the assignments is available in the menu to the left under "øvingsgrupper".
  • 19. Jan. You have first priority (in terms of space and assistance) only during the two hours on the day you have signed up for.
  • 19. Jan. 6 out of 12 assignments are obligatory but will not count towards the final grade.
  • 23. Nov. The first lecture is on Thursday, January 13, 10:15-12:00 in F6.
  • 10. Jan. An RSS feed for this web page (and its submenus) is available via the orange icon (choose "current namespace") at the far right of the browser's address bar.

Schedule and syllabus (very preliminary)

Block* | Topics | Assignment and links
1 | Dalgaard, ch. 1 | Assignment 1, Solution
2 | Plotting functions and parametric curves. Linear regression, residuals, prediction and confidence bands (Dalg. 6.1-6.3). Dalg. ch. 3. Multiple regression (Dalg. 11.1, 11.2, Løvås 7.5). Dummy variables. | Assignment 2, Solution
3 | The F-distribution. Comparison of variances (Dalg. 5.4). One- and two-way analysis of variance with balanced design (Løvås 8.3, Dalg. 7). Factors encoded as dummy variables (Dalg. 12.3). | Assignment 3, Google docs data file, Solution
4 | Linear models without balanced design, model selection (Handout 1, Dalg. ch. 11.3). | Assignment 4, Fill your predictions in here, And the winner is..., Solution
5 | The multinomial distribution, contingency tables, chi-square tests (Løvås 5.9.4, 8.5, Dalg. 8, Handout 2 not including 2.2.3). | Assignment 5, Google docs spreadsheet for problem 1, Solution
6 | Generalized linear models: logistic regression, deviance (Dalg. 13, excluding 13.3, Handout 4). The delta method (Handout 3). | Assignment 6, Solution
7 | Probit and cloglog links, offset variable (Handout 4). | Assignment 7, Google docs spreadsheet, Solution
8 | Linear separation (Handout 4, sect. 4). Poisson response (Dalg. 15). Overdispersion (Handout 4, sect. 6). | Assignment 8, Updated solution
9 | Interaction between covariates (Dalg. 12.5, 12.7.2). A little about programming (Dalg. 2.3). | Assignment 9, Solution
10 | Numerical maximisation of the likelihood, asymptotic theory for approximate standard errors and likelihood ratio tests (Handout 5). | Assignment 10, Solution
11 | Simulation based methods (Handout 5, sect. 4). | Assignment 11, Solution
12 | Power and calculation of sample size (Dalg. 9). Simulation based power calculations (Handout 5, sect. 4.5). | Assignment 12, Solution
13 | Summary of the course | Trial exam, Solution (updated)
14 | No teaching this week (biology and biotechnology on excursions)
Easter vacation!!!

* Block 1 is lectured in week 2, with the corresponding assignment in week 3, and so on. The assignments may change continuously up to and including the Friday of the week before the assignment.

Final exam

9. June. The final exam will take the usual form, that is, without the aid of a computer or R. A pocket calculator, statistical tables, etc. will be permitted aids. Problems may include:

  • Interpretation of the summary of a fitted model. This may include plots of residuals, etc. What is the meaning of the various parameter estimates? What assumptions does the model involve? How may the model be improved?
  • A data set may be presented with a description of the different variables and a brief description of the biological context. You should then propose a suitable statistical model (for example a generalized linear model with a certain link function) which can be used to analyse the data. This should include a rationale behind your choice of model, what assumptions the model involves, etc.
  • A simple problem that involves writing an expression or an R function carrying out a simple computation, demonstrating that you understand vectorized operations, indexing, data frames, selection based on logical vectors, how standard distributions are handled (the different d-, p-, q- and r-functions), the relationship between mathematical and symbolic notation for model formulae, etc.
  • Some simple mathematical derivation based on probability theory or principles for statistical inference covered here and in ST0103.
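
As an illustration of the kind of R expressions the third point refers to, a minimal sketch might look like the following. The data frame birds and its columns mass, wing and sex are hypothetical, invented here for illustration only; they are not course data and not an actual exam problem.

```r
# Hypothetical data frame; names and values are made up for illustration.
birds <- data.frame(
  mass = c(21.5, 19.8, 23.1, 20.4, 22.7, 18.9),
  wing = c(74, 71, 78, 72, 77, 70),
  sex  = factor(c("f", "m", "f", "m", "f", "m"))
)

# Vectorized arithmetic: standardize a variable without writing a loop.
z <- (birds$mass - mean(birds$mass)) / sd(birds$mass)

# Selection based on logical vectors: rows for females heavier than 21.
heavy <- birds$mass > 21
heavy_females <- birds[heavy & birds$sex == "f", ]

# The d-, p-, q- and r-functions, here for the standard normal distribution.
d0 <- dnorm(0)        # density at 0
p  <- pnorm(1.96)     # P(Z <= 1.96), the cumulative distribution function
q  <- qnorm(0.975)    # quantile, the inverse of pnorm
set.seed(1)           # make the random draws reproducible
x  <- rnorm(5)        # five random draws

# Symbolic model formula corresponding to the mathematical model
#   mass_i = b0 + b1 * wing_i + b2 * I(sex_i = "m") + e_i
fit <- lm(mass ~ wing + sex, data = birds)
coef(fit)
```

The formula mass ~ wing + sex expands the factor sex into a dummy variable, so the fitted model has three coefficients: the intercept, the slope for wing, and the difference between males and females.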

Remember that the final exam is not the objective of this course; the objective is to gain the practical experience with analysing data that you will need in later research during your master's degree, and to be able to read and understand the primary literature. Some time after the last assignment (assignment 12) you will be given a "trial exam".

2018-02-12, Hallvard Norheim Bø