ST2304 Statistical Modelling for Biologists and Biotechnologists, spring 2012

Messages

  • June 21. Constructive criticism, comments and suggestions can be posted on the discussion group. If you want to remain anonymous, please log in as user anonym2304@gmail.com with password anonym2304.
  • June 21. The grades on the exam.
  • June 4. A preliminary solution to today's exam is available via the link "Old exams".
  • March 27. The lecture on April 19 is moved to R7 (see calendar below).
  • March 23. We move the deadline for assignment 10 to Friday, April 13.
  • March 21. Verzani p. 100-104 contains explanations of functions, for-loops and if-statements.
  • March 14. Calendar for the remaining computer exercise classes. Feel free to switch to another group if the timetable doesn't suit you, or ask for assistance in the Google discussion group. Don't skip assignment 10, which includes lots of important material.
  • March 14. Some of the material covered in the last lecture (Newton's method) can be found in Neuhauser and here.
  • March 6. The extra lectures resume from March 15.
  • Feb. 27. At the students' request, extra teaching sessions focusing on the use and interpretation of results in R have been set up. Thursdays 12:15-13:00, location: S8. The first session is next Thursday, March 1. Send suggestions for topics you would like covered to Thomas Kvalnes.
  • Feb. 23. If you're stuck and need help outside the assignment hours, you are very welcome to post a question at this Google group. Someone out there will answer your question. Also feel free to answer questions posted by someone else - explaining things to someone else is an excellent way of clarifying your own thinking. You will also want to sign up to the group if you want to receive all postings by email. Note that the email address you sign up with will be visible through your profile, so if you want to stay completely anonymous you should sign up using an email address that doesn't give away your identity. Last year's discussion group is archived here.
  • March 16. The reference group consists of Rønnaug Steen Kolve (biology), Vegard P. Sollien (biology), Hanna Kjelstrup (biology) and Abba Elisabeth Coron (biotechnology). Let the reference group know if you have views on the course. The first meeting is Thursday, February 23, after the lecture.
  • Feb. 9: You must pass 6 out of 12 assignments to be admitted to the final exam.
  • Jan. 25: The report for each assignment must be handed in as a single Word or PDF file and must (of course) include your name and your email address. The report should be clearly written in Norwegian or English and formatted so that it is intelligible to your peers (use complete sentences). The R code you have used should also be included.
  • Jan. 23: 6 out of 12 assignments must be approved.
  • Jan. 16: All computer lab groups are moved from R52 to Vegas.
  • Jan. 6: Sign up for one of the computer lab groups by following the link "Schedule" to the left.
  • The first lecture is Tuesday, January 10, 12:15-14:00 in H1.
  • An RSS feed for this web page (and its submenus) is available via the orange icon (choose "current namespace") at the far right of the browser's address bar.

Plan and course content (preliminary)

| Part* | Theme | Assignment |
| --- | --- | --- |
| 1 | Dalgaard, ch. 1 | Assignment 1, Norwegian version, Solution |
| 2 | Plotting functions and parametric curves. Linear regression, residuals, prediction and confidence bands (Dalg. 6.1-6.3). Dalg. ch. 3. Multiple regression (Dalg. 11.1, 11.2, Løvås 7.5). Dummy variables. | Assignment 2, Solution |
| 3 | The F-distribution. Comparison of variances (Dalg. 5.4). One- and two-way analysis of variance with balanced design (Løvås 8.3, Dalg. 7). Factors encoded as dummy variables (Dalg. 12.3). | Assignment 3, Google docs data file, Solution |
| 4 | Linear models without balanced design, model selection (Handout 1, Dalg. ch. 11.3). | Assignment 4, Fill your predictions in here, And the winner is..., Solution |
| 5 | The multinomial distribution, contingency tables, chi-square tests (Løvås 5.9.4, 8.5, Dalg. 8, Handout 2 not including 2.2.3). | Assignment 5, Google docs spreadsheet for problem 1, Solution |
| 6 | Generalized linear models: logistic regression, deviance (Dalg. 13, excluding 13.3, Handout 4). The delta method (Handout 3). | Assignment 6, Solution |
| 7 | Probit and cloglog links, offset variables (Handout 4). | Assignment 7, Google docs spreadsheet, Solution |
| 8 | Linear separation (Handout 4, sect. 4). Poisson response (Dalg. 15-15.2). Overdispersion (Handout 4, sect. 6). | Assignment 8, Updated solution |
| 9 | Interaction between covariates (Dalg. 12.5, 12.7.2). A little about programming (Dalg. 2.3). | Assignment 9, Solution |
| 10 | Numerical maximisation of the likelihood, asymptotic theory for approximate standard errors and likelihood ratio tests (Handout 5). | Assignment 10, Solution |
| | Excursions and Easter vacation (weeks 13 and 14) | |
| 11 (week 15) | Simulation based methods (Handout 5, sect. 4). | Assignment 11, Solution |
| 12 (week 16) | Power and calculation of sample size (Dalg. 9). Simulation based power calculations (Handout 5, sect. 4.5). | Assignment 12, Solution |
| 13 (week 17) | Summary of the course | |

* The lectures for part 1 are in week 2, with the associated assignment in week 2/3, and so on. Each assignment should be handed in by Friday at 12:00 (in week 3 for assignment 1, and so on) by email according to this table.

Final exam

June 4. The final exam will take the usual form, that is, without the aid of a computer or R. Permitted aids are a pocket calculator, Tabeller og formler i statistikk (Tapir forlag), Matematisk formelsamling (Rottmann), and one handwritten yellow A4 sheet. Problems may include:

  • Interpretation of the summary of a fitted model. This may include plots of residuals etc. What is the meaning of the various parameter estimates? What assumptions does the model involve? How may the model be improved?
  • A data set may be presented with a description of the different variables and a brief description of the biological context. You should then propose a suitable statistical model (for example a generalized linear model with a certain link function) which can be used to analyse that data. This should include a rationale behind your choice of model, what assumptions the model involves etc.
  • A simple problem that involves writing an expression or an R function carrying out a simple computation. The problem should demonstrate that you have understood vectorized operations, indexing, data frames, selection based on logical vectors, how standard distributions are handled (the different d-, p-, q- and r- functions), the relationship between mathematical and symbolic notation for model formulae etc.
  • Some simple mathematical derivation based on probability theory or principles for statistical inference covered here and in ST0103.
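
As a rough illustration of the R skills listed in the third point above, the following sketch touches on each of them in turn. The data frame `birds`, its column names and all its values are made up for this example and are not taken from any assignment or exam.

```r
# Hypothetical data (values invented for illustration): body mass in grams
# and sex of six birds.
birds <- data.frame(
  mass = c(21.5, 19.8, 23.1, 20.4, 22.7, 18.9),
  sex  = factor(c("f", "m", "f", "m", "f", "m"))
)

# Vectorized operation: standardize all masses at once, without a loop.
z <- (birds$mass - mean(birds$mass)) / sd(birds$mass)

# Indexing by position: the masses of the second and fourth bird.
birds$mass[c(2, 4)]

# Selection based on a logical vector: rows with above-average mass.
heavy <- birds[birds$mass > mean(birds$mass), ]

# The d-, p-, q- and r- functions for standard distributions.
dbinom(2, size = 5, prob = 0.5)  # P(X = 2) for X ~ Binomial(5, 0.5), i.e. 0.3125
pnorm(1.96)                      # P(Z <= 1.96) for a standard normal Z
qnorm(0.975)                     # the 0.975 quantile of Z, approximately 1.96
set.seed(1)                      # make the simulation reproducible
x <- rnorm(10, mean = 0, sd = 1) # ten simulated standard normal values

# Model formula: the mathematical model
#   mass_i = beta0 + beta1 * d_i + e_i,  d_i = 1 if bird i is male, else 0
# is written symbolically as mass ~ sex; R creates the dummy variable itself.
fit <- lm(mass ~ sex, data = birds)
coef(fit)  # "(Intercept)" is the female mean, "sexm" the male minus female difference
```

With only a single factor as covariate, the estimated coefficient `sexm` is simply the difference between the two group means, which is one way to check that the formula has been read correctly.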

Remember that the final exam is not the objective of this course. The objective is to gain the practical experience with analysing data that you will need in later research during your master's degree, and to be able to read and understand the primary literature. Some time after the last assignment (assignment 12) you may be given a "trial exam"…

2013-01-11, Jarle Tufto