# TMA4267 Linear Statistical Models

**Messages**

**July 30, 2014**:
The continuation exam is held on August 5 at 9.00 (see studweb for where to sit).
It is a written exam (made by me - Mette). I'm on sabbatical in Perth, Australia, so John Tyssedal will be doing the rounds on August 5. I will grade the exam.
If you have questions before the exam, the easiest way is to contact me by email (be aware of the 6-hour time difference, which might delay my answer). A good place to start your preparations is the May 2014 exam. I wish you the best for your preparations and for the exam.

**June 3, 2014**:
Dear students, I have finished grading the TMA4267 exam, and the results will soon be available on studweb (I have handed in the lists today).

I'm very impressed with your exam papers in TMA4267. I have seen many very good solutions - and given out many good grades. But sadly there were also some Fs (and some no-shows). The grade distribution for TMA4267 ended up as:

A: 36% B: 19% C: 17% D: 19% E: 0% F: 9%

In most courses we write a grading document, which is to be made available together with the grading: http://www.math.ntnu.no/~mettela/TMA4267/V2014/Grading.pdf Here you see in detail how the scores are given and the grade scale used.

I have also written an explanation of marks (begrunnelse) for each exam paper (for each student), but at present it is very hard to make this available to you automatically. I'm working on possible solutions. If you are in Trondheim, you may of course stop by my office for the explanation, or contact me by email. It is also possible to request an explanation via the IME-system, but that will take a long time to reach me.

Regarding the continuation exam in August: I'm in the process of writing the exam, and the exam date will be announced on the NTNU pages in July. I will leave for a one-year sabbatical in July, so if you have questions in connection with the continuation exam you may email me, and I will also find a replacement teacher in Trondheim whom you may talk to - and who will do the rounds at the exam. I will grade the continuation exam.

I wish you all the best with your future studies.

**May 22, 2014**:

- Today's exam problems and tentative solutions will be available from the Exam tab (left margin) at 13.30. Directly: Exam problems.

**May 20, 2014**:

- On May 22 I will do the rounds at the exam between 10 and 11. I'm not allowed to look at or comment on your suggested solutions (what you have written) during the exam. The lecturer is present at the exam only to answer one question: is anything in the exam problems unclear?
- Did you like TMA4267 - and perhaps also TMA4265 Stochastic processes? Then you may be interested to know that we offer many courses in statistics.
- Two of the statistics courses we offer are "compulsory" if you want to become a statistician: TMA4295 Statistical Inference (autumn) and TMA4300 Computational statistics (spring).
- In addition: do you want to learn more about regression where the response is not normal? Then TMA4315 Generalized linear models (autumn) is the course for you.
- Are you interested in modelling data in time - where the correlation structure gives dependence in time? Then TMA4285 Time Series (autumn) is your subject.
- Do you have an interest in spatial problems? Climate, geophysics? Then TMA4250 Spatial statistics (spring) should be of interest.
- Medical statistics: at the Faculty of Medicine we have SMED8002 Epidemiology 2, which gives a practical view of many statistical methods - among them MLR and logistic regression.
- Any questions about statistics courses? Come and talk to me (before July 5, when I leave for a one-year sabbatical) or to one of the other statisticians on floors 10-12 (south west part).

**May 16, 2014**:

- The scores for the DOE project are found at Scores. The given scores range from 15 to 20. These will be added to the score on the exam (0-80, max 10 for each of 8 items).
- Exam problems not on this year's reading list: I have been pointed to Exam spring 2013, where the part of problem 2a on multiple testing (Tukey) is not on this year's reading list.
- Supervision in 822 at 10.15-12 on Monday 19 and Tuesday 20 May.
- Remember to pick up a yellow (stamped) A5 sheet. This sheet with handwritten notes can be brought to the exam.

**April 29, 2014**:

- Exam: The problems on stepwise (forward and backward) regression (Part 7) will be posed and solved differently this year compared to previous years. Also, as I have said in the lectures and I hope you have seen, we have not put very much effort into idempotent matrices and quadratic forms in this year's course - that is, compared to the use of idempotent matrices in connection with ANOVA-type problems in the earlier exam problems. The 2014 exam will reflect (possibly) all parts of the course - so working with the exercises is as important as working with earlier exam problems.
- All students that have handed in compulsory projects have been given at least a passing mark on the exercises, but due to the travel arrangements the final score for each student will not be available before the end of week 20 (Friday May 16 at the latest). I apologise for the inconvenience. A link to where you find your score will be posted here.
- Supervision May 19 and 20 at 10.15-12 in room 822, 8th floor, Sentralbygg2. You may sit here and work if you want.
- The reference group has written a final evaluation report, report. If you have additional comments or feedback please email the lecturer (Mette). The lecturer will also write a report. Mette will be on sabbatical next year, so there will be a different lecturer for TMA4267 V2015 - and it is therefore very important to report back to Mette if you have suggestions for changes to the course (things that did not work optimally), so that this may be taken into account in the planning for V2015.
- Mette will be travelling until May 12, so expect some delay in answering emails.

**April 28, 2014**:

- Due to other activities at NTNU the lectures on April 28, 10.15, and April 29, 12.15, will be in S7 - and not in S1.
- Plan for the April 29, 12.15 lecture in S7: work with the August 2011 exam (the only exam where tentative solutions are missing). You find the exam here: kont426711.pdf. You may of course ask questions at the lecture - or by sending the lecturer an email.
- If you have feedback for the report from the reference group please contact Chanette or Tormod, see details in the "Course information" tab to the left.
- Suggestion for activities before the exam: supervision on Monday May 19 and Tuesday May 20 at 10.15-12?

**April 23, 2014**:
The plan for the session on Monday April 28 at 10.15-12 in S7 is the following:

- Some key concepts in the course.
- The 2014 reading list (pensumliste) - and compare to previous years.
- Exam practicals.
- Exam questions.
- Activities before the exam: supervision? when?

We may also meet on Tuesday April 29 at 12.15 in S7 if you would like that - then to work through an exam from an earlier year.

**March 25, 2014**: Today was the last lecture with new material in TMA4267. There will be no more lectures before Easter. Tomorrow I expect you to work with the DOE compulsory project, and if you have questions Mette can be contacted (in her office). The final exercise, Exercise 7, will be available tomorrow. Next week the lectures are cancelled due to excursions for the 3rd year Ind Mat students. I offer lectures Monday April 28 and Tuesday April 29 (both days in S7) with topic: summing up, exam questions++. Until then: remember to hand in the DOE project on paper in the mail box of Mette Langaas on the 7th floor of Sentralbygg 2 - and write your candidate number to identify you. Then, I wish you a Happy Easter, and hope to see you April 28.

**March 17, 2014**: Tomorrow we start with the last part of the course: Part 7: Model selection, regularization and dimension reduction. Up to now we have looked at classical statistical techniques - developed many decades ago. Now we move to the research area of today! We meet both theoretical and practical challenges - tomorrow mainly practical ones:-)
For this part we use book chapters 6 and 10.2 from the new book "Introduction to statistical learning", and tomorrow chapter 6.1 is the topic. You find links to download chapter 6 in our lecture table (see lectures in the menu to the right).
Also look at the rightmost column of the lecture table. There I have linked 4 videos from YouTube - where the book authors lecture chapter 6.1. The slides they are using you also find in the lecture table. Tomorrow I will do my version of the 4 videos - the advantage you may get if you come to class is that you may see me relate the topic to what we previously have learned in our course - and you may of course ask questions. I will also include questions - as usual. I think it would be a good investment to watch the videos (either before or after the lecture tomorrow).

**March 16, 2014:** Minutes from the second meeting with the reference group.

**March 10, 2014:** Friday March 14 at 11.15 is the second meeting with the reference group. Specific topics:
1) Issues around the compulsory project - supervision, report, grading.
2) The last few weeks of TMA4267: plan.
3) No lecture in week 15. Activities after Easter: two lectures with focus on main topics for each part of the course? Quiz questions for each part? Supervision before the exam (when?). The exam is May 22.
4) Other business (Eventuelt).

**March 9, 2014:** The topic this week is Design of Experiments, and the teaching material is based on a note - see Lectures tab. I have also made a page describing the compulsory project - please have a look and check that everything is clear: New tab called "Project" in the left menu.
Exercise 6 (on DOE) will be available tomorrow.
We only have a few more topics to cover in our course, and only 8 more lectures with new material.

**Feb 27, 2014**:
Exercise 5 is out. See the exercise tab. Supervision on March 5 and 12.

**Feb 25, 2014**:
Extra R-session lecture/supervision: Wednesday March 5 at 12.15-13.00 in S4. Bring your laptop with R and RStudio (the latter is optional) installed. You should also have access to the internet. What will happen?

- Lecturer will run R on projector, write commands, give tasks to students, and explain concepts.
- Students: will run R, work together, ask and answer questions.
- We will use one or two data sets (may be available before the class), and analyse data with predefined R functions.
- We will also look at simulation and writing our own functions.
- If you have specific questions please write them here before the class (if you want the lecturer to prepare an answer), or during class.

**Feb 21, 2014**:
1) Exercises: solutions are now posted for Exercise 4.
2) On Monday we start with a quiz from 4.5 and ch 3. Please report to Mette if you find typos or errors.
3) Then we continue with sums of squares, and if you have looked at Exercise 4, Problem 4 - I think you will understand the lecture better.
4) Wish you all a great weekend!
You will get 10 minutes at the end of the first lecture on Tuesday March 11 (12.50-13) to discuss this in class without the lecturer present.

**Feb 11, 2014**
Today's lecture: the students attending the lecture fully understood on average 76.7% of the lecture, with a median of 80% and a standard deviation of 19 percentage points. Furthermore, 25 wanted a quiz/Kahoot! for summing up after a chunk of lectures, and 14 wanted a mindmap drawn by the lecturer. You may comment on this:-)

**Feb 10, 2014**:

- Did you miss today's lecture? We finished part 3 and played a quiz - you may now try the quiz yourself. Before you do so, don't read the last two slides: they give the answers and some statistics about today's game. Beware of questions 3 and 7 - those had the lowest correct rate… http://www.math.ntnu.no/~mettela/TMA4267/V2014/Ch4quiz.pdf You may also want to try out Kahoot! The address of the quiz is http://goo.gl/bSrl4A
- Previously, only 2/29 answered that they had read the book before the lecture, so I would challenge you to do the following today (or sometime before the lecture at 12.15 tomorrow): Open the book, Chapter 3, and spend max 5 minutes to browse through pages 61-69. Look for clues to the following three short questions: “What are p and n?” “The design matrix X - what is it?” “What do the normal equations look like?” Link to ch3 http://link.springer.com/chapter/10.1007/978-1-84882-969-5_3

**Feb 9, 2014**:
Exercise 3 with R scripts and solutions is available under the Exercise tab. Come to the supervision on Wednesday at 10.15 - or contact the TA or lecturer - if you have questions.

**Feb 4, 2014** Minutes from the first meeting of the reference group.

**Feb 2, 2014** Sorry about this, but I have now decided that we go through general theory on random vectors and matrices, E and Cov of these, and the multivariate normal distribution (sections 4.3-4.5) before we start to look at multiple linear regression in chapter 3. The reason for this is that the multivariate normal distribution is needed to prove important results in chapter 3.
Again, I hope that this has not given you too much trouble - for those of you that have read 3.1-3.2 - we will soon get there…

**Jan 31, 2014**:
Since only 2/29 answered that they had read the book before the lecture, I would challenge you to do the following today (or sometime before the Monday lecture at 10.15): Open the book, Chapter 3, and spend max 5 minutes to browse through pages 61-69. Look for clues to the following three short questions: "What are p and n?" "The design matrix X - what is it?" "What do the normal equations look like?" Link to ch3: Bingham&Fry Ch3.

**Jan 23, 2014**: Exercise 2 is now available: Exercises tab
Solutions and R code are also available; supervision is on Wednesday January 29.

**Jan 21, 2014**: Agenda for the first meeting with the reference group. Time: Friday January 31 at 11.15 in room 1236, 12th floor, Sentralbygg 2.
There is still room for additional members of the reference group.

- The course: course plan, mathematical level, theory vs practice, understanding.
- The book: notation, theory, examples.
- The lectures: mathematical level, tempo, slides, class notes, interaction, …
- The exercises: theory, R, supervision.
- Other topics?

**Jan 13, 2014:**

- Today's 4 figures: Correct answers were: lilac, white, red, black (in this order). Observe: the figures with sigma_Y=2 are much wider than those with sigma_Y=1, and thus not so tall. These are the white and black figures. The red and lilac were much taller, and had sigma_Y=1. The lilac and white figures had rho=0, which could be seen by observing that the ellipse axes are in the x- and y-directions (along the square or rectangular bases of the figures). The red figure had axes at 45 degrees to the x- and y-axes, meaning that sigma_X=sigma_Y (see Exercise 1, problem 2). The axes of the black figure were not at 45 degrees to either the x- or the y-axis, meaning that sigma_X is not equal to sigma_Y.

- Supervision of exercises will be Wednesdays 10.15-11 in room 734, 7th floor, Sentralbygg 2. Petter will be present to answer any questions. First time this week.
- Office hours for Mette will be (starting today) Mondays at 12-13 (after the lectures).
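
The reasoning about today's 4 figures can also be checked numerically. Below is a minimal sketch in Python (standard library only), assuming the figures show bivariate normal densities. The function names and the parameter values (sigma_X = 1 throughout, rho = 0.5 for the red and black figures) are illustrative assumptions, not the values used in the lecture. The peak height is 1/(2 pi sigma_X sigma_Y sqrt(1 - rho^2)), and the direction of the long ellipse axis follows from the covariance matrix.

```python
import math


def peak_height(sigma_x, sigma_y, rho):
    """Maximum of the bivariate normal density (attained at the mean)."""
    return 1.0 / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho ** 2))


def major_axis_angle_deg(sigma_x, sigma_y, rho):
    """Angle in degrees between the x-axis and the long axis of the contour ellipses.

    For sigma_x == sigma_y and rho == 0 the contours are circles, so any
    direction is an axis; the function then simply returns 0.
    """
    cov_xy = rho * sigma_x * sigma_y
    return math.degrees(0.5 * math.atan2(2.0 * cov_xy, sigma_x ** 2 - sigma_y ** 2))


# Hypothetical parameter choices (sigma_x, sigma_y, rho) mimicking the four figures.
figures = {
    "lilac": (1.0, 1.0, 0.0),
    "white": (1.0, 2.0, 0.0),
    "red": (1.0, 1.0, 0.5),
    "black": (1.0, 2.0, 0.5),
}

for name, (sx, sy, rho) in figures.items():
    height = peak_height(sx, sy, rho)
    angle = major_axis_angle_deg(sx, sy, rho)
    print(f"{name:5s}  peak height {height:.3f}  long-axis angle {angle:.1f} deg")
```

With these assumed values, doubling sigma_Y halves the peak height when rho=0, the axis angle is exactly 45 degrees only when sigma_X=sigma_Y with rho nonzero, and with rho=0 the axes stay along the x- and y-directions.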

**Jan 10, 2014:**

- Do we need a FB page? Like the TMA4267 page and find out: TMA4267 on Facebook.
- We need a minimum of three students for the reference group of TMA4267 - that should be one from BMAT, one or two from MTFYMA and one from others (Erasmus, other study programmes). Please email Mette (see course info tab for email) if you will join the reference group. There will be 3 meetings, and a (short) closing report must be filed. At the end of the course the lecturer's report on the course ("emnerapport"), with the closing report, will be sent electronically to all course participants.
- The main textbook we use is the Bingham and Fry (2010) Regression book, but we will also take one chapter (chapter 6) from the book by James, Witten, Hastie and Tibshirani (2013), An Introduction to Statistical Learning. Starting on January 21 there will be a MOOC at Stanford with this book: MOOC. If any of you (taking TMA4267) choose to participate in that MOOC I would very much like to hear about it - and about your progress in the MOOC.

**Jan 07, 2014:**

- You have until Sunday January 12 to answer these questions. On Monday morning I will select the time for exercise supervision as the time slot with the highest vote, and use the runner-up as the time for my office hours (I will sit in Matteland, so you may easily ask for help). For the supervision next week Petter will do the ordinary supervision, and I will be available for an R session for those of you who will focus on starting to use R (bring your laptop with R and RStudio installed). More information to come.
- Today's questions: The correct answer for question 2 was sqrt(Var(Y)/Var(X)). Here you find a histogram of the answers to question 1 "How many percent of the lecture did you fully understand"; observe that the mean is 72.8%. For the third question most liked either the bivariate f(x,y) or the Galton regression towards the mean result the best.
- The first exercise is out, see the exercise-tab. You should be able to answer all questions of the exercise by now. If you have questions this week, just come to the office of Petter (1026) or Mette (1236) in Sentralbygg 2. The plan is that you work with this exercise this week (week 2) and the next week (week 3).

**Jan 03, 2014:** Please spend a few minutes answering these questions. The answers will be important for us as feedback to the course, and the anonymous data collected will be used in the lectures.

**Dec 30, 2013:** Welcome to the www-page for TMA4267 Linear statistical models, spring 2014. The first lecture will be January 6, in room S4, at 10.15. I will give an introduction to the course, and we will try to find a slot for the exercises (currently in conflict with lectures in TMA4212). Then, we start on the topics for the course. New: this year we use a new textbook, see "Reading list" in the menu to the left.