Assignments

* Week 5: Pages 39-40: Ex. 2.1, 2.2, 2.3, 2.5 (optional; you may obtain the results in a way different from the one sketched and hinted at, using your knowledge of multivariate analysis), 2.7 (Hint: You will need the formulas \( E_{X,Y}=E_XE_{Y|X} \) and \( Var_{X,Y} = E_XVar_{Y|X} + Var_X E_{Y|X} \), see Casella & Berger: Statistical Inference), 2.8 (This is an R exercise. Note that the data are available from the book website http://www-stat.stanford.edu/~tibs/ElemStatLearn/. You may use the function 'knn' in R.)

* Week 6: Ex. 3.2 (you may drop the simulation, but you are encouraged to do it). Ex. 3.3 (a). Ex. 3.6 (see also p. 64, second-to-last paragraph). Extra exercises: Prove that (3.44) really minimizes (3.43). In (2.24), compute instead the expected value of the distance X from the center of the ball (the origin) to the nearest point. Evaluate the expression for p=10, N=500. (Hint: Show first that the expected value can be written as \( \int_0^1 P(X>x) dx = \int_0^1 (1-x^p)^N dx \).)

* Week 7: Pages 95-97: Ex. 3.12, 3.13, 3.14, 3.30. Extra exercise: Redo the analysis in Table 3.3 for PCR and PLS, but varying the number M of components. Discuss how the results differ for different numbers of components. Hints: Some code to use for this is given here (from course STK4030, fall 2008, Univ. of Oslo).

* Week 8: Pages 135-137: You are recommended to do Exercise 3.30 from Week 7. Further exercises are: Ex. 4.1 (Hints and extension: Go through the page Maximization Result (copied from Johnson & Wichern: Applied Multivariate Statistical Analysis). First show how this is used in principal component regression, then show how it can be used to solve the generalized eigenvalue problem of Fisher, leading to the canonical variates, p. 116). Ex. 4.2 (you need not do (d)). Ex. 4.5. Ex. 4.9 (you may use the function 'mahalanobis' in R).

* Week 9: From last week: Ex. 4.2 and 4.5. You are recommended to work on 4.5 in particular. Then pages 181-185: Ex. 5.1, Ex. 5.4, Ex. 5.7 (if you do not solve (a), you may still solve (b) by using the result in (a), and you may solve (c) by using the result in (b)). Ex. 5.13. Download hint.

* Week 10: Pages 216-218: Ex. 6.1, Ex. 6.2 (first part was done in class), Ex. 6.3 (Challenge?), Ex. 6.5. Extra: Find out what is done by the R-function 'loess'. Use it with data you simulate yourself from the model used in Figure 6.1. Extra exercise on df.

* Week 11: Pages 257-259: Ex. 7.2 [Hint: For both (7.62) and (7.63) you may look separately at f(x0) > 1/2 and f(x0) < 1/2. Also be aware that (X,Y) is independent of the training set which determines f-hat], Ex. 7.7, Ex. 7.10. Extra 1: Derive the expression (7.12) on page 224. Extra 2: Compare the expression (7.26) for Cp (page 230) with the expression for Cp used in the book by Walpole, Myers and Ye (used in the basic statistics courses at NTNU).

* Week 15: Brief discussion of trial exam. Exercise: In the ordinary regression situation in Chapter 3.2, discuss different methods for doing bootstrapping (nonparametric, parametric, bootstrapping residuals, etc.). What could be the reasons for doing bootstrapping? (Estimation of variance, bias or prediction error, etc.)

* Week 16: Download exercise on Estimating Equations. Exercise on EM algorithm: Download the article Exercises in EM and go through "The First Exercise". Try to solve it yourself after having checked the arguments behind (1) and (2).

* Week 17: Exercises 10.40 and 10.43 in the chapter on Stein's paradox.
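
For the Week 6 extra exercise, the integral in the hint can be checked numerically once you have derived it. The following is only a sketch in Python (R works equally well); the function names are made up for this illustration and are not part of the course material. It approximates \( \int_0^1 (1-x^p)^N dx \) with a midpoint rule, compares the result with the closed form \( \Gamma(1+1/p)\Gamma(N+1)/\Gamma(N+1+1/p) \) obtained from the substitution \( u = x^p \), and also evaluates the median \( (1-(1/2)^{1/N})^{1/p} \), which is how (2.24) is assumed to read here.

```python
import math


def expected_nearest_distance(p, n, steps=20000):
    """Midpoint-rule approximation of the integral of (1 - x**p)**n over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += (1.0 - x ** p) ** n
    return total * h


def expected_nearest_distance_closed_form(p, n):
    """Same integral via the Beta function (substitute u = x**p):
    Gamma(1 + 1/p) * Gamma(n + 1) / Gamma(n + 1 + 1/p).
    Log-gamma is used to avoid overflow for large n."""
    a = 1.0 / p
    log_value = math.lgamma(1.0 + a) + math.lgamma(n + 1.0) - math.lgamma(n + 1.0 + a)
    return math.exp(log_value)


def median_nearest_distance(p, n):
    """Median distance to the nearest point, assumed form of (2.24)."""
    return (1.0 - 0.5 ** (1.0 / n)) ** (1.0 / p)


if __name__ == "__main__":
    p, n = 10, 500
    print("mean (midpoint rule):", expected_nearest_distance(p, n))
    print("mean (closed form):  ", expected_nearest_distance_closed_form(p, n))
    print("median from (2.24):  ", median_nearest_distance(p, n))
```

The two values for the mean should agree to several decimals; comparing them with the median gives a feeling for how skewed the distribution of X is.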

Some solutions

Solutions will usually not be distributed for exercises that are discussed at the exercise meetings (Tuesdays 13.15-14). Some solutions will nevertheless be given here, for example when there was not enough time to go through them in class.

2013-04-22, Bo Henry Lindqvist