## Assignments

* Week 4: ELEMENTS pages 39-40: Ex. 2.1, 2.2, 2.3, 2.5 (optional; you may obtain the results in a way different from the one sketched and hinted at, using your knowledge of multivariate analysis), 2.7 (Hint: You will need the formulas $E_{X,Y}=E_XE_{Y|X}$ and $\mathrm{Var}_{X,Y} = E_X\mathrm{Var}_{Y|X} + \mathrm{Var}_X E_{Y|X}$; see Casella & Berger: Statistical Inference), 2.8 (This is an R exercise. Note that the data are available from the book website http://www-stat.stanford.edu/~tibs/ElemStatLearn/. You may use the function 'knn' in R). INTRODUCTION page 53: Ex. 7.

* Week 5: Ex. 3.2 (you may drop the simulation, but you are encouraged to do it). Ex. 3.3 (a). Ex. 3.6 (see also p. 64, second-to-last paragraph). Extra exercises: Prove that (3.44) really minimizes (3.43). In (2.24), compute instead the expected value of the distance X from the center of the ball (the origin) to the nearest point. Evaluate the expression for p=10, N=500. (Hint: Show first that the expected value can be written as $\int_0^1 P(X>x) dx = \int_0^1 (1-x^p)^N dx$.)

* Week 6: Pages 95-97: Ex. 3.12, 3.13, 3.14, 3.30. Extra exercise: Redo the analysis in Table 3.3 for PCR and PLS, but varying the number M of components. Discuss how the results differ for different numbers of components. Hint: Some code to use for this is given here (from course STK4030, fall 2008, Univ. of Oslo).

* Week 7: Pages 135-137: Ex. 4.1 (Hints and extension: Go through the page Maximization Result (copied from Johnson & Wichern: Applied Multivariate Statistical Analysis). First show how this is used in principal component regression. Then show how it can be used to solve the generalized eigenvalue problem of Fisher, leading to the canonical variates, p. 116). Ex. 4.2 (you need not do (d)). Ex. 4.5. Ex. 4.9 (you may use the function mahalanobis in R).

* Week 8: Trial exam. You may also consider exercises on pages 181-185: Ex. 5.1, Ex. 5.7 (Recommended! If you do not solve (a), you may still solve (b) by using the result in (a), and you may solve (c) by using the result in (b)). Ex. 5.13 (download hint).

* Week 9: Pages 216-218: Ex. 6.1, Ex. 6.2, Ex. 6.3 (Challenge?), Ex. 6.5. Extra: Find out what is done by the R-function 'loess'. Use it with data you simulate yourself from the model used in Figure 6.1. Extra exercise on df.

* Week 10: Pages 257-259: Ex. 7.2 [Hint: For both (7.62) and (7.63) you may look separately at f(x0) > 1/2 and f(x0) < 1/2. Also be aware that (X,Y) is independent of the training set which determines f-hat], Ex. 7.7, Ex. 7.10. Extra 1: Derive the expression (7.12) on page 224. Extra 2: Compare the expression (7.26) for Cp (page 230) with the expression for Cp used in the book by Walpole, Myers and Ye (used in the basic statistics courses at NTNU).

* Week 11: INTRODUCTION TO, pages 332-333, Exercises 8.4: 1, 4, 5, 6.

* Week 12: Exercises from last week, plus Page 336 (in Elements): Ex. 9.5 (a) and (e).

* Week 13: INTRODUCTION TO, pages 368-369, Exercise 3. ELEMENTS page 455: Ex. 12.1. Extra exercise: (a) ELEMENTS p. 307 gives the recipe for obtaining a split of a regression tree. A corresponding recipe for classification trees is, however, not completely given in Section 9.2.3. Write down such a recipe for classification trees in detail. What are now the possible versions of the criterion (9.13)? (b) Step 2(a) of Algorithm 10.1 on page 339 in ELEMENTS asks for a classifier obtained from the training data using weights w_i. How would you modify your recipe from (a) above to accommodate this?

* Week 14: INTRODUCTION TO, Section 8.3.4 (p. 330) and Section 8.3.2 (p. 327). Do the described R-sessions.
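
For the Week 5 extra exercise, the integral in the hint can be checked numerically once it has been derived. The following is a minimal Python sketch, not a course-provided solution: the function names `expected_nearest_distance` and `simulated_nearest_distance` are hypothetical, and the simulation assumes the setting of (2.24), that is, N points uniformly distributed in the p-dimensional unit ball, so that the distance of a single point from the origin has distribution function $r^p$ on $[0,1]$.

```python
import math
import random


def expected_nearest_distance(p, n_points, steps=100_000):
    """Midpoint-rule approximation of the integral of (1 - x**p)**n_points over [0, 1]."""
    h = 1.0 / steps
    # Evaluate the integrand at the midpoint of each of the `steps` subintervals.
    values = ((1.0 - ((i + 0.5) * h) ** p) ** n_points for i in range(steps))
    return h * math.fsum(values)


def simulated_nearest_distance(p, n_points, reps=2000, seed=0):
    """Monte Carlo estimate of the mean distance from the origin to the nearest point.

    Assumption: a uniform point in the unit p-ball has radius with CDF r**p,
    so a radius can be drawn as U**(1/p) with U uniform on [0, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += min(rng.random() ** (1.0 / p) for _ in range(n_points))
    return total / reps


if __name__ == "__main__":
    print("numerical integral:", expected_nearest_distance(10, 500))
    print("simulation average:", simulated_nearest_distance(10, 500))
```

The two printed numbers should agree up to simulation error, which gives a quick sanity check on the derivation; a similar check can be done in R with 'integrate' and 'runif'.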

## Some solutions

Solutions will usually not be distributed for exercises that are discussed at the exercise meetings (Thursdays 09.15-10). Some solutions will nevertheless be given here, for example when there was not enough time to go through them in class.