TMA4180 Optimization I, Spring 2018

The course description, including the learning objectives, schedule, and exam info can be found here.

Messages

• 08.06: Here are possible solutions to the today's exam. Please let me know if you find any mistakes or if you have any questions.
• 04.06: I will be available for Q&A on 06.06 between 15-17 and 07.06 between 08-10 through a videoconferencing setup. This (unfortunately) requires installing a small client or browser plugin for Vidyo: I had success with my Android phone and Safari under OSX (but not Chrome under OSX; Firefox seems to be unsupported), but the company claims that they are quite cross-platform etc. I will publish the link to the conference on BlackBoard to avoid getting calls from strangers. I have never tried this particular setup, or videoconferencing with many people, so let us see how this works out.
• 17.05: I have finally looked through your submissions of project 2, and with very few exceptions they are very good. I have scanned my handwritten notes and put them here. Please email me if you have any questions.
• 14.05: I have booked F6 on 28.05 for Q&A from 13:15.
• 03.05: There will be a Q&A possibility on 28.05 tentatively 13-15. I will book an auditorium and provide more details later. I will also strive for setting up another possibility closer to the exam, around 06.06.
• 02.05: Here is the curriculum with topics as they are ordered in the textbook. This is not the order in which they have been introduced in this course (e.g. merit functions are discussed in SQP context; barrier methods were discussed together with penalty methods, etc)
• Introduction: Chapter 1
• Existence of solutions: Section 2.1, note "Minimisers of Optimisation Problems" [https://wiki.math.ntnu.no/_media/tma4180/2017v/note1.pdf]
• Optimality conditions, unconstrained case: Section 2.1
• Convex sets and functions: note "Basic Properties of Convex Functions" [https://wiki.math.ntnu.no/_media/tma4180/2017v/note2.pdf]
• Line search methods: Sections 3.1-3.3, 3.5. Note: [https://wiki.math.ntnu.no/_media/tma4180/2018v/note03.pdf]
• Trust region methods: Sections 4.1, 4.3 (mostly overview)
• CG, linear/non-linear: Chapter 5
• Quasi-Newton methods: Sections 6.1, 6.2, 6.4-BFGS only
• Large scale problems: Section 7.1 (Inexact Newton, local convergence, linesearch Newton-CG), 7.2 (Limited memory BFGS)
• Derivative free methods: Section 9.5
• Least squares problems: Sections 10.1–10.3 (Gauss-Newton+convergence)
• Theory of constrained optimization: Sections 12,12.1-12.5(Second order conditions), 12.6 (other constraint qualifications). This note [https://wiki.math.ntnu.no/_media/tma4180/2018v/note12.pdf]
• Linear programming: Section 13, 13.1, 13.2 and this note [https://wiki.math.ntnu.no/_media/tma4180/2018v/note16.pdf]
• Interior point methods for LP: 14.1-14.2
• Fundamentals of algorithms for non-linear constrained optimization: 15.4 (merit functions), 15.5
• Quadratic programming: 16.1, 16.4, 16.5
• Penalty and augmented Lagrangian: 17.1-17.3
• SQP: 18.1-18.4 (overview)
• Barrier method for non-linear programming: 19.6
• 25.04:
• Maybe it was not clear from the previous message: you are welcome to stop by my office if you want to get feedback for project 1.
• The barrier method for project 2 is described in Section 19.6 in N&W. When you solve the barrier subproblem using a linesearch type algorithm, check that the step you are about to take is "strictly" feasible (otherwise the barrier function will involve logarithms of 0/negative numbers, and who knows what this results in). This also means that the steplengths may be bounded by the proximity to the boundary, and perhaps too short for a Wolfe-type curvature condition to hold. This of course has implications for CG or quasi-Newton type methods: it may be necessary to reset the method back to steepest descent if the search direction is too bad (CG), or skip the updates (quasi-Newton). Please document the decisions you take in your report. It is much better to have a single well documented and tested method than several half-baked ones - less is more.
• I will be out of office during week 18. Please contact Ola (or me by email) if you have questions/need help with project 2.
• 17.04:
• I have now (finally) gone through your hand ins of project 1. The quality of the answers to questions 1 is very diverse. The rest of the questions have mostly been answered. With respect to the implementations: many of you used non-linear CG, such as FR method, with a linesearch which does not guarantee strong Wolfe conditions. As a result the method does not necessarily generate descent directions. You should have at least checked for this and reset the direction to the steepest descent direction in this case. A few of you have used Gauss-Newton method, which is OK as long as one understands that the Jacobi matrix is not defined in all points and is discontinuous. However it is not OK to use inv() to solve the normal equations!!! Use Cholesky, or CG or something similar. Please feel free to pop in and discuss the results, maybe I have misunderstood/overlooked something.
• I will not be readily available for help with project 2 during the week 18 (i.e. the last week before the deadline). Please plan accordingly.
• 09.04:
• There will be no lectures weeks 15-16. I will be available in F6/F2 during the usual times for help with the project. I have also pre-allocated the slot on 23.04 for Q&A.
• I have also written a couple of explanations to the project 2 here
• 02.04: Project 2 has been published.
• 22.02:
• I am available for questions in 1038/SBII from 10-12 and 13:15-??. I guess I have forgotten to write on 23.02, but oh, well…
• Regarding the finite difference check in the project: I am not thinking of anything fancy. Here is an example
import numpy as np

def f(x):
# f = sin(x0) + exp(x1*x2*...)
x0 = x[0]
x1 = x[1:]
return np.sin(x0)+np.exp(np.prod(x1))

def df(x):
# grad f = [cos(x0)], exp(x1*x2*...)*x2*x3*..., ...]
g = np.zeros_like(x)
x0 = x[0]
x1 = x[1:]
g[0] = np.cos(x0)
for i in range(0,len(x1)):
g[i+1] = np.exp(np.prod(x1)) * np.prod(x1[np.arange(len(x1)) != i])
return g

if __name__=='__main__':
N = 10
# generate random point and direction
x = np.random.randn(N)
p = np.random.randn(N)
f0= f(x)
g = df(x).dot(p)
# compare directional derivative with finite differences
for ep in 10.0**np.arange(-1,-13,-1):
g_app = (f(x+ep*p)-f0)/ep
error = abs(g_app-g)/abs(g)
print('ep = %e, error = %e' % (ep,error))

which produces output like

ep = 1.000000e-01, error = 2.352167e-02
ep = 1.000000e-02, error = 2.300484e-03
ep = 1.000000e-03, error = 2.294075e-04
ep = 1.000000e-04, error = 2.293424e-05
ep = 1.000000e-05, error = 2.293362e-06
ep = 1.000000e-06, error = 2.301938e-07
ep = 1.000000e-07, error = 1.813962e-08
ep = 1.000000e-08, error = 2.890379e-08
ep = 1.000000e-09, error = 9.976792e-07
ep = 1.000000e-10, error = 2.231572e-06
ep = 1.000000e-11, error = 1.161743e-04
ep = 1.000000e-12, error = 8.532599e-06
• 20.02: These days, a national student survey (Studentenes Helse- og Trivselsundersøkelse 2018) is taking place. In total, 160.000 Norwegian students receive this survey, which focuses on psychosocial issues. Please participate! Link to the survey: https://studenthelse.no/heltaerlig
• 15.02: Office hours of course assistant, Ola Mæhlen; Thursdays 14:00-15:00, 1056/SBII.
• 12.02: Project 1 has been published.
• 01.02: Someone asked the question today regarding the behaviour of Fletcher-Reeves method under the assumption that $\nabla f(x_k)$ and $p_k$ are nearly orthogonal - I claimed that $\|\nabla f(x_k)\|$ is then much smaller than $\|p_k\|$. This is basically the consequence of Lemma 5.6 and is explained in equations (5.56)-(5.57) in the book.
• 15.01: Markus Grasmair will give a lecture on convex sets/functions today.
• 08.01: Welcome to the course! There are no exercises today, 08.01. The first exercise session happens 15.01. I am going to need three volunteers for the reference group (you can read about it here). Please write me an email if you would like to affect the development of this course.