Adjoint models
Introduction
In an introductory presentation I talk about my background, and how we use adjoint models in oceanography, in particular within the Estimating the Circulation and Climate of the Ocean (ECCO) project (with an aside on sea level) to improve our understanding of the global ocean circulation (yet another heavily undersampled system). For a recent semi-popular overview of ECCO, see Wunsch et al. (2009) ^{[1]}.
Some background: why adjoint models are good for you
Adjoint models come in very handy if you wish to compute the derivatives of a scalar-valued output function with respect to many(!) input variables, i.e. when seeking a high-dimensional gradient. Two applications that come readily to mind are

Sensitivity analysis (an example):
You would like to know how Greenland or Antarctic total ice sheet volume (a scalar-valued model diagnostic) changes when perturbing basal sliding, or basal melt rate, or geothermal flux, or precipitation, or the initial temperature field at any(!) grid point in your domain. Not only do you need to compute the derivative with respect to each of these variables, but each of these variables also spans a two- or three-dimensional space. Recently, we have applied the adjoint of the three-dimensional ice sheet model SICOPOLIS to Greenland ice sheet volume sensitivities ^{[2]}.
Optimal control, inverse modeling, parameter or state estimation, data assimilation:
The general method here is to fit a model to a given set of observations by adjusting a set of uncertain variables (so-called control variables), which can be model parameters, surface/basal/lateral boundary conditions, or initial conditions. Often these problems are solved iteratively via gradient-based optimization, i.e. the gradient of a least-squares model vs. data misfit function is sought and used in conjunction with descent methods, such as the conjugate gradient method or (quasi-)Newton methods, to reduce the misfit. A classic paper which introduces control methods to glaciology is by Douglas MacAyeal (1993) ^{[3]}.
Both examples show that the gradient of the model is a key ingredient.
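To make the second application concrete, here is a minimal Python sketch of a gradient-based least-squares fit. The straight-line "model", the synthetic observations, the fixed step size and all names are assumptions chosen purely for illustration; plain steepest descent stands in for the conjugate gradient or Newton-type methods one would normally use.

```python
# Hypothetical toy problem: fit y = a*t + b to synthetic observations.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
obs = [2.0 * t + 1.0 for t in times]   # synthetic data generated with a = 2, b = 1


def misfit_and_gradient(a, b):
    """Return J = 0.5 * sum(r**2) and its gradient with respect to (a, b)."""
    residuals = [a * t + b - y for t, y in zip(times, obs)]
    cost = 0.5 * sum(r * r for r in residuals)
    # The model is linear in (a, b), so the gradient is the transposed
    # model Jacobian applied to the residuals: a hand-written adjoint.
    grad_a = sum(r * t for r, t in zip(residuals, times))
    grad_b = sum(residuals)
    return cost, (grad_a, grad_b)


a, b = 0.0, 0.0   # first guess for the two control variables
step = 0.05       # fixed step size, small enough to be stable for this problem
for _ in range(500):
    cost, (grad_a, grad_b) = misfit_and_gradient(a, b)
    a -= step * grad_a
    b -= step * grad_b

final_cost, _ = misfit_and_gradient(a, b)
print(a, b, final_cost)
```

With only two controls the gradient could just as well be obtained by perturbing each control in turn; the adjoint form pays off when the control vector has thousands or millions of elements.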
In the following we shall be concerned with
 making clear why the adjoint (and not the tangent linear) model is what we want;
 showing how to obtain an adjoint for a complicated model, given in the form of a Fortran code, using automatic differentiation (AD).
A very simple example
The following simple example should help explain some very basic issues of adjoint models:
Some algebra
These notes show how the gradient of the example function can be computed via the tangent linear or the adjoint model. They discuss how things change when the control space is changed from the initial conditions (x_1, x_2) to the model parameters (a, b), and why the notion of "the" adjoint model is ill-conceived.
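The core of the argument can be summarised in a few lines. The notation below is an assumption made for this summary and need not match the linked notes: a model $\mathcal{M}$ maps an $n$-dimensional control vector $x$ to a state $y$, $J$ is a scalar-valued cost, and $\mathbf{M}' = \partial \mathcal{M} / \partial x$ is the model Jacobian.

```latex
\begin{align}
  y &= \mathcal{M}(x), \qquad J = J(y), \\
  % tangent linear model: one run per perturbation direction \delta x
  \delta J &= (\nabla_y J)^{T} \, \mathbf{M}' \, \delta x , \\
  % adjoint model: one run yields all n gradient components
  \nabla_x J &= \mathbf{M}'^{\,T} \, \nabla_y J .
\end{align}
% For a model composed of N steps, \mathcal{M} = \mathcal{M}_N \circ \dots \circ \mathcal{M}_1:
\begin{align}
  \mathbf{M}' &= \mathbf{M}'_N \cdots \mathbf{M}'_1
  && \text{(tangent linear: same order as the forward model)} \\
  \mathbf{M}'^{\,T} &= \mathbf{M}'^{\,T}_1 \cdots \mathbf{M}'^{\,T}_N
  && \text{(adjoint: reverse order)}
\end{align}
```

The tangent linear model returns $\delta J$ for a single direction $\delta x$, so the full gradient requires $n$ runs, one per unit vector. The adjoint model propagates the single vector $\nabla_y J$ backwards and delivers all $n$ components at once. Changing the control space changes $\mathbf{M}'$ and hence the adjoint, which is why there is no single adjoint of a given forward code.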
The corresponding Fortran code
Example codes, together with scripts to invoke the AD tool and to compile, are contained in this tarfile. Download it, place it in your home directory (~/) and untar via
# uncompress, untar in your home directory
cd ~
tar xzf Adjoint_example.tar.gz
# check simple README
more adjoint_README
# cd to simple example code, using x_1, x_2 as controls
cd ~/adjoint_example/simple_function/control_init/
# cd to simple example code, using a, b as controls
cd ~/adjoint_example/simple_function/control_param/
The simple tangent linear model
A more complex example: Kees' assignment of a mountain glacier model
Earlier in the school we developed a model for a mountain glacier (see Kees' assignment). Each of the six teams came up with a solution. From these I chose the Team 2 Solution (because it was posted on the wiki and because it is very compact) and used AD to generate a tangent linear model (TLM) and an adjoint model (ADM).
I formulated the following control problem:
 dependent variable / cost: the "total volume", i.e. the ice thickness summed over all grid points
 independent variable / control: a perturbation in the mass balance at any grid point
Here are codes for the slightly modified original code, the adjoint code, the tangent linear code, and the driver routine. The latter calculates the gradient of total volume with respect to changes in mass balance at each point using (1) the adjoint model, (2) the tangent linear model, and (3) finite-difference (f.d.) perturbations. The f.d. calculation serves to test the results calculated via AD-generated code.
Remember that we have to run the TLM and the f.d. model 31 times, corresponding to the number of grid points, i.e. the dimension of the control space, whereas we have to run the ADM only once (check this carefully in the driver program). Also, the f.d. calculation uses only the original forward code; it does not rely on any of the AD-generated codes.
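The logic of such a driver can be sketched in a few lines of Python. This is not the Team 2 glacier code and not AD-generated: the toy nonlinear diffusion model, its parameters, and all function names are assumptions, and the tangent linear and adjoint routines are written by hand. Only the structure matches the description above: 31 tangent linear runs, 31 perturbed forward runs, and one adjoint run.

```python
import math

N, NSTEPS, DT = 31, 50, 0.01   # grid points, time steps, time step (all assumed)


def lap(v):
    """Second difference with zero values outside the domain (a symmetric operator)."""
    n = len(v)
    return [(v[i - 1] if i > 0 else 0.0) - 2.0 * v[i]
            + (v[i + 1] if i < n - 1 else 0.0) for i in range(n)]


def initial_state():
    return [1.0 + 0.5 * math.sin(math.pi * i / (N - 1)) for i in range(N)]


def forward(b, store=False):
    """Toy forward model: h <- h + DT*(lap(h**2) + b); cost J = sum of final h."""
    h, traj = initial_state(), []
    for _ in range(NSTEPS):
        if store:
            traj.append(h)          # states needed later by the adjoint sweep
        l = lap([x * x for x in h])
        h = [h[i] + DT * (l[i] + b[i]) for i in range(N)]
    return sum(h), traj


def tangent_linear(b, db):
    """Propagate one perturbation direction db forward; returns dJ."""
    h, dh = initial_state(), [0.0] * N
    for _ in range(NSTEPS):
        l = lap([x * x for x in h])
        dl = lap([2.0 * h[i] * dh[i] for i in range(N)])
        dh = [dh[i] + DT * (dl[i] + db[i]) for i in range(N)]
        h = [h[i] + DT * (l[i] + b[i]) for i in range(N)]
    return sum(dh)


def adjoint(b):
    """One forward run plus one reverse sweep; returns the full gradient dJ/db."""
    _, traj = forward(b, store=True)
    adh, adb = [1.0] * N, [0.0] * N      # adh starts as dJ/dh at the final time
    for h in reversed(traj):
        la = lap(adh)                    # lap is symmetric, so it is its own transpose
        adb = [adb[i] + DT * adh[i] for i in range(N)]
        adh = [adh[i] + DT * 2.0 * h[i] * la[i] for i in range(N)]
    return adb


def unit(i):
    e = [0.0] * N
    e[i] = 1.0
    return e


b0 = [0.0] * N
grad_adm = adjoint(b0)                                   # 1 adjoint run
grad_tlm = [tangent_linear(b0, unit(i)) for i in range(N)]   # N tangent linear runs
j0, eps = forward(b0)[0], 1.0e-6
grad_fd = [(forward([b0[k] + eps * (k == i) for k in range(N)])[0] - j0) / eps
           for i in range(N)]                            # N perturbed forward runs
print(max(abs(x - y) for x, y in zip(grad_adm, grad_tlm)))
print(max(abs(x - y) for x, y in zip(grad_adm, grad_fd)))
```

The adjoint and tangent linear gradients should agree to rounding error, while the finite-difference gradient agrees only to the accuracy allowed by the perturbation size. Note that the adjoint sweep needs the stored forward trajectory because the model is nonlinear.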
Mountain glacier forward model (very slightly modified from the Team 2 Solution)
Mountain glacier adjoint model
Mountain glacier tangent linear model
Mountain glacier driver routine
Mountain glacier sensitivity result
Adjoint variables and Lagrange multipliers: some algebra
Adjoint methods are synonymous with what is better known in many fields as Lagrange multiplier methods. In a section on Lagrange multipliers and adjoints we spend some time developing their relationship. A more complete treatment can be found, e.g., in Wunsch (2006) ^{[4]} or MacAyeal and Barcilon (1998) ^{[5]}. The paper by Giering and Kaminski (1998) ^{[6]} offers a perspective based purely on application of the chain rule.
An (incomplete) list of AD tools
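The connection can be illustrated for a generic time-stepping model. The notation here is an assumption for the purpose of this sketch and is not taken from the references: the state obeys $x_k = \mathcal{M}(x_{k-1})$ for $k = 1, \dots, N$, the cost $J$ depends on the final state $x_N$ only, and the initial state $x_0$ is the control.

```latex
\begin{align}
  % Lagrangian: cost plus model equations enforced by multipliers \lambda_k
  L &= J(x_N) - \sum_{k=1}^{N} \lambda_k^{T}
       \left[ x_k - \mathcal{M}(x_{k-1}) \right] \\
  % stationarity with respect to \lambda_k recovers the forward model
  \frac{\partial L}{\partial \lambda_k} = 0
    &\;\Rightarrow\; x_k = \mathcal{M}(x_{k-1}) \\
  % stationarity with respect to the final state gives the terminal condition
  \frac{\partial L}{\partial x_N} = 0
    &\;\Rightarrow\; \lambda_N = \nabla J(x_N) \\
  % stationarity with respect to intermediate states gives the adjoint model
  \frac{\partial L}{\partial x_k} = 0
    &\;\Rightarrow\; \lambda_k =
       \left( \frac{\partial \mathcal{M}}{\partial x} \bigg|_{x_k} \right)^{T}
       \lambda_{k+1},
       \qquad k = N-1, \dots, 1 \\
  % what remains is the gradient with respect to the control
  \frac{dJ}{dx_0} = \frac{\partial L}{\partial x_0}
    &= \left( \frac{\partial \mathcal{M}}{\partial x} \bigg|_{x_0} \right)^{T}
       \lambda_1
\end{align}
```

Read in this way, the Lagrange multipliers $\lambda_k$ are the adjoint variables: they are integrated backwards in time with the transposed Jacobian, starting from the sensitivity of the cost to the final state.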
There are quite a few AD tools around. We list a few of them which we think have the significant reverse-mode capabilities that are necessary to generate adjoint models. A more complete forum on automatic differentiation is the Community Portal for Automatic Differentiation (autodiff.org). The authoritative textbook on AD is by Griewank and Walther (2008) ^{[7]}.
 TAF (Transformation of Algorithms in Fortran), developed by FastOpt, Hamburg, Germany
 OpenAD (open-source AD tool), developed at Argonne National Laboratory, near Chicago, IL
 Tapenade, developed at INRIA, Sophia Antipolis, France
References
 [1] Wunsch, C., P. Heimbach, R. Ponte, I. Fukumori and the ECCO-GODAE Consortium members, 2009: The global general circulation of the ocean estimated by the ECCO Consortium. Oceanography, 22(2), pp. 88-103.
 [2] Heimbach, P. and V. Bugnion, 2009: Greenland ice sheet volume sensitivity to basal, surface, and initial conditions, derived from an adjoint model. Annals of Glaciology, 50(52), pp. 67-80.
 [3] MacAyeal, D.R., 1993: A tutorial on the use of control methods in ice-sheet modeling. J. Glaciol., 39(131), pp. 91-98.
 [4] Wunsch, C., 2006: Discrete Inverse and State Estimation Problems: With Geophysical Fluid Applications. Cambridge University Press.
 [5] MacAyeal, D.R. and V. Barcilon, 1998: Finding Connections Between Data and Theory: Applications in Geophysical Sciences. University of Chicago, Chicago, IL.
 [6] Giering, R. and T. Kaminski, 1998: Recipes for adjoint code construction. ACM Transactions on Mathematical Software, 24, pp. 437-474.
 [7] Griewank, A. and A. Walther, 2008: Evaluating Derivatives. Principles and Techniques of Algorithmic Differentiation (2nd ed.). SIAM Frontiers in Applied Mathematics, Vol. 19, Philadelphia.