# How to make a model

My idea behind writing this wiki page and program example is to bring many elements of the course together in one big example. There is the actual numerical model, Kees' assignment, which needs to be tied to proper in- and output, done in netCDF (Representing_and_manipulating_data) and some plotting routine done with matplotlib. Actually putting all this together will generate quite a few lines of code and so take some time to do it (or work through it) but it is well worth it in the long run. I write this wiki page as I go along, documenting my worksteps. The project is under version control in (get it with svn checkout http://source.ggy.bris.ac.uk/subversion-open/subversion/trunk/group_3/mauro/a_full_model and I will give the revision numbers of the different steps. This means that you can get the code at that revision by typing svn updata revisio_number. (Sorry, but I only noticed after revision 89 that my folder is quite a mess, so the first few revisions contain quite a few unused files, don't get confused)

Looking back on my PhD some instructions like this could have been a great help for me (assuming they are understandable), saving me loads of time (e.g. trying to figure out what exactly were the model parameters producing this nice model output in file xy). There are some semi-advanced programming techniques used here, mainly object oriented stuff, so don't get scared by this. Well, hope this helps.

## Contents

### Source code

```svn checkout http://source.ggy.bris.ac.uk/subversion-open/subversion/trunk/group_3/mauro/a_full_model
```

# The reason The typical look of a directory after performing some runs of a numerical model.

# Components

Possible components of a numerical model (not complete list)

1. Actual model
1. Numerics to solve equations
2. Boundary conditions
3. Numerical implementation
2. Data handling
2. Storage
3. Model evaluation
1. Plotting
2. Fitting

# Model construction

I will use the shallow ice sheet model of [Kees' assignment] and build data-handling, input-output and plotting around it. I use the Fortran code of Team_3_Solution as the part doing the actual numerics (corresponding to the point 1 above) and use Python for point 2 and 3 above.

## Skeleton

revision 81

Let's start with building the skeleton in python:

### A class for the model

```"""
This is a sample class for setting up, running and output of a model.
In this instance a shallow ice sheet model.
"""

# standard libraries
import sys
import os

# numerics
import numpy as N
import pylab as P

class SIA_model(object):

def __init__(self, conf_or_netcdf_file):
"""The standard initialisation function.

@type conf_or_netcdf_file: string
@param conf_or_netcdf_file: A config file or a former netCDF
output file.
"""

if conf_or_netcdf_file[-3:]=='.nc':
# if a .nc file read the netCDF file
elif conf_or_netcdf_file[-4:]=='.CDL':
# if a .CDL file read the CDL file

"""
Read the configuration file of the model.
"""
pass

"""Reads all the input files:
- bed
- initial thickness
- mass balance distribution
"""
pass

"""
Reads a netCDF file (as saved by a previous model run) and
puts everything into place, such that the model run can be
contiued.
"""
pass

def write_netCDF(self):
"""
Write all the calcualted data and also metadata to a netCDF
file.
"""
pass

def run_model(self):
"""
Runs the model according to the setup gathered from
"""
pass

def plot_thickness(self, time):
"""
Plot the ice thickness at a particular time.

@type time: float
@param time: The time when to plot the solution.
"""
pass```

So this does very little so far but defines what needs to be done. Note that I use object oriented programming style. If you're not familiar with it here a brief summary what it does: a class instance is similar to a Fortran derived type (or a C struct, or a Matlab structure) but added to the data storage there are also associated functions (so called methods) which can operate on the data contained in a class instance. Now the class is the building plan for class instances and several such instances can be created of a class, each containing different data, however the methods stay the same. For some more in depth discussion of this, see e.g. .

Note: I use the epydoc library which helps creating documentation. If you wonder what the strange things like @param refer to that web site.

### Where to put the data

revision 88

The data and metadata of the program will be stored similarly to how the Scientific Python netCDF does it (Note: I don't like this module so much, but that's what we're using here in the summer school. I recommend , looks much the same but much better as it is compatible with numpy.). This will make it much easier to write the model data to a netCDF file afterwards.

A class for dimensions is defined:

```class Dimension(object):
"""
A class to hold a dimension.

Note that each (netCDF) variable has a number of dimensions
associated with it, i.e. a touple containing instances of this
class.  The dimension should have a closely associated variable,
I{coordinate variable} which needs to have the same name as the
dimension.
"""

def __init__(self, dimname, size):
"""
@type dimname: string
@param dimname: The standart name according to our
conventions U{file:../datanotes/standard_names.muse}.

@type size: int
@param size: The length of the coordinate variable, if set to None,
then an UNLIMITED dimension is assumed.
"""
if isinstance(size, netCDF3.Dimension):
self.from_netCDF3_dim(dimname, size)
return
self.name = dimname
self.size = size```

and a class for variables

```class Variable(object):
"""
A class to hold variable data and attributes, similar to the
C{Scientific.io.netcdf} variables.
"""

def __init__(self,varname, datatype='f8',  dimensions=(), attrs={}):
"""
@type varname: string
@param varname: Name of the variable.

@type datatype: string
@param datatype: One of:
- 'f4' (32-bit floating point)
- 'f8' (64-bit floating point)
- 'i4' (32-bit signed integer)
- 'i2' (16-bit signed integer)
- 'i1' (8-bit signed integer)
- 'S1' (single-character string)

@type dimensions: a touple of strings
@param dimensions: what dimensions are associated with this
variable, if () then its scalar.  The
UNLIMITED dimension should come first, e.g.:
('time', 'X', 'Y', 'Z')

@type attrs: dict
@param attrs: a dictionary holding all the attirbutes of the
variable.
"""
#: the variable name
self.name = varname
#: the datatype
self.datatype = datatype
#: netCDF CF-1.4 attributes
self.attrs = attrs

self.dimensions = dimensions
#: the acutal data in a numpy array (this could probably be done better)
self.data = N.array([])```

Each variable (at least non-scalar ones) will have one or more dimensions associated with it. See Representing_and_manipulating_data.

We modify the __init__ of the SIA_model class to take that into account:

```class SIA_model(object):
"""
The class doing all the model setup, running, output and some
plotting.
"""
def __init__(self, conf_or_netcdf_file):
"""The standard initialisation function.

@type conf_or_netcdf_file: string
@param conf_or_netcdf_file: A config file or a former netCDF
output file.
"""

if conf_or_netcdf_file[-3:]=='.nc':
# if a .nc file read the netCDF file
elif conf_or_netcdf_file[-4:]=='.CDL':
# if a .CDL file read the CDL file

#: dict holding the L{Dimension} instances
self.dimensions = {}
#: dict holding the L{Variable} instances
self.variables = {}
#: dict holding the global attributes
self.global_attrs = {}
.
.
.```

## Get input data into the program

### Where is our input data?

revision 89

To run the model (i.e. the Fortran code) we need to have input data:

• bed topography
• initial ice thickness
• Mass balance distribution

I get these from netCDF files which I write by hand(-ish) in the CDL text formatRepresenting_and_manipulating_data#CDL. I put here the example of the bed topography (note the use of the CF standard):

```netcdf bedrock_altitude_example {
dimensions:
X = 51 ;
variables:
double X(X) ;
X:standard_name = "X" ;
X:units = "m" ;
float bedrock_altitude(X) ;
bedrock_altitude:standard_name = "bedrock_altitude" ;
bedrock_altitude:units = "m" ;

// global attributes:
:comment = "Sample input bedrock" ;
:title = "Bedrock topography" ;
:Conventions = "CF-1.4";
data:

X =      0.,   1000.,   2000.,   3000.,   4000.,   5000.,   6000.,
7000.,   8000.,   9000.,  10000.,  11000.,  12000.,  13000.,
14000.,  15000.,  16000.,  17000.,  18000.,  19000.,  20000.,
21000.,  22000.,  23000.,  24000.,  25000.,  26000.,  27000.,
28000.,  29000.,  30000.,  31000.,  32000.,  33000.,  34000.,
35000.,  36000.,  37000.,  38000.,  39000.,  40000.,  41000.,
42000.,  43000.,  44000.,  45000.,  46000.,  47000.,  48000.,
49000.,  50000.;

bedrock_altitude =    0.        ,   -25.00000037,   -50.00000075,   -75.00000112,
-100.00000149,  -125.00000186,  -150.00000224,  -175.00000261,
-200.00000298,  -225.00000335,  -250.00000373,  -275.0000041 ,
-300.00000447,  -325.00000484,  -350.00000522,  -375.00000559,
-400.00000596,  -425.00000633,  -450.00000671,  -475.00000708,
-500.00000745,  -525.00000782,  -550.0000082 ,  -575.00000857,
-600.00000894,  -625.00000931,  -650.00000969,  -675.00001006,
-700.00001043,  -725.0000108 ,  -750.00001118,  -775.00001155,
-800.00001192,  -825.00001229,  -850.00001267,  -875.00001304,
-900.00001341,  -925.00001378,  -950.00001416,  -975.00001453,
-1000.0000149 , -1025.00001527, -1050.00001565, -1075.00001602,
-1100.00001639, -1125.00001676, -1150.00001714, -1175.00001751,
-1200.00001788, -1225.00001825, -1250.00001863 ;
}

```

### Reading the data into our program

revision 90

I made two small function to turn CDL into netCDF files and vice versa, they are in helper_module.py.

revision 91'

Now I need to write a routine to read netCDF file in, I do this with the Scientific-Python module, as discussed in Representing_and_manipulating_data#netCDF_and_python. I just make a small function, again in helper_module.py to do that.

Ok, all good, now I just need to stick that into the SIA_model class. Hmm, how would this be done best? In Glimmer they got these nice config files which specify all the input stuff and such. Let's do that.

# Side trip: config file

Config files are nice for humans and such, as they give an easy way of specifying with what parameters, input files, output files etc. we want to run the model a bit. I will try to make such a config file as a netCDF/CDL file as well. If this is possible, this will save me to write a parser that would parse my custom made config file. This solution is not quite ideal, but I'll try if it works.

So I put this togther:

```netcdf configuration {
dimensions:
// for the CDL file to be valid, we need to define a dummy dimension and variable.
dummy = 1;
variables:
double dummy(dummy);

// global attributes:
// these are the 'normal' global attributes
:comment = "Sample input bedrock" ;
:title = "Bedrock topography" ;
:Conventions = "CF-1.4";

// input: all input parameters are prepended by an input_
// paths are relative to one directory higher (cd ..)
:input_bed = "input_data/bed.nc";
:input_mass_balance = "input_data/mass_balance.nc";
:input_thick_IC = "input_data/thick_IC.nc";
// output
:output_filename = "output_data/test.nc";
// physical parameters
:g_grav = 9.81;    // gravitational acceleration
:rho_i = 920.;     // density of ice
:A_rate = 1e-16;   // ice rate factor
:n_glen = 3.;      // flow law exponent.

// model parameters:
// BC:
:BC_thickl = 0.;   // ice thickness at left edge of domain
:BC_thickr = 0.;   // ice thickness at right edge of domain
// numerical parameters
data:
dummy = _ ;
}
```

Note that I had to define a dummy variable to make this a valid netCDF/CDL file. I also put some stuff in which I thought will be used in the future. But the main thing is for now that I specified the input files.

Next steps:

• read the config file
• read the input files and put their stuff into the right place

## Reading the config file

revision 92'

I now (started to) write the method read_model_config:

```    def read_model_config(self):
"""
Read the configuration file of the model.
"""
tmp_file = 'tmp/config.nc'
# first remove old files from the tmp folder
os.system('rm '+ tmp_file)
# turn the CDL file into a .nc file
helper_module.CDL2nc(self.input_file, tmp_file)

## now comes the hard bit: fill the class with
## the specified data

# global attrs: just set them to what they are in the config
# file
self.global_attrs = config_file.__dict__

# read the input files and put their data into the appropriate
# place
for key in self.global_attrs:
if key.startswith('input_file'):
self.global_attrs[key])
# here we assume that the dimension (X) is the same for all
# input and of the same length.  Thus set the dimension
# to that length:
if self.dimensions=={}:
self.createDimension(
'X', input_file.dimensions['X'])
# and the corresponding coordinate variable
X = input_file.variables['X']
self.createVariable('X', X.typecode(),
dimensions=('X',), attrs=X.__dict__)
# and set the data
self.variables['X'].data = X[:]

# add the variable
for kk in input_file.variables.keys():
if not kk=='X':
var = input_file.variables[kk]
self.createVariable(kk, var.typecode(),
dimensions=('X',),
attrs=var.__dict__)
# and set the data
self.variables[kk].data = var[:]

# note that we just discarded the global attributs of
# the input files

# finally, close the config file
config_file.close()```

I also made some slight changes to __init__ and added two methods and , both mirroring Scientific.IO.NetCDF.

revision 93 Now let's see if our code runs so far, in the script test_sia_model.py I make an instance of the SIA_model class. This instance then automatically runs its method __init__ and get initialised. I then plot the bedrock profile.

```import sia_model as sia
import pylab as P

model = sia.SIA_model("input_data/config.CDL")

print model.variables['X'].data
print model.variables['bedrock_altitude'].data

P.plot(model.variables['X'].data,  model.variables['bedrock_altitude'].data)
P.show()```

### Plot the IC

revison 94

In fact, lets make a method of the SIA_model class which plots the IC, bedrock and mass balance distribution: plot_IC

```    def plot_IC(self):
"""
Plot the ice thickness at t0, together with the bed and mass
balance distribution.
"""
fig = P.figure()
ax1 = P.subplot(2,1,1)
P.plot(self.variables['X'].data,
self.variables['land_ice_thickness'].data +
self.variables['bedrock_altitude'].data , 'b',
label=self.variables['land_ice_thickness'].attrs['standard_name'])
P.hold(True)
P.plot(self.variables['X'].data,
self.variables['bedrock_altitude'].data, 'r',
label=self.variables['bedrock_altitude'].attrs['standard_name'])
P.xlabel(self.variables['X'].get_axis_label())
P.ylabel('Elevation')
P.legend()

ax1 = P.subplot(2,1,2)
P.plot(self.variables['X'].data,
self.variables['land_ice_surface_specific_mass_balance'].data)
P.xlabel(self.variables['X'].get_axis_label())
P.ylabel(self.variables['land_ice_surface_specific_mass_balance'].get_axis_label())```

I changed the script test_sia_model.py to do the plotting: so running python test_sia_model.py will make a plot.

# Running the numerics

There are a number of ways of how to run a Fortran program form python, e.g. os.system('path_to_fortran_prog'). However, such a simplistic approach makes passing data into and out of the Fortran program difficult. So, the strategy I'll adopt is to make the Fortran program callable from Python, this is explained in detail in f2py_example, so please refer to that.

## Changing the fortran code

revision 95 I changed the Fortran code a bit more such that all the model input can be passed to the python module. The API of the Fortran looks like this then:

```subroutine simple_ice_model(grid, dt, t_steps, n_t_num, t_out_ind, dx, & ! numerics:
! spatial grid size, time step, total number of time steps, total number of output times, index of output times, spatial discretisation step
rho, g_grav, A_rate, n_glen, & ! physical constants
xx, mass_b, bed, & ! input on grid: the grid, mass balance, bed elevation
thick, & !IC on grid: initial ice thickness
thickr, thickl, & ! BC: thickness at left and right
thick_out) ! output: the thickness at the specified time steps```

The whole program and all the other bits can be found in the folder <tt>fortran. The signature file of the module can be made with python make_signature_file.py. It (kees_simple_ice_model.pyf) needs to be edited: comment out all the subroutines of the kees_simple_ice_model_modules.pyf. Make the python module with python make_module.py.

### Calling the Fortran from Python

To call the Fortran module from Python it needs to be imported:

```sys.path.append('fortran')
import kees_simple_ice_model```

It can then be called thus:

`kees_simple_ice_model.simple_ice_model(list of arguments)`

So let's call it, I do this in the method run_model of the class SIA_model. So for now it looks like:

revison 96

```    def run_model(self):
"""
Runs the model according to the setup gathered from
"""
##      Calling sequence:
#         subroutine simple_ice_model(grid, dt, t_steps, n_t_num, t_out_ind, dx, & ! numerics:
#                                     ! spatial grid size, time step, total number of time steps, total number of output times, index of output times, spatial discretisation step
#                                     rho, g_grav, A_rate, n_glen, & ! physical constants
#                                     xx, mass_b, bed, & ! input on grid: the grid, mass balance, bed elevation
#                                     thick, & !IC on grid: initial ice thickness
#                                     thickr, thickl, & ! BC: thickness at left and right
#                                     thick_out) ! output: the thickness at the specified time steps
X = self.variables['X']
dx = X-X
time = self.variables['time']
output_thickness = kees_simple_ice_model.simple_ice_model(
len(X), self.g_attrs['dt'], len(time), xxx, dx,
self.g_attrs['rho'], self.g_attrs['g_grav'], self.g_attrs['A_rate'], self.g_attrs['n_glen'],
X, self.variables['land_ice_surface_specific_mass_balance'], self.variables['bedrock_altitude'],
self.g_attrs['BC_thickl'], self.g_attrs['BC_thickr'] )```

Well, there are still a few things amiss:

1. not all numerical stuff is defined in the config file
2. make a time dimension and variable
3. need to calcualte xxx (above): the indices of the time step when output should be written to the output variable.

#### 1. config file

revision 97

We need include some more data in the config file, so added this:

```// numerical parameters
:dt = 0.1;         // the time step inside the solver
// these are for constructing the output time variable
:t_start = 0.;      // start time
:t_final = 1000.;   // final time
:t_output_dt = 10.; // time interval between output times
:t_units = 'a';     // units of the time variable
```

### 2. & 3.

And the method read_model_config needs updating, mainly the time variable needs to be created:

```    def read_model_config(self):
"""
Read the configuration file of the model.
"""
.
.
.

## make the time dimension and variable:
time = N.arange(self.g_attrs['t_start'], self.g_attrs['t_final'],
self.g_attrs['t_output_dt'])
self.createDimension('time', len(time))
time_attrs = {'units': self.g_attrs['t_units'], 'standard_name': 'time'}
self.createVariable('time', 'd', ('time',), time_attrs)

## make the output variable
thick_attrs = {'units': 'm', 'standard_name': 'land_ice_thickness'}
self.createVariable('thick', 'd', ('time', 'X'), thick_attrs)

# for the fortran module to know when ot output we make t_out_ind:
# the indices of the time steps (dt) when to output
time_steps_between_output = N.round(self.g_attrs['t_output_dt']/
self.g_attrs['dt'])
index_of_last_output = N.round(self.g_attrs['t_final']/
self.g_attrs['dt'])
index_of_first_output = 0
self.t_out_ind = range(index_of_first_output, index_of_last_output,
time_steps_between_output)```

## Some debugging

I now tried to run the model, but it ain't working: this is a bug, in fact several. Which are fixed in revision 98. Use a difference tool to find out where things changed.

## Plot solution

revision 99

So I add a method to plot:

```    def plot_thickness(self, time):
"""
Plot the ice thickness at a particular time on top of the bed.

@type time: float
@param time: The time when to plot the solution.
"""
# get what index is the solution at
ind = N.where(time>self.variables['time'].data)[-1]

fig = P.figure()
P.plot(self.variables['X'].data,
self.variables['bedrock_altitude'].data +
self.variables['thick'].data[:,ind].T, 'b')
P.hold(True)
P.plot(self.variables['X'].data,
self.variables['bedrock_altitude'].data, 'r',
label=str(self.variables['bedrock_altitude'].attrs['standard_name']))
P.xlabel(self.variables['time'].get_axis_label())
P.ylabel('Elevation')
P.legend('at time %.0f' % time)```

And run it, plots a solution, but not the right one (all zero)!

## Some more debugging

In the Fortran code, in the integration loop, the condition when to do output was wrong. I now implement this a bit differently, as well as BC handling was wrong:

```.
.
.
if (mod(ii-1, n_t_num)==0) then
thick_out(:,iii) = thick
times_out(iii) = time
iii = iii+ 1
end if
.
.
.
! BC
thick(1) = thickl
thick(grid) = thickr
.
.
.```

And with this there is a slight change in the read_model_config.

## Plotting results

revision 100

Now running the script test_sia_model.py produces the almost right plots. Notice that plotting the IC shows a glacier even though it should be zero: apparently the variable passed to the Fortran module is overwritten there in place. Let's send a copy to the Fortran module, this is done with copy Python module. revions 101

# Output

Cool, what I got so far:

• model setup with a config file
• input of model from netCDF files
• numerics done in Fortran which is used in Python as a module
• plotting of results

revision 104 So, I still have to do output to netCDF files. This is done in the (now still empty) method write_netCDF. The data is already in a quite nice order, so this should not be too difficult:

```    def write_netCDF(self):
"""
Write all the calcualted data and also metadata to a netCDF
file.
"""
# open the file specified in the config file
netCDFfile = helper_module.open_netCDF_into_python(
self.global_attrs['output_filename'], 'w')
# now follow U{http://gfesuite.noaa.gov/developer/netCDFPythonInterface.html}

## global attributs
for kk in self.global_attrs:
# this doggy stuff needs to be done because of Sientific
# using Numeric arrays and not numpy arrays
try:
setattr(netCDFfile, kk, Numeric.array(self.global_attrs[kk]))
except ValueError:
setattr(netCDFfile, kk, str(self.global_attrs[kk]))

## first the dimensions
for kk in self.dimensions:
netCDFfile.createDimension(kk, self.dimensions[kk].size)
## now the variables
for kk in self.variables:
var = self.variables[kk]
for kkk in var.attrs:
# see comment above
try:
setattr(netCDFfile, kkk, Numeric.array(var.attrs[kkk]))
except ValueError:
setattr(netCDFfile, kkk, str(var.attrs[kkk]))
netCDFvar = netCDFfile.createVariable(kk, var.datatype, var.dimensions)
# fill in the data
netCDFvar[:] = var.data
## write to disk
netCDFfile.sync()```

Hey, pretty easy. (Remember what they said how they do it in Glimmer: in Fortran it takes so much code to write that they came up with a Python program to autogenerate the Fortran code. No need for that here.)

Note: I did some more bug fixing, both in the Fortran and python: some length of variables was not quite right, etc, the usual. See with some diff tool what was buggy.

# Version control

revision 105

It would be pretty neat and useful, if with every stored netCDF file a version number of our program is stored. This way one can go back to the version a file was produced, as often data-files of later versions will not be compatible with earlier ones.

```    def get_svn_verison_number(self):
"""Return the version of the current svn revision.

@note: To actually show the revision number a update needs to
be done sometime before.
"""

status, output = commands.getstatusoutput("svn info")
ll = output.split()
ind = ll.index('Revision:')
return ll[ind+1]```

And I include it in the netCDF output by having this line in __init__:

`        self.global_attrs['svn_revison'] = self.get_svn_verison_number()`

Of course, take care to check in the source before running the model. This should probably also be automated, in some model runner script: another thing to be done.

# Makefile

We should have one doing:

• compiling the Fortran module
• make the documentation

# Documentation

I used some special formatting in the doc-strings. This can be processed by the epydoc program to generate html files which describe the program. Need to figure how to include this in this wiki.

# Other stuff which could be included

• make a script doing model runs, systematically sia)runner.py
• hot start capability: i.e. start a model run with input from a former one
• more numerics: fitting, Bayesian, adjoint and such stuff
• data assimilation: the input will not always be on the grid where our computation is
• automated code testing (e.g. unit testing)
• etc

• EPYDOC