FADBAD++


Flexible Automatic differentiation using templates and operator overloading in C++

Introduction:

FADBAD++ implements the forward, backward and Taylor methods utilizing C++ templates and operator overloading. These AD-templates enable the user to differentiate functions that are implemented in arithmetic types, such as doubles and intervals. One of the major ideas in FADBAD++ is that the AD-template types also behave like arithmetic types. This property of the AD-templates enables the user to differentiate a C++ function by replacing all occurrences of the original arithmetic type with the AD-template version. This transparency of behavior also makes it possible to generate high order derivatives by applying the AD-templates on themselves, enabling the user to combine the AD methods very easily.

Authors:

FADBAD++ was developed by Ole Stauning and Claus Bendtsen. Send any comments and questions to <info@fadbad.com>. We are also very interested in hearing if you have used FADBAD++ in your own application(s).

Copyright:

FADBAD++ is copyright 1996-2007 Claus Bendtsen and Ole Stauning. You are free to use FADBAD++ for non-commercial purposes. You should contact the authors if you wish to use FADBAD++ in a commercial product. See also the copyright message which is distributed with the source code.

Download source code and documentation for FADBAD++:

FADBAD++ can be used right away - without any compiling or configuring. FADBAD++ has been tested with GCC 4.1.0, Microsoft Visual C++ 7.10 and 8.10, Sun C++ 5.8, Intel C++ 9.1.
Notice that this web-page describes the latest version.

General introduction to automatic differentiation and FADBAD++:

The importance of differentiation as a mathematical tool is obvious. One of the first things we learn in elementary school is how to manually differentiate expressions using a few elementary formulas. Unfortunately the use of derivatives in scientific computing has been quite limited due to the misunderstanding that derivatives are hard to obtain. Many people still think that the only alternative to the symbolic way of obtaining derivatives is to use divided differences in which the difficulties in finding an expression for the derivatives are avoided. But by using divided differences, truncation errors are introduced and this usually has a negative effect on further computations - in fact it can lead to very inaccurate results.
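To make the truncation error concrete, the following sketch compares a forward divided difference with a hand-derived derivative for the function f(x,y)=y*sqrt(x)+sin(sqrt(x)) that is used later on this page. The helper names (dfdx_exact, dfdx_divided, divided_error) and the step size h are illustrative assumptions and are not part of FADBAD++.

```cpp
#include <cassert>
#include <cmath>

// Example function: f(x,y) = y*sqrt(x) + sin(sqrt(x)).
double func(double x, double y)
{
    double z = std::sqrt(x);
    return y * z + std::sin(z);
}

// Hand-derived reference: df/dx = (y + cos(sqrt(x))) / (2*sqrt(x)).
double dfdx_exact(double x, double y)
{
    double z = std::sqrt(x);
    return (y + std::cos(z)) / (2.0 * z);
}

// Forward divided difference with step h (h is an illustrative parameter).
double dfdx_divided(double x, double y, double h)
{
    assert(h > 0.0);
    return (func(x + h, y) - func(x, y)) / h;
}

// Absolute error of the divided difference against the reference.
double divided_error(double x, double y, double h)
{
    return std::fabs(dfdx_divided(x, y, h) - dfdx_exact(x, y));
}
```

For a forward difference the error shrinks roughly in proportion to h, until rounding errors take over for very small steps. Automatic differentiation has no such step size to tune.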

The use of a symbolic differentiation package such as Maple or Mathematica can solve the problem of obtaining expressions for the derivatives. This method obviously avoids truncation errors, but these packages usually have problems handling large expressions, and the time/space usage for computing derivatives can be enormous. In the worst case it can even cause a program to crash. Furthermore, common subexpressions are usually not identified in the expressions, and this leads to unnecessary computations during the evaluation of the derivatives.

Automatic differentiation is an alternative to the above methods. Here derivatives are computed by applying the well-known chain rule for composite functions in a clever way. In automatic differentiation a function and its derivatives are evaluated simultaneously, using the same code and common temporary values. If the code for the evaluation is optimized, then the computation of the derivatives will automatically be optimized. The resulting differentiation is free from truncation errors, and if we calculate the derivatives using interval analysis we will obtain enclosures of the true derivatives. Automatic differentiation is easy to implement in languages with operator overloading such as C++ and C#.
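The idea of evaluating a function and its derivative with the same code can be sketched with a tiny hand-written type. The Dual struct below is a hypothetical illustration of operator overloading with the chain rule, assuming a single derivative direction. It is not the FADBAD++ implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal type: a value together with one derivative.
struct Dual
{
    double v; // function value
    double d; // derivative with respect to the chosen input
};

// Each overload applies the chain rule next to the ordinary operation.
Dual operator+(const Dual& a, const Dual& b) { return {a.v + b.v, a.d + b.d}; }
Dual operator*(const Dual& a, const Dual& b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }
Dual sqrt(const Dual& a)
{
    assert(a.v > 0.0); // the derivative of sqrt is undefined at 0
    double s = std::sqrt(a.v);
    return {s, a.d / (2.0 * s)};
}
Dual sin(const Dual& a) { return {std::sin(a.v), std::cos(a.v) * a.d}; }

// The function is written exactly like its double version.
Dual func(const Dual& x, const Dual& y)
{
    Dual z = sqrt(x);
    return y * z + sin(z);
}

// Seed the input we differentiate with respect to with derivative 1.
double dfdx(double x, double y) { return func({x, 1.0}, {y, 0.0}).d; }
double dfdy(double x, double y) { return func({x, 0.0}, {y, 1.0}).d; }
```

The temporary z is computed once and shared by the value and the derivative, which is the reuse of common temporary values mentioned above.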

FADBAD++ contains templates for performing automatic differentiation on functions implemented in C++ code. If the source code of a program, which is an implementation of a differentiable function, is available, then FADBAD++ can be applied to obtain the derivatives of this function. To apply automatic differentiation on a program, the arithmetic type used in the program (normally double) is changed to an overloading type (normally F<double>, B<double> or T<double>). Since the overloading type behaves just like the usual arithmetic type and the functionality therefore is kept, the program, which performs the function evaluation, should not be changed in any other way. When calling the program, the user should specify which output variables, returned from the program, to differentiate, and which input variables to differentiate with respect to. Since a program using automatic differentiation can itself be differentiated using FADBAD++, it is possible to obtain higher order derivatives by applying multiple layers of automatic differentiation. Any combination of the automatic differentiation methods that are implemented in FADBAD++ is possible. This way derivatives can be obtained in several ways, making it possible to optimize with respect to the time and space used in the process.

Examples of applications:

Forward automatic differentiation on a function f : Rn->Rn, evaluated in interval arithmetic, using the BIAS/PROFIL package, to obtain function values and derivatives. Used with the interval Newton method to obtain guaranteed enclosures of all solutions to the non-linear equation f(x)=0.

Forward-Backward automatic differentiation on a function f : Rn->R, evaluated in interval arithmetic, using the BIAS/PROFIL package and the backward method to obtain first order derivatives (the gradient) and the forward method to differentiate the first order derivatives to obtain the second order derivatives (the Hessian). Used with the interval Krawczyk method to perform global optimization obtaining a guaranteed enclosure of the global minimum.

Numerical integration of a function f : RxRn->R, using a three-point-two-derivative formula. The Backward method has been used to differentiate the numerical integration program, obtaining the n partial derivatives of the integral with respect to the n parameters in the function.

Taylor expansion of the solution to an ordinary differential equation, used to solve initial value problems. The Forward method has been used to differentiate the initial value problem solver to obtain the solution of the variational problem.

Taylor expansion of the solution to an ordinary differential equation using interval arithmetic and using the Forward method to obtain derivatives of the Taylor coefficients with respect to the point of expansion, which are the values of the Taylor coefficients for the solution of the variational problem. Used in a method that solves initial value problems with guaranteed enclosures.

Crash course in using FADBAD++:

FADBAD++ works by overloading all elementary arithmetic operations to include calculation of derivatives. To enable these overloaded operations we have to change the arithmetic type (normally double) to the appropriate AD-type. The AD-types are defined by the three templates F<> for forward automatic differentiation, B<> for backward automatic differentiation and T<> for Taylor expansion. The AD-types are implemented in the header files fadiff.h, badiff.h and tadiff.h, and including these files will define the templates needed in the fadbad namespace. In order to use the F<> type the start of the source file should include the following lines:
#include "fadiff.h"
using namespace fadbad;
Now we are ready to differentiate functions by using the forward method.
Forward automatic differentiation
Let's see what that looks like when using forward automatic differentiation. Assume that we have a function, func, that we want to differentiate.
double func(const double& x, const double& y)
{
double z=sqrt(x);
return y*z+sin(z);
}
Do a search and replace on "double" with "F<double>". Our code now looks like:
F<double> func(const F<double>& x, const F<double>& y)
{
	F<double> z=sqrt(x);
	return y*z+sin(z);
}
Our function is now prepared for computing derivatives. Before we call the function we have to specify the variables we want to differentiate with respect to.  After the call we obtain the function value and the derivatives. This can be done with the following code:
F<double> x,y,f; // Declare variables x,y,f
x=1; // Initialize variable x
x.diff(0,2); // Differentiate with respect to x (index 0 of 2)
y=2; // Initialize variable y
y.diff(1,2); // Differentiate with respect to y (index 1 of 2)
f=func(x,y); // Evaluate function and derivatives
double fval=f.x(); // Value of function
double dfdx=f.d(0); // Value of df/dx (index 0 of 2)
double dfdy=f.d(1); // Value of df/dy (index 1 of 2)

cout << "f(x,y)=" << fval << endl;
cout << "df/dx(x,y)=" << dfdx << endl;
cout << "df/dy(x,y)=" << dfdy << endl;
Notice that the method diff(i,n) is called on the independent variables before the function call, where n is the total number of independent variables that we want to differentiate with respect to and i=0...n-1 denotes the index of the variable. After the function call the function value is obtained from the dependent variable(s) with the x() method and the derivatives with the d(i) method, where i=0...n-1 is the index of the independent variable.
Backward automatic differentiation
Now, let's see what this looks like when using backward automatic differentiation. Given the function func evaluated in doubles, we do a search and replace on "double" with "B<double>". Now the code looks like:
B<double> func(const B<double>& x, const B<double>& y)
{
	B<double> z=sqrt(x);
	return y*z+sin(z);
}
The file "badiff.h" has to be included to define the B<> template.

The code is now prepared for computing derivatives. The function func will now "record" a directed acyclic graph (DAG) representing the function, while computing the function value. After the function call we specify which variable we want to differentiate. When this has been done on all dependent variables we can obtain the derivatives. This is done with the following code:
B<double> x,y,f; // Declare variables x,y,f
x=1; // Initialize variable x
y=2; // Initialize variable y
f=func(x,y); // Evaluate function and record DAG
f.diff(0,1); // Differentiate f (index 0 of 1)
double fval=f.x(); // Value of function
double dfdx=x.d(0); // Value of df/dx (index 0 of 1)
double dfdy=y.d(0); // Value of df/dy (index 0 of 1)

cout << "f(x,y)=" << fval << endl;
cout << "df/dx(x,y)=" << dfdx << endl;
cout << "df/dy(x,y)=" << dfdy << endl;
Notice that this time the method diff(i,m) is called on the dependent variable(s) after the function call, where m is the total number of dependent variables we want to differentiate and i=0...m-1 denotes the index of the variable. The function value is obtained by the x() method on the dependent variables, while the derivatives are obtained by calling d(i) on the independent variables, where i=0...m-1 is the index of the dependent variable.

The DAG is deallocated automatically when the derivatives are being computed backwards through the DAG.
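The record-then-propagate idea can be sketched with a small hand-written tape. The Tape struct below is a hypothetical illustration, assuming nodes with at most two parents stored in evaluation order. It does not reflect how FADBAD++ stores or deallocates its DAG.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal tape: each node stores its value, up to two parent
// indices, and the local partial derivative with respect to each parent.
struct Tape
{
    struct Node { double v; std::size_t p[2]; double w[2]; };
    std::vector<Node> nodes;

    std::size_t push(double v, std::size_t p0, double w0, std::size_t p1, double w1)
    {
        Node n = {v, {p0, p1}, {w0, w1}};
        nodes.push_back(n);
        return nodes.size() - 1;
    }
    // An input has no parents: it refers to itself with zero weights.
    std::size_t input(double v) { std::size_t i = nodes.size(); return push(v, i, 0.0, i, 0.0); }
    std::size_t add(std::size_t a, std::size_t b) { return push(nodes[a].v + nodes[b].v, a, 1.0, b, 1.0); }
    std::size_t mul(std::size_t a, std::size_t b) { return push(nodes[a].v * nodes[b].v, a, nodes[b].v, b, nodes[a].v); }
    std::size_t sqrt(std::size_t a) { double s = std::sqrt(nodes[a].v); return push(s, a, 0.5 / s, a, 0.0); }
    std::size_t sin(std::size_t a) { return push(std::sin(nodes[a].v), a, std::cos(nodes[a].v), a, 0.0); }

    // Backward sweep: seed the output with 1 and propagate to the parents.
    std::vector<double> gradient(std::size_t out) const
    {
        assert(out < nodes.size());
        std::vector<double> adj(nodes.size(), 0.0);
        adj[out] = 1.0;
        for (std::size_t i = out + 1; i-- > 0;)
            for (int k = 0; k < 2; ++k)
                adj[nodes[i].p[k]] += nodes[i].w[k] * adj[i];
        return adj;
    }
};

// Gradient {df/dx, df/dy} of f(x,y) = y*sqrt(x) + sin(sqrt(x)).
std::vector<double> func_gradient(double xv, double yv)
{
    Tape t;
    std::size_t x = t.input(xv);
    std::size_t y = t.input(yv);
    std::size_t z = t.sqrt(x);
    std::size_t yz = t.mul(y, z);
    std::size_t sz = t.sin(z);
    std::size_t f = t.add(yz, sz);
    std::vector<double> adj = t.gradient(f);
    return {adj[x], adj[y]};
}
```

One backward sweep yields the derivatives with respect to all inputs at once, which is why the derivatives are read from the independent variables in the backward method.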

It is very important that diff(i,m) is called on all dependent variables, i.e. variables that are dependent on the variables we want to differentiate with respect to. If this is not done then the derivatives are not propagated correctly into the independent variables, causing the derivatives to be wrong when obtaining them by calling d(i). One way to make a variable independent is to assign a constant to it, e.g. f=0; in the above example. Another way to avoid temporary dependent variables is to let them go out of scope by using {...} or by defining a function that encapsulates the evaluation and only returns the dependent variables of interest.

Since the backward method propagates the partial derivatives back to the independent variables (the input variables of the function), it is important to keep these untouched. If these variables are used later in the calculations, they become dependent variables themselves and we would then obtain the derivatives with respect to the intermediate variable in the DAG when propagating the derivatives.

Debug assertions in FADBAD++ can check if all derivatives have been propagated correctly and write out error messages to the console if this is not the case. These debug assertions are enabled by defining the "_DEBUG" symbol when compiling the code.
Simple Taylor expansion
Let's assume we want to Taylor expand the function func with respect to the variable x. We now do a search and replace on "double" with "T<double>" and get:
T<double> func(const T<double>& x, const T<double>& y)
{
	T<double> z=sqrt(x);
	return y*z+sin(z);
}
The template T<> is defined in the file "tadiff.h". The function will now "record" a directed acyclic graph (DAG) while computing the function value (which is the 0'th order Taylor coefficient). This DAG can then be used to find the Taylor coefficients. This is done in the following code:
T<double> x,y,f; // Declare variables x,y,f
x=1; // Initialize variable x
y=2; // Initialize variable y
x[1]=1; // Taylor-expand wrt. x (dx/dx=1)
f=func(x,y); // Evaluate function and record DAG
double fval=f[0]; // Value of function
f.eval(10); // Taylor-expand f to degree 10
// f[0]...f[10] now contains the Taylor-coefficients.
cout << "f(x,y)=" << fval << endl;
for(int i=0;i<=10;i++)
{
	double c=f[i]; // The i'th Taylor coefficient
	cout << "(1/" << i << "!)*(d^" << i << "f/dx^" << i << ")=" << c << endl;
}
We manually initialize the 1'st order Taylor coefficient of x to the value 1 to denote the independent variable in which we want to expand the function. We can now expand the function by calling the eval(k) method on the dependent variable(s), where k is the order to which we want to expand the function.

The DAG is not de-allocated automatically (like in the backward method) when the Taylor coefficients have been computed by the eval(k) method. The DAG will instead hold all Taylor coefficients of degree 0...k of all intermediate variables for calculating the function. If we call eval(l), where l>k, the Taylor coefficients that have already been calculated will be reused and only the coefficients (k+1)..l will be calculated. This is useful when Taylor expanding the solution to an ordinary differential equation, which we will see later. The coefficients in the DAG can be deleted by calling the method reset() on the dependent variable(s). This is done in the following code:
f.reset(); // Reset the values in the DAG
x[0]=3; // New value for x
y[0]=4; // New value for y
y[1]=1; // Taylor-expand wrt. y (dy/dy=1)
f.eval(10); // Taylor-expand f to degree 10
// f[0]...f[10] now contains the Taylor-coefficients.
for(int i=0;i<=10;i++)
{
	double c=f[i]; // The i'th Taylor coefficient
	cout << "(1/" << i << "!)*(d^" << i << "f/dy^" << i << ")=" << c << endl;
}
After the coefficients in the DAG have been deleted we can re-initialize the values of x and y (0'th order coefficients) and now specify that we want to expand with respect to the variable y.

Finally the DAG will be de-allocated when the independent variables are de-allocated.
Taylor expanding the solution of an ODE
The feature that the coefficients in the DAG are not de-allocated can be used for Taylor expanding the solution of an ODE: x'=f(x). Assume that we want to expand the solution of x'=cos(x). We can encapsulate the DAG of the function in an object by defining a class:
class TODE
{
public:
	T<double> x; // Independent variables
	T<double> xp; // Dependent variables
	TODE(){xp=cos(x);} // record DAG at construction
};
Objects of this class will have a DAG representing the ODE. The Taylor coefficients can then be computed one order at a time with the following code:
TODE ode; // Construct ODE:
ode.x[0]=1; // Set point of expansion:
for(int i=0;i<10;i++)
{
	ode.xp.eval(i); // Evaluate i'th Taylor coefficient
	ode.x[i+1]=ode.xp[i]/double(i+1); // Use dx/dt=ode(x).
}
// ode.x[0]...ode.x[10] now contains the Taylor-coefficients
// of the solution of the ODE.

// Print out the Taylor coefficients for the solution
// of the ODE:
for(int i=0;i<=10;i++)
{
	cout << "x[" << i << "]=" << ode.x[i] << endl;
}
We see that the ODE links the i'th coefficient of the dependent variable to the (i+1)'th coefficient of the independent variable. This dependency is specified explicitly in the loop.
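The same coefficient-by-coefficient loop can be written out by hand for x'=cos(x), which may help to see what eval(i) has to provide at each step. The sketch below uses the standard Taylor recurrences for sin and cos of a series. The function name taylor_ode_cos is a hypothetical helper, and the code does not use FADBAD++.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Taylor coefficients x[0..order] of the solution of x' = cos(x), x(0) = x0.
std::vector<double> taylor_ode_cos(double x0, std::size_t order)
{
    assert(order >= 1);
    std::vector<double> x(order + 1, 0.0); // coefficients of the solution
    std::vector<double> s(order, 0.0);     // coefficients of sin(x(t))
    std::vector<double> c(order, 0.0);     // coefficients of cos(x(t))
    x[0] = x0;
    for (std::size_t k = 0; k < order; ++k)
    {
        if (k == 0)
        {
            s[0] = std::sin(x[0]);
            c[0] = std::cos(x[0]);
        }
        else
        {
            // Recurrences that follow from s' = c*x' and c' = -s*x'.
            double sk = 0.0, ck = 0.0;
            for (std::size_t j = 1; j <= k; ++j)
            {
                sk += j * x[j] * c[k - j];
                ck -= j * x[j] * s[k - j];
            }
            s[k] = sk / k;
            c[k] = ck / k;
        }
        // x' = cos(x) links the k'th coefficient of cos(x) to x[k+1].
        x[k + 1] = c[k] / (k + 1);
    }
    return x;
}
```

Each pass needs only the coefficients computed in earlier passes, which mirrors why it is useful that the DAG keeps the coefficients between calls to eval.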

Using the advanced features of FADBAD++:

We have now described the three basic methods for differentiating functions implemented in C++: the forward method, the backward method and Taylor expansion. But FADBAD++ offers more advanced features. These features will be described in the following sections.
Obtaining directional derivatives using the forward method
The directional derivative of a function f: Rn->R along a vector v in Rn can be derived by applying the chain rule to the function f(g(t)), where g(t)=x+vt. Hence df/dt is the sum over i of (df/dgi)(dgi/dt), where dgi/dt=vi. The forward method can be initialized with the latter equality when calling the diff method:
for(int i=0;i<n;++i) x[i].diff(0,1)=v[i];
Notice we now only differentiate with respect to one variable, t, to obtain the directional derivative instead of the n variables xi to obtain the partial derivatives and thereby we save some computations.

The method for obtaining the directional derivative can also be extended to the Taylor expansion method by initializing the 1'st order Taylor coefficients on the independent variables with the vector v.
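The seeding trick can be illustrated without FADBAD++ by a small sketch. The Dual struct, the test function f and the helper directional_derivative below are hypothetical names chosen for illustration. The only point is that each input carries vi as its derivative, so a single evaluation returns df/dt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical value/derivative pair with respect to the scalar t.
struct Dual { double v; double d; };
Dual operator+(const Dual& a, const Dual& b) { return {a.v + b.v, a.d + b.d}; }
Dual operator*(const Dual& a, const Dual& b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }

// Illustrative function f(x) = x0*x1 + x1*x2 + ... + x(n-2)*x(n-1).
Dual f(const std::vector<Dual>& x)
{
    Dual r = {0.0, 0.0};
    for (std::size_t i = 0; i + 1 < x.size(); ++i) r = r + x[i] * x[i + 1];
    return r;
}

// Directional derivative of f at x along v, from one forward evaluation.
double directional_derivative(const std::vector<double>& x, const std::vector<double>& v)
{
    assert(x.size() == v.size());
    std::vector<Dual> xs(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        xs[i].v = x[i];
        xs[i].d = v[i]; // seed dgi/dt = vi
    }
    return f(xs).d;
}
```

For x=(1,2,3) the gradient of this f is (2,4,2), so the result along v=(1,0.5,-1) is 2, obtained without computing the three partial derivatives separately.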
Combinations of automatic differentiation
One of the unique features of FADBAD++ is the ability to compute high order derivatives in a very flexible way by combining the methods of automatic differentiation. These combinations are produced by applying the templates on themselves. For example the type B< F< double > > can be used in optimization for computing first order derivatives by using the backward method and second order derivatives by using a backward-forward method. The combination T< F<double> > can be used to Taylor expand the solution of an ODE, while computing derivatives of the coefficients with respect to the point of expansion. The derivatives are then by definition the Taylor coefficients of the solution of the variational equation. All these possibilities are described in more detail in the documents which can be downloaded from this site and in the examples that come with the source code of FADBAD++.
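Why applying a template on itself yields higher order derivatives can be seen in a minimal sketch. The generic Dual<T> below is a hypothetical forward-mode type, not one of the FADBAD++ templates. Because it behaves like an arithmetic type, Dual<Dual<double>> compiles without further work and carries a second derivative.

```cpp
#include <cassert>

// Hypothetical generic dual number; T may itself be a Dual.
template <typename T>
struct Dual
{
    T v; // value
    T d; // derivative
};

template <typename T>
Dual<T> operator+(const Dual<T>& a, const Dual<T>& b) { return {a.v + b.v, a.d + b.d}; }

template <typename T>
Dual<T> operator*(const Dual<T>& a, const Dual<T>& b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }

// Function written once against a generic arithmetic type.
template <typename T>
T cube(const T& x) { return x * x * x; }

// Second derivative of x^3, obtained by nesting the forward type.
double cube_second_derivative(double x)
{
    // Both levels are seeded with derivative 1 with respect to x.
    Dual<Dual<double>> X = {{x, 1.0}, {1.0, 0.0}};
    return cube(X).d.d;
}

// Small self-check: the second derivative of x^3 is 6x.
void check_cube() { assert(cube_second_derivative(3.0) == 18.0); }
```

The inner level differentiates the function and the outer level differentiates the inner derivative computation, in the same spirit as the layered types described above.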
Avoiding search/replace by using template functions
When differentiating the same function in different ways we need to instantiate the function with several arithmetic types. Fortunately, we can implement the function as a template function, where the arithmetic type is a template parameter that the compiler instantiates with the types needed by the calling code. Looking back at the function, func, we can implement it as a template function:

template <typename T>
T func(const T& x, const T& y)
{
	T z=sqrt(x);
	return y*z+sin(z);
}

We can now use the function, func, in any arithmetic type that defines the used operations, including all the automatic differentiation types that are defined by FADBAD++.
Avoiding search/replace by using functors
Another, and more advanced, way of avoiding searching and replacing of types when applying automatic differentiation is by using functors:
struct Func
{
	template <typename T>
	T operator()(const T& x, const T& y)
	{
		T z=sqrt(x);
		return y*z+sin(z);
	}
};
Objects of class Func can now be used to evaluate the function with different arithmetic types, which are determined at compile-time. We can even create forward and backward differentiating functors that take a functor class as template argument and differentiate it:
template <typename C>
struct FDiff
{
	template <typename T>
	T operator()(T& o_dfdx, T& o_dfdy, const T& i_x, const T& i_y)
	{
		F<T> x(i_x),y(i_y); // Initialize arguments
		x.diff(0,2); // Differentiate wrt. x
		y.diff(1,2); // Differentiate wrt. y
		C func; // Instantiate functor
		F<T> f(func(x,y)); // Evaluate function and derivatives
		o_dfdx=f.d(0); // Value of df/dx
		o_dfdy=f.d(1); // Value of df/dy
		return f.x(); // Return function value
	}
};

template <typename C>
struct BDiff
{
	template <class T>
	T operator()(T& o_dfdx, T& o_dfdy, const T& i_x, const T& i_y)
	{
		B<T> x(i_x),y(i_y); // Initialize arguments
		C func; // Instantiate functor
		B<T> f(func(x,y)); // Evaluate function and record DAG
		f.diff(0,1); // Differentiate
		o_dfdx=x.d(0); // Value of df/dx
		o_dfdy=y.d(0); // Value of df/dy
		return f.x(); // Return function value
	}
};
The differentiating functors are used in the following code:
double x,y,f,dfdx,dfdy; // Declare variables
x=1; // Initialize variable x
y=2; // Initialize variable y
FDiff<Func> FFunc; // Functor for function and derivatives
f=FFunc(dfdx,dfdy,x,y); // Evaluate function and derivatives

cout << "f(x,y)=" << f << endl;
cout << "df/dx(x,y)=" << dfdx << endl;
cout << "df/dy(x,y)=" << dfdy << endl;

BDiff<Func> BFunc; // Functor for function and derivatives
f=BFunc(dfdx,dfdy,x,y); // Evaluate function and derivatives

cout << "f(x,y)=" << f << endl;
cout << "df/dx(x,y)=" << dfdx << endl;
cout << "df/dy(x,y)=" << dfdy << endl;
It is possible to make completely generic differentiating functors that differentiate any function f:Rn->Rm, by using STL vectors as input and output variables. See the test-code in FADBAD++.
Using FADBAD++ with other types than double
It is possible to change the underlying arithmetic type for all function and derivative calculations, also called the BaseType. Normally the underlying arithmetic type is double, but automatic differentiation is also very useful in the field of interval arithmetic, e.g. to obtain narrow enclosures by using the mean value form or in global optimization.
The templates that are defined in FADBAD++ require that some standard operations are defined for the underlying type: comparison operators, simple operations "+", "-", "*", "/" and simple functions "sin", "cos", "exp", etc.

One of the main ideas of FADBAD++ is the ability to change the underlying arithmetic type from double to e.g. intervals. This is done by specializing the Op<> template that is defined in the file fadbad.h. All operations that are used internally by FADBAD++ are mapped through this template. By specializing the template with a specific arithmetic class it is possible to configure which arithmetic operations should be used by FADBAD++ when doing automatic differentiation, based on the arithmetic type. In the following example we specialize the Op<> template for the arithmetic type MyType to use functions my_pow, my_sqrt and so forth.
#include "fadbad.h"
namespace fadbad
{
	template <> struct Op<MyType>
	{
		typedef MyType Base;
		static Base myInteger(const int i) { return Base(i); }
		static Base myZero() { return myInteger(0); }
		static Base myOne() { return myInteger(1);}
		static Base myTwo() { return myInteger(2); }
		static Base myPI() { return MyType(3.14159265358979323846); }
		static MyType myPos(const MyType& x) { return +x; }
		static MyType myNeg(const MyType& x) { return -x; }
		template <typename U> static MyType& myCadd(MyType& x, const U& y) { return x+=y; }
		template <typename U> static MyType& myCsub(MyType& x, const U& y) { return x-=y; }
		template <typename U> static MyType& myCmul(MyType& x, const U& y) { return x*=y; }
		template <typename U> static MyType& myCdiv(MyType& x, const U& y) { return x/=y; }
		static MyType myInv(const MyType& x) { return myOne()/x; }
		static MyType mySqr(const MyType& x) { return x*x; }
		template <typename X, typename Y>
		static MyType myPow(const X& x, const Y& y) { return ::my_pow(x,y); }
		static MyType mySqrt(const MyType& x) { return ::my_sqrt(x); }
		static MyType myLog(const MyType& x) { return ::my_log(x); }
		static MyType myExp(const MyType& x) { return ::my_exp(x); }
		static MyType mySin(const MyType& x) { return ::my_sin(x); }
		static MyType myCos(const MyType& x) { return ::my_cos(x); }
		static MyType myTan(const MyType& x) { return ::my_tan(x); }
		static MyType myAsin(const MyType& x) { return ::my_asin(x); }
		static MyType myAcos(const MyType& x) { return ::my_acos(x); }
		static MyType myAtan(const MyType& x) { return ::my_atan(x); }
		static bool myEq(const MyType& x, const MyType& y) { return x==y; }
		static bool myNe(const MyType& x, const MyType& y) { return x!=y; }
		static bool myLt(const MyType& x, const MyType& y) { return x<y; }
		static bool myLe(const MyType& x, const MyType& y) { return x<=y; }
		static bool myGt(const MyType& x, const MyType& y) { return x>y; }
		static bool myGe(const MyType& x, const MyType& y) { return x>=y; }
	};
}
The functions that are used in the Op<> template should be declared when compiling the code using automatic differentiation. If some functions are not defined for the underlying arithmetic type then it is possible to make a dummy implementation that just throws an exception. That way it will be discovered if the undefined function is used in the code.
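A dummy implementation of this kind could look like the following sketch. MyType, the choice of my_asin as the missing function, and the exception type std::logic_error are assumptions made for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical arithmetic type that supports only a few operations.
struct MyType { double value; };

// Hypothetical dummy: MyType has no arcsine, so fail loudly if it is used.
inline MyType my_asin(const MyType&)
{
    throw std::logic_error("my_asin is not implemented for MyType");
}

// Returns true if calling my_asin raises the expected exception.
bool asin_is_rejected()
{
    try { my_asin(MyType{1.0}); }
    catch (const std::logic_error&) { return true; }
    return false;
}

// Small self-check of the dummy.
void check_dummy() { assert(asin_is_rejected()); }
```

The specialization can then refer to ::my_asin like any other function, and code that never calls asin on MyType is unaffected.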

It is possible to specialize the Op<> template for different arithmetic types and perform automatic differentiation using different underlying arithmetic types in the same source file.
Implementing automatic differentiation for other functions
FADBAD++ can be extended with other functions than the built-in functions such as exp, sin, cos and so forth. These extensions can be implemented without changing any of the distributed source files. In the "extra" directory which is distributed with FADBAD++, the normal distribution, the cumulative normal distribution and the inverse cumulative normal distribution functions have been implemented in the file ndf.h, while their automatic differentiation counterparts can be found in the file ndfad.h. Besides being useful when needing these functions, the code also demonstrates how functions can be overloaded to provide derivatives in the framework that is laid out by FADBAD++.
Using the stack-based forward method
All methods that we have described previously use heap allocations to allocate memory for propagating the derivatives. These heap allocations can be quite costly compared to the actual derivative computations. However, if the number of partial derivatives to calculate is known when applying FADBAD++ in the implementation phase, then it is possible to use a stack-based forward method which is faster than the heap-based version. The stack-based forward method works by instantiating the F<> template with an extra argument which is the number of derivatives to obtain. Looking again at the function, func, we use the type F<double,2> to calculate the function and two partial derivatives:
F<double,2> func(const F<double,2>& x, const F<double,2>& y)
{
	F<double,2> z=sqrt(x);
	return y*z+sin(z);
}
Since the type F<T,n> already "knows" the number of partial derivatives, n, we use the method diff(i) on the independent variables, where i denotes the index of the variable i=0...n-1, before evaluating the function.
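The benefit of fixing the number of derivatives at compile time can be sketched with a hand-written type. FixedDual below is a hypothetical stand-in, assuming the derivatives are stored in a std::array inside the object. It only mimics the diff(i) usage described above and is not the FADBAD++ F<T,n> implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical fixed-size forward type: the N partial derivatives live in a
// std::array inside the object, so no heap allocation is needed.
template <std::size_t N>
struct FixedDual
{
    double v;
    std::array<double, N> d;

    explicit FixedDual(double value = 0.0) : v(value), d() {} // d() zero-initializes
    // Mark this variable as independent variable number i.
    void diff(std::size_t i) { assert(i < N); d.fill(0.0); d[i] = 1.0; }
};

template <std::size_t N>
FixedDual<N> operator+(const FixedDual<N>& a, const FixedDual<N>& b)
{
    FixedDual<N> r(a.v + b.v);
    for (std::size_t i = 0; i < N; ++i) r.d[i] = a.d[i] + b.d[i];
    return r;
}

template <std::size_t N>
FixedDual<N> operator*(const FixedDual<N>& a, const FixedDual<N>& b)
{
    FixedDual<N> r(a.v * b.v);
    for (std::size_t i = 0; i < N; ++i) r.d[i] = a.d[i] * b.v + a.v * b.d[i];
    return r;
}

template <std::size_t N>
FixedDual<N> sqrt(const FixedDual<N>& a)
{
    double s = std::sqrt(a.v);
    FixedDual<N> r(s);
    for (std::size_t i = 0; i < N; ++i) r.d[i] = a.d[i] / (2.0 * s);
    return r;
}

template <std::size_t N>
FixedDual<N> sin(const FixedDual<N>& a)
{
    FixedDual<N> r(std::sin(a.v));
    for (std::size_t i = 0; i < N; ++i) r.d[i] = std::cos(a.v) * a.d[i];
    return r;
}

template <std::size_t N>
FixedDual<N> func(const FixedDual<N>& x, const FixedDual<N>& y)
{
    FixedDual<N> z = sqrt(x);
    return y * z + sin(z);
}

// Gradient {df/dx, df/dy} of func using two stack-allocated derivatives.
std::array<double, 2> func_gradient(double xv, double yv)
{
    FixedDual<2> x(xv), y(yv);
    x.diff(0);
    y.diff(1);
    return func(x, y).d;
}
```

Since N is a compile-time constant, the loops over the derivatives have a fixed trip count and every temporary is a plain object on the stack.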
Using the "_DEBUG" preprocessor symbol
The FADBAD++ code contains debug assertions to catch user errors, such as unpropagated derivatives in the backward method and out of bounds indices. Since assertions are runtime checks that slow down performance, these assertions are not compiled into the code by default. However, while implementing the code it is a good idea to define the "_DEBUG" symbol when compiling code using FADBAD++. The assertions will then be compiled into the code and error messages will be written to the console if any runtime errors are discovered.

This page was last modified by Ole Stauning, Apr 19, 2012.