
Optimization

To use those functions, include the file CM_optimize.h in your program and link with -lCheapMatrix -lCM_optimize -lm. All those functions and classes are located in the CheapMatrix namespace.

Framework

A generic framework lets you create models and optimizers very easily. See the example optimize.cpp for a quick demonstration.
Everything is based on two abstract classes, Optimizer and OptimizeTarget, which contain the necessary functions and variables. To create an optimizer, inherit from Optimizer and implement your algorithm (see CM_scg.cpp for an actual example); to create a model, inherit from OptimizeTarget and implement the error/gradient function (see the example optimize.cpp).
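As a quick orientation, here is a hedged outline of that two-class pattern; MyModel and MyOptimizer are placeholder names used throughout the examples below, not part of the library, and the function bodies are omitted (they are sketched in the relevant sections).

#include "CM_optimize.h"
using namespace CheapMatrix;

class MyModel : public OptimizeTarget {
public:
    // set up the protected 'values' and 'gradients' vectors in the constructor
    // (constructor and data members are sketched in the OptimizeTarget section)
    virtual MatrixType evaluate(const Matrix& data, const Matrix& target,
                                EvaluateTarget what = EVALUATE_BOTH);
};

class MyOptimizer : public Optimizer {
protected:
    // the optimization algorithm itself (see CM_scg.cpp for a real example,
    // or the sketch in the Optimizer section below)
    virtual MatrixType algorithm(OptimizeTarget& opta, const Matrix& data,
                                 const Matrix& target, const Vector& options);
};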

Predefined optimizers

class SCG : public Optimizer
Optimizes using Scaled Conjugate Gradients, stopping after the requested number of iterations.
It returns the final sum of squared errors (see optimize, from the Optimizer class).
Options:
1. the number of iterations (default = 100)
2. precision on the values (default = 1e-10)
3. precision on the error (default = 1e-10)
The code for the algorithm used here is inspired by Netlab, a Matlab neural network toolbox written by Ian T. Nabney & Christopher M. Bishop (1996, 1997, 1998), available at http://www.ncrg.aston.ac.uk/netlab/index.html.
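For example, assuming the option indices match the numbering above (options are set through the option() accessor described in the Optimizer section below), and given a model and data/target matrices defined elsewhere, a run could look like this:

SCG scg;
scg.option(1) = 500;    // assumed: option 1 is the number of iterations
scg.option(3) = 1e-8;   // assumed: option 3 is the precision on the error
MatrixType final_error = scg.optimize(model, data, target);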

class Annealing : public Optimizer
Simulated annealing optimization algorithm. This is my own poor implementation, mostly untested, and certainly not the best you could find on the Internet. Once more, if you wish to contribute...
Options:
1. maximum number of iterations (default = 100)
2. initial temperature (default: try to find one automatically)
3. ratio for updating the temperature when a step is taken (default = 0.999)
4. step size standard deviation at the initial temperature (default = 1 / total dimension)
5. 'Boltzmann' constant (default = 1.0)
Auto-save will save the minimal error configuration, not necessarily the last step taken.

The OptimizeTarget class

It contains two protected Vectors, values and gradients, that you have to set up according to your model. You can create references to parts of the values, for example A = values(5,10).reshape(2,3); such references will be kept valid.
You don't have to use the gradients, but then gradient-based optimizers like SCG won't work, of course.
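For illustration, a hypothetical constructor for the MyModel sketch above could set this up as follows, assuming MyModel also declares a Matrix member A used as a view into the values; how the Vectors themselves are sized is library-specific and only indicated by a comment:

MyModel::MyModel() {
    // give 'values' and 'gradients' the size your model needs here
    // (how the Vectors are created or resized is not shown)
    A = values(5,10).reshape(2,3);   // A refers to part of the values; the reference stays valid
}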

virtual ~OptimizeTarget();
Necessary virtual destructor. Does nothing.

enum EvaluateTarget {
    EVALUATE_ERROR = 1,
    EVALUATE_GRADIENT = 2,
    EVALUATE_BOTH = 3
};

Maskable (bit-flag) enum whose values indicate what to compute in the evaluate function.

virtual MatrixType evaluate(const Matrix& data, const Matrix& target, EvaluateTarget what = EVALUATE_BOTH) = 0;
Pure virtual function that you have to implement in your model. It should evaluate what it is asked for: the gradients, stored in the corresponding vector, and/or the error, returned as the function value. Whatever is not requested won't be used by the optimizer. Since the error function and its gradients often share common terms, this approach lets you avoid computing them twice.
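Continuing the hypothetical MyModel, here is a hedged skeleton of evaluate showing the bit-mask handling; the actual error and gradient formulas depend on your model and are only indicated by comments:

MatrixType MyModel::evaluate(const Matrix& data, const Matrix& target,
                             EvaluateTarget what)
{
    MatrixType error = 0;

    // terms shared by the error and its gradients can be computed once here

    if (what & EVALUATE_ERROR) {
        // error = ... ;      compute the error from data, target and the current values
    }
    if (what & EVALUATE_GRADIENT) {
        // gradients = ... ;  fill the 'gradients' vector (same layout as 'values'),
        //                    reusing the shared terms computed above
    }
    return error;   // only meaningful when EVALUATE_ERROR was requested
}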

virtual bool save(const string& filename);
Saves the values in the file specified, using the save utility. Returns success state.

virtual bool load(const string& filename);
Loads the values from the file specified, using the load utility. Returns success state.

inline Vector copy_values()
inline Vector copy_gradients()

Make a copy of the values or the gradients. This is a true copy; the protected vectors are not affected.

Matrix gradcheck(const Matrix& data, const Matrix& target);
Checks the evaluate implementation using finite differences. The gradients computed by evaluate are stored in the first column, the finite-difference estimates in the second, their difference in the third, and the absolute value of the relative error committed in the fourth.
This lets you see immediately if your implementation is wrong, though of course it is no complete proof that it is correct.
This function is inspired by Netlab, a Matlab neural network toolbox written by Ian T. Nabney & Christopher M. Bishop (1996, 1997, 1998), available at http://www.ncrg.aston.ac.uk/netlab/index.html.
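For example, with the hypothetical model and matrices used above, a quick check could be:

MyModel model;
Matrix check = model.gradcheck(data, target);
// columns: analytic gradients | finite-difference estimates | difference | |relative error|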

The Optimizer class

Apart from the algorithm that you have to implement, this class handles the passing of options and the logging facilities. Two protected variables, string autosavefile and ostream* logstream, should be used in your optimizer if you wish it to honor the user's preferences. They will be, respectively, non-empty and non-null when the corresponding facility is requested.

virtual MatrixType algorithm(OptimizeTarget& opta, const Matrix& data, const Matrix& target, const Vector& options) = 0;
Inherited classes must implement their optimization algorithm here.
The arguments are the object to optimize, and the data and target on which it is optimized. The whole point of this architecture is that you can call opta.evaluate on those matrices and obtain the error at the current values, and/or its gradients, without actually knowing the target itself.
This function is protected, so it cannot be called directly by the user (who must use optimize instead). Checks are made there to ensure the algorithm is always called with at least the default options, so you don't have to worry about the option vector (just read it), provided you have also overloaded the default_options function. A sketch of a minimal implementation is given after the next two functions below.

Vector& values_ref(OptimizeTarget& opta);
Vector& gradients_ref(OptimizeTarget& opta);

Convenience functions to propagate the Optimizer/OptimizeTarget friendship to the children.
Reminder: never do a shallow copy of the values and gradients. In other words, values.copy(something) is OK, values = something_else is not. The reason is that there may be references to parts of the values inside the target that must be kept valid. This is a design choice: either this restriction, or a lot of copies in the optimizer. If you're a bit confused, the best course is probably to look at the predefined optimizers, or the sketch below, to see how this is done.
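Putting these pieces together, here is a hedged sketch of a deliberately naive gradient-descent algorithm() for the hypothetical MyOptimizer above. The fixed iteration count and step size, and the element-wise Vector arithmetic in the update (values - step * gradients), are assumptions made for illustration only, not part of the documented API:

MatrixType MyOptimizer::algorithm(OptimizeTarget& opta, const Matrix& data,
                                  const Matrix& target, const Vector& options)
{
    Vector& values = values_ref(opta);        // direct references into the target
    Vector& gradients = gradients_ref(opta);

    int niter = 100;                          // a real optimizer would read this from 'options'
    MatrixType step = 0.01;                   // fixed step size, for illustration only
    MatrixType error = 0;

    for (int i = 0; i < niter; ++i) {
        error = opta.evaluate(data, target);  // default EVALUATE_BOTH: error and gradients
        // never 'values = ...': use copy() so references inside the target stay valid
        values.copy(values - step * gradients);   // assumed element-wise Vector arithmetic
        if (logstream) *logstream << "iteration " << i << ": error " << error << "\n";
        if (!autosavefile.empty()) opta.save(autosavefile);
    }
    return error;
}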

inline Optimizer(const Vector& options)
inline Optimizer()

Constructors. It is possible to pass some options at this point.

virtual ~Optimizer();
Necessary virtual destructor. Does nothing.

inline void options(const Vector& opt)
Enables the user to set some options. Defaults will be used if the vector doesn't set all possible options.

MatrixType& option(int i);
Facility to set one option while keeping all the defaults. This is the preferred and recommended way to set an option: just call YourOptimizer.option(4) = 2.0; to set option 4 to 2.0. All other options are unaffected and keep their default values.

virtual Vector default_options();
Inherited classes may implement this as well, to define the number of options the optimizer can use and their default values.

inline void autosave(const string& filename)
If the target should be saved automatically after each successful step during training, specify a file name (default = no auto-save).

inline void log(ostream *out)
You can specify an output stream to log the progress if you wish (default=no log).
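Both facilities can be turned on before calling optimize, for example (the file name here is just a placeholder):

SCG scg;
scg.autosave("mymodel.save");   // hypothetical file name; written after each successful step
scg.log(&cout);                 // progress report to standard output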

MatrixType optimize(OptimizeTarget& opta, const Matrix& data, const Matrix& target, const Vector& options);
MatrixType optimize(OptimizeTarget& opta, const Matrix& data, const Matrix& target);
These are the functions the user calls to run the optimization. They ensure the algorithm is always called with at least the number of options returned by default_options(), filling in the defaults if necessary.



Nicolas Brodu, 2000-06-18