An issue with function signatures


An issue with function signatures

JuanPi
Hi,

----
TLDR:
Optimizer and integration functions require separate functions for the
value, gradient, and Hessian. This is too expensive when computing the
value is costly. How can one comply with the optimizer/integrator
signature without the extra cost of multiple evaluations?
----

Many functions in Octave that optionally accept gradients and Hessians
(e.g. optimizers like sqp) do so by accepting a different function for
each quantity, e.g. a cell argument in which the first element is the
objective function, the second element the gradient, and the third
element the Hessian.
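With sqp, for example, this looks like the following (a usage sketch;
phi, g, h, and x0 are placeholder names for user-supplied quantities):

% Current sqp style: one handle per quantity.
% phi computes the objective, g its gradient, h its Hessian.
x = sqp (x0, {@phi, @g, @h});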

For many years I liked this separation, but having gained more
experience with other optimizers, I now realize that accepting a single
function with multiple output arguments tends to be more numerically
friendly. This is clear when the computation of the function is costly
(e.g. likelihood functions of Gaussian processes) and many of the
intermediate computations can be reused in the gradient and the Hessian
(in likelihood functions this is the inverse of the covariance matrix,
which is very expensive!). That is, one can compute value, gradient,
and Hessian in a single call to the function.
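To make the cost argument concrete, here is a minimal sketch (negloglik
and the toy squared-exponential covariance are illustrative assumptions,
not code from any package): the Cholesky factorization of the covariance
dominates the cost and is shared by the value and the gradient.

% Single function, multiple outputs: one expensive factorization.
function [nll, grad] = negloglik (theta, X, y)
  n = numel (y);
  K0 = exp (theta(1)) * exp (-0.5 * (X(:) - X(:)').^2 / theta(2)^2);
  K = K0 + 1e-8 * eye (n);       % jitter for numerical stability
  R = chol (K);                  % the expensive O(n^3) step, done once
  alpha = R \ (R' \ y);          % K^{-1} y via two triangular solves
  nll = 0.5 * (y' * alpha) + sum (log (diag (R))) + 0.5 * n * log (2*pi);
  if nargout > 1                 % reuse R and alpha for the gradient
    Kinv = R \ (R' \ eye (n));
    dK = K0;                     % d K / d theta(1) for this toy kernel
    % gradient w.r.t. theta(1) only, for brevity
    grad = 0.5 * trace ((Kinv - alpha * alpha') * dK);
  end
endfunction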

So far I have not been able to use Octave optimizers with gradients
and Hessians because of this problem (running time). That is, I have
not yet found a way to pass three different functions (for the value,
the gradient, and the Hessian) while internally computing the expensive
part only once.

Note that the problem is induced by the signature of the methods, not
by any property of the function.

Do you have any solution to this problem?

--
JuanPi Carbajal
https://goo.gl/ayiJzi

-----
“An article about computational result is advertising, not
scholarship. The actual scholarship is the full software environment,
code and data, that produced the result.”
- Buckheit and Donoho



Re: An issue with function signatures

jbect
On 19/12/2018 at 15:08, JuanPi wrote:

> [...]


Hi JPi,

What about using persistent variables?  Something like this:


f = @(p) likfun (p);
df = @(p) likfun_grad (p);

function dL = likfun_grad (p)
  [~, dL] = likfun (p);    % discard the value, reuse the cached result
endfunction

function [L, dL] = likfun (p)
  persistent p0 L0 dL0     % cache: last evaluation point and its results
  if isempty (p0) || ~ isequal (p, p0)
    % compute L and dL (the expensive part happens only here)
    p0 = p;
    L0 = L;
    dL0 = dL;
  else
    L = L0;
    dL = dL0;
  end
endfunction

This is just the general idea; you would probably have more arguments.
You can add the Hessian similarly.
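For reference, the two wrappers would then be passed to sqp as a cell
array (a usage sketch; x0 is a hypothetical starting point):

% Thanks to the cache, each iteration pays for the expensive
% computation only once, even though sqp calls f and df separately.
[x, obj] = sqp (x0, {f, df});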

HTH...

@++
Julien




Re: An issue with function signatures

JuanPi

Hi Julien.
Thanks, I have tried this idea, but I found that you need some extra
code to detect whether the point at which the derivatives are evaluated
is the same as the one used to evaluate the function, to make sure that
the correspondence between function and derivatives holds. Otherwise
you are depending on the internal workings of the optimiser, which
could be anything.  Don't you suffer from this problem?




Re: An issue with function signatures

jbect
On 20/12/2018 at 09:56, JuanPi wrote:
> [...]

Sure, you need some extra code, but it's not necessarily very
complicated.

In my simple example, the extra code is just "~ isequal (p, p0)".  If
you have more arguments, you need more calls to isequal.
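For instance, if the likelihood also depends on data that can change
between calls, the cache check can simply cover every argument (a
sketch extending the example above; X and y are hypothetical extra
arguments):

function [L, dL] = likfun (p, X, y)
  persistent p0 X0 y0 L0 dL0
  stale = isempty (p0) || ~ isequal (p, p0) ...
          || ~ isequal (X, X0) || ~ isequal (y, y0);
  if stale
    % compute L and dL, then refresh the cache
    p0 = p;  X0 = X;  y0 = y;
    L0 = L;  dL0 = dL;
  else
    L = L0;  dL = dL0;
  end
endfunction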





Re: An issue with function signatures

Juan Pablo Carbajal-2
Hi Julien,

Thank you for your thoughts. Unless we get another answer, I see this
as a weakness of the signature, and we should discourage it in future
functions.
There is another aspect in which the signature requesting a single
function with multiple output arguments (value, gradient, Hessian,
etc.) is superior: such a function can be used with optimizers that
have the current signature (i.e., updating current signatures can be
made backwards compatible) without extra coding or more expensive
execution, while the converse is not true (one either needs extra
spurious code, like the persistent-variable trick, or wastes
computation).
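A hypothetical sketch of how an updated optimizer could stay backwards
compatible (eval_objective is an invented helper name; the cell
convention mirrors the current {f, df} style; none of this is existing
Octave code):

% Inside an updated optimizer: accept either the old cell style
% or a single function with multiple output arguments.
function [fval, grad] = eval_objective (obj, x)
  if iscell (obj)
    fval = obj{1} (x);       % old style: separate value ...
    grad = obj{2} (x);       % ... and gradient functions
  else
    [fval, grad] = obj (x);  % new style: one call, shared work
  end
endfunction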

Thanks




Re: An issue with function signatures

jbect
On 21/12/2018 at 09:16, Juan Pablo Carbajal wrote:
> Thank you for your thoughts. Unless we get another answer, I see this
> as a weakness of the signature, and we should discourage it in future
> functions.


I agree with you.  I think it is usually better to require that the
user provide *one* objective function:

[f_val, f_grad, f_hessian] = objfun (x)
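A common pattern for such an objective is to compute the optional
outputs only when the caller requests them, via nargout (a sketch; the
Rosenbrock function is used purely as an illustration):

function [f_val, f_grad, f_hessian] = objfun (x)
  f_val = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;  % Rosenbrock
  if nargout > 1   % gradient only when requested
    f_grad = [-400 * x(1) * (x(2) - x(1)^2) - 2 * (1 - x(1));
               200 * (x(2) - x(1)^2)];
  end
  if nargout > 2   % Hessian only when requested
    f_hessian = [1200 * x(1)^2 - 400 * x(2) + 2, -400 * x(1);
                 -400 * x(1),                     200];
  end
endfunction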


One further argument in this direction: this is the Matlab way; see, e.g.,

https://fr.mathworks.com/help/optim/ug/writing-scalar-objective-functions.html

This is also the NLopt way:

https://nlopt.readthedocs.io/en/latest/NLopt_Matlab_Reference/


Switching to this kind of signature for sqp (the only nonlinear solver
in Octave core) would mean breaking compatibility with earlier versions
of Octave.  @Octave maintainers: what is your position on this?

In the optim package, as far as I can see, the current situation is a
mix of both signatures.  For instance, bfgs_min uses the all-in-one
signature, while cg_min takes a separate df argument.  @Olaf: can you
comment on that?


@++
Julien




Re: An issue with function signatures

Juan Pablo Carbajal-2
Hi,

It is also the bayesopt [1] and gpml [2] way, which I help package for
Octave Forge.

[1] https://github.com/rmcantin/bayesopt
[2] http://www.gaussianprocess.org/gpml/code/matlab/doc/