

Hello everybody,
I am looking for an Octave solution for a task I can already perform in Origin, i.e. it appears to be possible from the mathematical point of view. However, I have no idea how to solve it in Octave, which is what I would like to do:
I would like to fit a model curve to a number of datasets, say
data1 = [x, y1]
data2 = [x, y2]
...
The number of datasets is on the order of 10 (i.e. not many more).
The model to that data is something like:
F(x) = f(x) * g(x), where
f(x) depends on some fitting parameters, say a0, a1, a2,...
and g(x) depends on b0, b1,...
Now I would like to fit all datasets with the same model simultaneously, making use of my knowledge that the parameters of g(x) (i.e. b0, b1,...) are equal for all the datasets.
For example, my model function is:
F(x) = ( a0*ln(x) )* (b0+b1*x)
with b0, b1 equal for all datasets.
In Origin, such a situation is called a "global fit" with "shared parameters" or similar.
Is there anything similar already implemented in Octave?
Any help is appreciated.


On Mon, Dec 07, 2015 at 06:32:56AM -0800, JokerOne wrote:
> …
> For example, my model function is:
> F(x) = ( a0*ln(x) )* (b0+b1*x)
> with b0, b1 equal for all datasets.
What do you mean by saying some parameters are "equal for all
datasets"? If a parameter is different for a different dataset, it is
a different parameter, not the same one.
In which way do you want to make use of your knowledge that ... are
"equal for all the datasets"?
Do you want, on the one hand, a common name for accessing a set of
parameters, each applying to a different dataset, and, on the other
hand, a name for accessing a parameter applying to all datasets?
Then there is something in the 'optim' package you could use.
Or do you mean that certain parameters affect all datasets, and
certain parameters affect only one dataset each? But this is not how
you explained it.
Or what else?
Olaf

public key id EAFE0591, e.g. on hkp://pool.sks-keyservers.net
_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave


Yes, my explanation was not precise.
Indeed, the knowledge I was talking about concerns the *values* of the parameters.
So, indeed, the model functions for datasets 1 to n should be written, in my example, as:
F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1
F_2(x) = (a0_2*ln(x)) * (b1 + b2*x) for dataset 2
...
F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n
with a separate ("unique") a0_i for each dataset and common parameters b1, b2. Now the fitting procedure could be performed for all datasets simultaneously, I guess, at least if that is of some use.
I hope that clarifies my question.


> Original Message
> From: help-octave-bounces+allen.windhorn= [hidden email]
>
> …
If your penalty or error function is the sum of the error functions for all
the equations, then I think leasqr or another optimization routine can
be used without modification -- just glom all the parameters into one
big vector and solve. But the result and the time to solve may not be
optimal.
I realize your example function may not be realistic, but if I were trying
to find coefficients for it I would rearrange it into one that is easier to
solve. If the ln(x) term doesn't vary over a wide range (i.e. one that
includes zero), you could define G_i(x) = F_i(x)/ln(x) and p0_i = 1/a0_i, and
the equation would become p0_i*G_i(x) - b1 - b2*x = 0, which is
a linear equation and can be solved using linear least squares. The
modified problem doesn't have the same solution as the original,
but it should be fairly close; maybe it needs a final nonlinear step or two
to clean it up.
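A minimal sketch of this linearization (in Python, with made-up data; none of it is from the thread). One caveat: p0_i*G_i(x) - b1 - b2*x = 0 is a homogeneous system, satisfied by the all-zero vector, which reflects a scale ambiguity already present in the original model (multiplying a0 by c and dividing b1, b2 by c leaves F unchanged), so one parameter has to be pinned down; here b1 is fixed to 1:

```python
import math

# Synthetic noise-free datasets y_i = a0_i*ln(x)*(b1 + b2*x), generated
# with the normalization b1 = 1.  The x values stay away from x = 1 so
# ln(x) stays away from zero, as Allen's caveat requires.
a0_true = [2.0, 0.5, 3.0]
b2_true = 0.25
xs = [2.0, 3.0, 4.0, 5.0, 6.0]
data = [[a0 * math.log(x) * (1.0 + b2_true * x) for x in xs] for a0 in a0_true]

n = len(data)
k = n + 1                            # unknowns: p0_1..p0_n, then b2
rows, rhs = [], []
for i, ys in enumerate(data):
    for x, y in zip(xs, ys):
        row = [0.0] * k
        row[i] = y / math.log(x)     # G_i(x) multiplies p0_i
        row[n] = -x                  # -x multiplies b2
        rows.append(row)
        rhs.append(1.0)              # the pinned-down b1

# Normal equations A'A s = A'r, solved by Gaussian elimination with
# partial pivoting (no external libraries needed at this toy size).
ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
atr = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(k)]
for c in range(k):
    p = max(range(c, k), key=lambda r: abs(ata[r][c]))
    ata[c], ata[p] = ata[p], ata[c]
    atr[c], atr[p] = atr[p], atr[c]
    for r in range(c + 1, k):
        f = ata[r][c] / ata[c][c]
        for j in range(c, k):
            ata[r][j] -= f * ata[c][j]
        atr[r] -= f * atr[c]
sol = [0.0] * k
for r in reversed(range(k)):
    sol[r] = (atr[r] - sum(ata[r][j] * sol[j] for j in range(r + 1, k))) / ata[r][r]

a0_est = [1.0 / p for p in sol[:n]]  # a0_i = 1/p0_i
b2_est = sol[n]
print(a0_est, b2_est)                # recovers [2.0, 0.5, 3.0] and 0.25
```

With noisy data the same normal equations give the least-squares estimate, and a final nonlinear refinement (as suggested above) can remove the bias introduced by the rearrangement.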
Hmmm, quite a three-pipe problem... Thanks. Any chance of getting a
look at a typical dataset?
Regards,
Allen


On 07.12.2015 16:27, JokerOne wrote:
> …
> F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1
> ...
> F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n
> …
Maximilian,
this functionality is going to be part of the upcoming interval package
release, which is planned for December or January.
With the interval package you will be able to define a function, which
maps from your parameter space into your dataset space, and do a set
inversion for your constraints on the observed datasets. This sounds
more complicated than it is!
You will do the following:
function y = F (b1, b2)
  A = [a_1; a_2; ...; a_n];
  X = [x_1; x_2; ...; x_n];
  y = A .* log (X) .* (b1 + b2 .* X);
endfunction
Y = [y_1; y_2; ...; y_n];
fsolve (@F, infsup ("[-10, 10] [-10, 10]"), Y)
Here I assume that you have datasets [x_1, y_1], ..., [x_n, y_n], which
may either be exact values or ranges of values. The parameters a_1, ..., a_n
are assumed to be known constant values or constant ranges
(otherwise you would have to add a to your parameter space, which only
contains b at the moment). The parameters b1 and b2 are searched for in
the interval [-10, 10]. The return value of fsolve will be an enclosure
of the possible values for b1 and b2.
It is also possible to plot the possible values from the parameter
space. For example, the attachment shows a set inversion for the
constraint hypot (b1, b2) ∈ [0.5, 2].
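Since the attachment is not included here, a rough idea of what such a set inversion does can be given with a toy sketch in Python (naive floating-point interval arithmetic, not the rigorous outward rounding the interval package uses): bisect boxes of the (b1, b2) plane and classify each box as inside, outside, or undecided with respect to hypot(b1, b2) ∈ [0.5, 2].

```python
import math

def isq(lo, hi):
    # interval extension of t -> t^2 (handles intervals containing 0)
    a, b = lo * lo, hi * hi
    return (0.0 if lo <= 0.0 <= hi else min(a, b), max(a, b))

def hypot_range(box):
    # interval enclosure of hypot(b1, b2) over a 2-D box
    (l1, u1), (l2, u2) = box
    s1, s2 = isq(l1, u1), isq(l2, u2)
    return math.sqrt(s1[0] + s2[0]), math.sqrt(s1[1] + s2[1])

def sivia(root, lo=0.5, hi=2.0, eps=0.05):
    # Set inversion via interval analysis: keep bisecting undecided boxes
    # until they are smaller than eps.
    inside, outside, boundary = [], [], []
    stack = [root]
    while stack:
        box = stack.pop()
        rlo, rhi = hypot_range(box)
        if rhi < lo or rlo > hi:
            outside.append(box)
        elif lo <= rlo and rhi <= hi:
            inside.append(box)
        elif max(u - l for l, u in box) < eps:
            boundary.append(box)          # undecided at tolerance eps
        else:
            d = max(range(2), key=lambda i: box[i][1] - box[i][0])
            l, u = box[d]
            m = 0.5 * (l + u)
            for half in ((l, m), (m, u)):
                nb = list(box)
                nb[d] = half
                stack.append(tuple(nb))
    return inside, outside, boundary

inside, outside, boundary = sivia(((-3.0, 3.0), (-3.0, 3.0)))
print(len(inside), len(outside), len(boundary))
```

The inside boxes form the annulus 0.5 ≤ hypot(b1, b2) ≤ 2; plotting the three lists gives a picture of the kind shown in the attachment.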
The details will be explained in the documentation of the upcoming release.
Best regards
Oliver


On Mon, Dec 07, 2015 at 07:27:33AM -0800, JokerOne wrote:
> …
> F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1
> ...
> F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n
> …
If I understand you right, both the a0_i and the b1, b2 are to be
determined by fitting, but the a0_i affect only the fit within a
specific dataset each.
The advantage an algorithm can take of this situation lies in the
procedure that calculates the Jacobian by finite differencing. There,
each parameter in turn is changed a bit and the user function is
called each time. In your situation it is known that a change in a0_i
only affects the calculated values for dataset i, so the values for
the other datasets need not be recalculated, which can give a decisive
shortening of the overall optimization time.
There is currently no general optimization algorithm in Octave which
does this automatically by mapping parameters to datasets. (Maybe in
your special case you can use the upcoming routine Oliver told you
about, I don't know.) But if you use nonlin_residmin or
nonlin_curvefit of the optim package, you can implement the desired
behavior in your user function. To enable this, these optimizers call
the user function with an additional informational argument while
calculating the Jacobian by finite differencing. This additional
argument (a structure) contains, among other things, the positional
index of the parameter which is currently being changed for
calculating the Jacobian. Using this index, your user function can
determine whether the currently changed parameter belongs to a
specific dataset and omit the calculations for all the other datasets
(i.e. return, e.g., a vector of zeros for them).
Instead of the index, you could probably also use named parameters
and check the parameter name instead.
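To see why this pays off, here is a small illustration, deliberately written in plain Python with a made-up model rather than the real optim calling convention: a dense finite-difference Jacobian recomputes every dataset for every parameter, while the structured version recomputes only the affected dataset.

```python
import math

xs = [1.5, 2.0, 3.0, 4.0]
n, m = 3, len(xs)                       # n datasets, m points each

def model_block(i, a0_i, b1, b2):
    # model values for dataset i; a0_i is the only dataset-specific parameter
    return [a0_i * math.log(x) * (b1 + b2 * x) for x in xs]

def blocks(p):                          # p = [a0_1, ..., a0_n, b1, b2]
    return [model_block(i, p[i], p[n], p[n + 1]) for i in range(n)]

def fd_jacobian(p, h=1e-7, exploit=False):
    base = blocks(p)
    evals = 0                           # count of per-dataset model evaluations
    cols = []
    for k in range(len(p)):
        q = list(p)
        q[k] += h
        if exploit and k < n:
            # a0_k affects only dataset k: recompute that block alone and
            # reuse the base values (columns of zeros) for the other datasets
            evals += 1
            newk = model_block(k, q[k], q[n], q[n + 1])
            new = [newk if i == k else base[i] for i in range(n)]
        else:
            evals += n
            new = blocks(q)
        cols.append([(new[i][j] - base[i][j]) / h
                     for i in range(n) for j in range(m)])
    return cols, evals

p = [2.0, 0.5, 3.0, 1.0, 0.25]
dense, e_dense = fd_jacobian(p)
sparse, e_sparse = fd_jacobian(p, exploit=True)
print(e_dense, e_sparse)                # 15 vs 9 block evaluations
```

For n datasets with one a0_i each plus two shared parameters, one finite-differencing sweep drops from (n + 2)*n block evaluations to 3n, while producing exactly the same Jacobian.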
The details for the above should be available in the documentation of
nonlin_residmin (you have to call 'optim_doc' to get this part of the
documentation).
There has been a similar question in the past, but as yet I haven't
found the thread.
Olaf



On 08.12.2015 10:31, Olaf Till wrote:
> On Mon, Dec 07, 2015 at 07:27:33AM 0800, JokerOne wrote:
>> …
>
> If I understand you right, both the a0_i and the b1, b2 are to be
> determined by fitting, but the a0_i affect only the fit within a
> specific dataset each.
…
> There is currently no general optimization algorithm in Octave which
> does this automatically, by mapping parameters to datasets. (Maybe in
> your special case you can use the upcoming routine Oliver told you
> about, I don't know.)
Yes, this is also possible. You could use the following function for set
inversion and then operate on a larger parameter space:
function y = F (a1, …, aN, b1, b2)
  A = [a1; a2; ...; aN];
  X = [x_1; x_2; ...; x_n];
  y = A .* log (X) .* (b1 + b2 .* X);
endfunction
However, depending on the size of your parameter space it is wise to get
familiar with interval arithmetic and to provide a "contractor" to speed up
the algorithm. That is, your function will also compute refinements of
its parameters. Please find a script attached which does this kind of
search for 5 parameters. The script takes 5 seconds on my notebook. The
plot shows that the parameter space can be separated into b1 < 1 and
b1 > 1.
The 3D plot of parameter a1 vs. b1 and b2 looks a little bit
strange; maybe my function contains an error, I didn't look into it in
much detail.


Thanks everybody for your effort;
since I am not too familiar with the maths in these regression/optimization routines, it will take me a while to understand your advice.
Allen asked for some actual data. However, there is no "real" data yet. For my tests I just simulated some data in a simple way.
Here, I replaced the logarithmic term with a power term to underline that the logarithmic term is indeed just an example function, as Allen assumed correctly. I do not know the function that will suit my final task yet. I am just checking the basic concept of the fitting routine with some example functions.
# Create simulated data for a fit using "shared" parameters
# Define parameters to be fitted
a0 = 1;   # offset, 2nd-order polynomial
a1 = 3;   # linear term, 2nd-order polynomial
a2 = 0.5; # quadratic term, 2nd-order polynomial
b1 = 1.5; # power term: base
b2_list = [1:0.2:2]; # list of exponent factors in the power term
x = [1:10]'; # just some x-values (column vector, so the assignment below conforms)
# Create simulation data
c1 = 0; # counter
for b2 = b2_list
  c1++; # increment counter
  # HERE: FOR SIMULATION PURPOSES USE:
  # f(x) = b1^(b2*x) * (a0 + a1*x + a2*x^2)
  y(:,c1) = b1.^(b2.*x).*(a0 + a1.*x + a2.*x.^2);
  # Data is stored in a matrix; it could be handled otherwise if reasonable/required...
endfor
# Left for now: add some noise to the data
# y = y + 0.25.*randn (size (y));
# Simulation data ready
# Plot the data
figure (1)
plot (x, y, 'o');
# To solve:
# Find the parameters a0, a1, a2 and all entries of b2_list using
# a regression/fit routine that makes use of the fact that a0, a1, a2 are
# equal for all columns of y
I am very happy about any further advice and help.
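One way to encode this "shared" structure for any generic least-squares routine (leasqr, nonlin_curvefit, or anything else) is the glom-everything-into-one-vector approach Allen mentioned. A sketch in Python, mirroring the Octave simulation above (the exponent is written simply as b2*x): the shared a0, a1, a2, b1 go into the vector once, followed by one b2 per column, and the residual function unpacks it.

```python
import math

xs = [float(v) for v in range(1, 11)]                # x = 1:10
b2_list = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
p_true = [1.0, 3.0, 0.5, 1.5] + b2_list             # a0, a1, a2, b1, b2_1..b2_6

def column(a0, a1, a2, b1, b2):
    # f(x) = b1^(b2*x) * (a0 + a1*x + a2*x^2), one data column
    return [b1 ** (b2 * x) * (a0 + a1 * x + a2 * x * x) for x in xs]

data = [column(1.0, 3.0, 0.5, 1.5, b2) for b2 in b2_list]

def residuals(p):
    # Unpack: the first four entries are shared by every column, the
    # remaining ones map one-to-one onto the data columns.  A generic
    # optimizer minimizing the stacked residual therefore fits all
    # columns simultaneously with common a0, a1, a2, b1.
    a0, a1, a2, b1 = p[:4]
    r = []
    for b2, ys in zip(p[4:], data):
        r += [m - y for m, y in zip(column(a0, a1, a2, b1, b2), ys)]
    return r

print(max(abs(v) for v in residuals(p_true)))        # 0.0 at the true parameters
```

Note, by the way, that b1 and the b2_i enter this model only through the products b2_i*ln(b1), so a fit can determine those products but not b1 separately unless b1 (or one of the b2_i) is held fixed.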


On 08.12.2015 12:51, JokerOne wrote:
> …
>
> I am very happy about any further advice and help.
Hi Maximilian,
I have used your code and done the fitting with the latest release of
the interval package. See
http://cursosing.net/octavers/gnuoctave/listing/580intervalsharedparameterestimation

Estimated parameters:
a0 ⊂ [1, 1.0648]
a1 ⊂ [3, 3.0688]
a2 ⊂ [0.5, 0.51124]
b1 ⊂ [1.5, 1.5042]
b2_list =
{
[1,1] ⊂ [1, 1.008]
[1,2] ⊂ [1.1935, 1.2064]
[1,3] ⊂ [1.3935, 1.4064]
[1,4] ⊂ [1.5928, 1.6063]
[1,5] ⊂ [1.7926, 1.8062]
[1,6] ⊂ [1.9902, 2]
}
Apparently the estimation works quite well (as long as you don't add too
much uncertainty to the observed values of y).
Best regards
Oliver


Hello,
just wanted to let you know that this thread is still of interest to me. I will look into your advice as soon as I find the time. Thank you again.

