Fitting multiple datasets to "partially" the same model (global fit with shared parameters) Classic List Threaded 10 messages Open this post in threaded view
|

Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 Hello everybody, I am looking for an Octave solution for a task I manage to perform in Origin, i.e. it appears possible from the mathematical point of view. However, I have no idea, how to solve it in Octave, which I desire: I like to fit a model curve to a number of datasets, say data1 = [x,y1] data2 = [x y2] ... the number of datasets is in the order of 10 (i.e. not many more). The model to that data is something like: F(x) = f(x) * g(x), where f(x) depens on some fitting parameters, say a0, a1, a2,... and g(x) depends on b0, b1,... Now I would like to fit all data with the same model simultaneously making use of my knowledge, that the parameters of g(x) (i.e. b0, b1,...) are equal for all the datasets. For example, my model function is: F(x) =  ( a0*ln(x) )* (b0+b1*x) with b0, b1 equal for all datasets. In Origin, such a situation is called "global fit" with "shared parameters" or similiar. Is there any similiar already implemented in Octave, yet? Any help is appreciated.
Open this post in threaded view
|

Re: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 On Mon, Dec 07, 2015 at 06:32:56AM -0800, JokerOne wrote: > Hello everybody, > > I am looking for an Octave solution for a task I manage to perform in > Origin, i.e. it appears possible from the mathematical point of view. > However, I have no idea, how to solve it in Octave, which I desire: > > I like to fit a model curve to a number of datasets, say > > data1 = [x,y1] > data2 = [x y2] > ... > the number of datasets is in the order of 10 (i.e. not many more). > > The model to that data is something like: > > F(x) = f(x) * g(x), where > > f(x) depens on some fitting parameters, say a0, a1, a2,... > and g(x) depends on b0, b1,... > > Now I would like to fit all data with the same model simultaneously making > use of my knowledge, that the parameters of g(x) (i.e. b0, b1,...) are equal > for all the datasets. > > For example, my model function is: > F(x) =  ( a0*ln(x) )* (b0+b1*x) > with b0, b1 equal for all datasets. What do you mean by saying some parameters are "equal for all datasets"? If a parameter is different for a different dataset, it is a different parameter, not the same one. In which way do you want to make use of your knowledge, that ... are "equal for all the datasets"? Do you want on the one hand: a common name for accessing a set of parameters, each applying to a different dataset, and on the other hand: a name for accessing a parameter applying to all datasets? Then there is something in the 'optim' package you could use. Or do you mean that certain parameters affect all datasets, and certain parameters affect only one dataset each? But this is not how you explained it. Or what else? Olaf -- public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net _______________________________________________ Help-octave mailing list [hidden email] https://lists.gnu.org/mailman/listinfo/help-octave signature.asc (836 bytes) Download Attachment
Open this post in threaded view
|

Re: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 Yes, my explanation was not precise. Indeed, the knowledge I was talking about is about the values of the parameters. So, indeed, the model function(s) for the dataset 1 to n should be noted, in my example, like: F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1 F_2(x) = (a0_2*ln(x)) * (b1 + b2*x) for dataset 2 ... F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n With "unique" a0_i for all datasets and common parameters b1, b2. Now, the fitting procedure could be performed simultaneously, I guess. At least, if this should be of some use? I hope that clarifies my question.
Open this post in threaded view
|

RE: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 > -----Original Message----- > From: help-octave-bounces+allen.windhorn=[hidden email] > > Indeed, the knowledge I was talking about is about the *values* of the > parameters. > > So, indeed, the model function(s) for the dataset 1 to n should be noted, > in my example, like: > > F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1 > F_2(x) = (a0_2*ln(x)) * (b1 + b2*x) for dataset 2 ... > F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n > > With "unique" a0_i for all datasets and common parameters b1, b2. Now, > the fitting procedure could be performed simultaneously, I guess. At > least, if this should be of some use? > > I hope that clarifies my question. If your penalty or error function is the sum of the error functions for all the equations then I think leasqr or another optimization routine can be used without modification -- just glom all the parameters into one big vector and solve.  But the result and the time to solve may not be optimum. I realize your example function may not be realistic, but if I were trying to find coefficients for it I would rearrange it to one that is easier to solve.  If the ln(x) term doesn't vary over a wide range (i.e. including zero) you could define G_i(x) = F_i(x)/ln(x) and p0_i = -1/a0_i and the equation would become p0_i*G_i(x) + b1 + b2(x) = 0, which is a linear equation and can be solved using linear least squares.  The modified problem doesn't have the same solution as the original, but it should be fairly close, maybe needs a final nonlinear step or two to clean it up. Hmmm, quite a three-pipe problem...  Thanks.  Any chance getting a look at a typical dataset? Regards, Allen   _______________________________________________ Help-octave mailing list [hidden email] https://lists.gnu.org/mailman/listinfo/help-octave
Open this post in threaded view
|

Re: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 In reply to this post by JokerOne On 07.12.2015 16:27, JokerOne wrote: > Yes, my explanation was not precise. > > Indeed, the knowledge I was talking about is about the *values* of the > parameters. > > So, indeed, the model function(s) for the dataset 1 to n should be noted, in > my example, like: > > F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1 > F_2(x) = (a0_2*ln(x)) * (b1 + b2*x) for dataset 2 > ... > F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n > > With "unique" a0_i for all datasets and common parameters b1, b2. Now, the > fitting procedure could be performed simultaneously, I guess. At least, if > this should be of some use? > > I hope that clarifies my question. Maximilian, this functionality is going to be part of the upcoming interval package release, which is planned for December or January. With the interval package you will be able to define a function, which maps from your parameter space into your dataset space, and do a set inversion for your constraints on the observed datasets. This sounds more complicated than it is! You will do the following: function y = F (b1, b2)         A = [a_1; a_2; ...; a_n];         X = [x_1; x_2; ...; x_n];         y = A .* log (X) .* (b1 + b2 .* X); endfunction Y = [y_1; y_2; ...; y_n]; fsolve (@F, infsup ("[-10, 10] [-10, 10]"), Y) Here I assume that you have datasets [x_1, y_1], ... [x_n, y_n], which may either be exact values or ranges of values. Parameters a_1, ..., a_n are assumed to be known and constant values or constant ranges (otherwise you would have to add a to your parameter space, which only contains b at the moment). The parameters b1 and b2 are searched for in the interval [-10, 10]. The return value of fsolve will be an enclosure of possible values for b1 and b2. It is also possible to plot the possible values from the parameter space. For example, the attachement shows a set inversion for the dataset hypot (b1, b2) ∈ [0.5, 2]. The details will be explained in the documentation of the upcoming release. Best regards Oliver _______________________________________________ Help-octave mailing list [hidden email] https://lists.gnu.org/mailman/listinfo/help-octave fsolve_set_inversion.png (28K) Download Attachment
Open this post in threaded view
|

Re: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 In reply to this post by JokerOne On Mon, Dec 07, 2015 at 07:27:33AM -0800, JokerOne wrote: > Yes, my explanation was not precise. > > Indeed, the knowledge I was talking about is about the *values* of the > parameters. > > So, indeed, the model function(s) for the dataset 1 to n should be noted, in > my example, like: > > F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1 > F_2(x) = (a0_2*ln(x)) * (b1 + b2*x) for dataset 2 > ... > F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n > > With "unique" a0_i for all datasets and common parameters b1, b2. Now, the > fitting procedure could be performed simultaneously, I guess. At least, if > this should be of some use? > > I hope that clarifies my question. If I understand you right, both the a0_i and the b1, b2 are to be determined by fitting, but the a0_i affect only the fit within a specific dataset each. The advantage an algorithm can take from this situation is within the procedure to calculate the Jacobian by finite differencing. There, each parameter separately is changed a bit and the user function is called each time. In your situation it's known that a change in a0_i only affects the calculated values for dataset i, so the values for the other datasets need not be recalculated, which can give a decisive shortening of overall optimization time. There is currently no general optimization algorithm in Octave which does this automatically, by mapping parameters to datasets. (Maybe in your special case you can use the upcoming routine Oliver told you about, I don't know.) But if you use nonlin_residmin or nonlin_curvefit of the optim package, you can implement the desired behavior in your user function. To enable this, these optimizers call the user function with an additional informational argument during calculating the Jacobian by finite differencing. This additional argument (a structure) contains, among others, the positional index of the parameter which is currently changed a bit for calculating the Jacobian. Using this index, your user function can determine if the currently changed parameter belongs to a specific dataset and omit the calculations for all other datasets (i.e. return e.g. a vector of zeros for them). Instead of using the index, you could probably also use named parameters and use the parameter name instead. The details for the above should be available in the documentation of nonlin_residmin (you have to call 'optim_doc' to get this part of the documentation). There has been a similar question in the past, but as yet I havn't found the thread. Olaf -- public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net _______________________________________________ Help-octave mailing list [hidden email] https://lists.gnu.org/mailman/listinfo/help-octave signature.asc (836 bytes) Download Attachment
Open this post in threaded view
|

Re: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 On 08.12.2015 10:31, Olaf Till wrote: > On Mon, Dec 07, 2015 at 07:27:33AM -0800, JokerOne wrote: >> Yes, my explanation was not precise. >> >> Indeed, the knowledge I was talking about is about the *values* of the >> parameters. >> >> So, indeed, the model function(s) for the dataset 1 to n should be noted, in >> my example, like: >> >> F_1(x) = (a0_1*ln(x)) * (b1 + b2*x) for dataset 1 >> F_2(x) = (a0_2*ln(x)) * (b1 + b2*x) for dataset 2 >> ... >> F_n(x) = (a0_n*ln(x)) * (b1 + b2*x) for dataset n >> >> With "unique" a0_i for all datasets and common parameters b1, b2. Now, the >> fitting procedure could be performed simultaneously, I guess. At least, if >> this should be of some use? >> >> I hope that clarifies my question. > > If I understand you right, both the a0_i and the b1, b2 are to be > determined by fitting, but the a0_i affect only the fit within a > specific dataset each. … > There is currently no general optimization algorithm in Octave which > does this automatically, by mapping parameters to datasets. (Maybe in > your special case you can use the upcoming routine Oliver told you > about, I don't know.) Yes, this is also possible. You could use the following function for set inversion and then operate on a larger parameter space: function y = F (a1, …, aN, b1, b2)         A = [a1; a2; ...; aN];         X = [x_1; x_2; ...; x_n];         y = A .* log (X) .* (b1 + b2 .* X); endfunction However, depending on the size of your parameter space it is wise to get familiar with interval arithmetic and provide a “contractor” to speed up the algorithm. That is, your function will also compute refinements for its parameters. Please find a script attached, which does this kind of search for 5 parameters. The script takes 5 seconds on my notebook. The plotting shows that the parameter space can be separated into b1 < 1 and b1 > 1. The 3D plotting of parameters a1 vs. b1 and b2 looks a little bit strange, maybe my function contains an error, I didn't look so much into detail. _______________________________________________ Help-octave mailing list [hidden email] https://lists.gnu.org/mailman/listinfo/help-octave screenshot.png (18K) Download Attachment parameter_estimation.m (1K) Download Attachment screenshot-3d.png (89K) Download Attachment signature.asc (484 bytes) Download Attachment
Open this post in threaded view
|

Re: Fitting multiple datasets to "partially" the same model (global fit with shared parameters)

 Thanks everybody for your effort, since I am not too common to the maths in these regression/optimization routines it will take be a while to understand your advises. Allen asks for some actual data. However, there is no "real" data, yet. For my tests I just simulated some data in a simple way . Here, I replaced the logaritmic term with a power term to underline that the logaritmic term in indeed just an example function as Allen assumed correctly. I do not know the function that will suit my final task, yet. I am just checking the basic concept of the fitting routine with some example functions. # Create simulated data for fit using "shared" parameter # Define Parameters to be fitted a0   =    1;   #    offset poly 2nd order a1   =    -3;  #    linear term poly 2nd order a2   =    0.5; #    quadratic poly 2nd order b1   =      1.5; #    power term : base   b2_list   =    [1:0.2:2]; #  list of exponential factors in power term x    =    [1:10];     # Just some x-values #    Create simulation data c1   = 0;        # counter for b2 = b2_list      c1++;     # increment counter      #    HERE: FOR SIMULATTION PURPOSE USE :      #    f(x)      =    b1^(-1*b2*x) * (a0+a1*x+a2*x^2)      y(:,c1)  =    b1.^(-1.*b2.*x).*(a0 + a1.*x +a2.*x.^2);      # Data is sorted in a matrix, could be handled otherwise if reasonable/required... endfor # left for now : Add some noise to the data # y  =    y +0.25.* randn(size(y)); # Simulation data ready #   plot data figure(1) plot(x,y,'o'); #    To solve: #    Find parameters a0, a1, a2 and all parameters of b2_list using #    a regression/fit routine that makes uses of fact, that a0,a1,a2 are #    equal for all colums in y I am very happy for any further advises and help.