octave dev slow down

14 messages

octave dev slow down

Dmitri A. Sergatskov
I ran some benchmarks with the recent dev vs 4.2.1 and noticed a significant slowdown
for the following test:

a=randn(4000);
tic; a'*a; toc
For both runs I set LD_PRELOAD=/usr/lib64/atlas/libtatlas.so (threaded ATLAS, marked (t) below)
and then repeated with LD_PRELOAD=/usr/lib64/atlas/libsatlas.so (serial ATLAS, marked (s)).

4.2.1:

Elapsed time is 1.61557 seconds. (t)
Elapsed time is 5.02009 seconds. (s)

hg id
ed2239ed5fd3 tip @

Elapsed time is 2.78834 seconds. (t)
Elapsed time is 9.69433 seconds. (s)

Most other benchmarks (like inv(a)*a, a\b) are pretty much the same.

Dmitri.
--



Re: octave dev slow down

Mike Miller-4
On Mon, May 22, 2017 at 23:48:10 -0500, Dmitri A. Sergatskov wrote:

> I run some benchmarks with the recent dev vs 4.2.1 and noticed a
> significant slowdown
> for the following test:
>
> a=randn(4000);
> tic; a'*a; toc
> For both runs I LD_PRELOAD=/usr/lib64/atlas/libtatlas.so
> and then repeat with LD_PRELOAD=/usr/lib64/atlas/libsatlas.so
> 4.2.1:
>
> Elapsed time is 1.61557 seconds. (t)
> Elapsed time is 5.02009 seconds. (s)
>
> hg id
> ed2239ed5fd3 tip @
>
> Elapsed time is 2.78834 seconds. (t)
> Elapsed time is 9.69433 seconds. (s)
>
> Most of other benchmarks (like inv(a)*a, a\b are pretty much the same).

Confirmed here. I bisected and found a lot of performance loss starting
with c452180ab672. Its immediate predecessor f4d4d83f15c5 has about the
same performance as 4.2.1. If you can compare those two revisions and
confirm, that's a good place to start looking for a cause.

--
mike


Re: octave dev slow down

John W. Eaton
Administrator
On 05/23/2017 04:37 AM, Mike Miller wrote:

> Confirmed here. I bisected and found a lot of performance loss starting
> with c452180ab672. Its immediate predecessor f4d4d83f15c5 has about the
> same performance as 4.2.1. If you can compare those two revisions and
> confirm, that's a good place to start looking for a cause.

What was your test for performance here?

I recall timing "make check" when I made those changes and did not see a
significant change in performance.

If I have something to test, I'll take a look at it.

jwe



Re: octave dev slow down

Mike Miller-4
On Tue, May 23, 2017 at 12:16:07 -0400, John W. Eaton wrote:

> On 05/23/2017 04:37 AM, Mike Miller wrote:
>
> > Confirmed here. I bisected and found a lot of performance loss starting
> > with c452180ab672. Its immediate predecessor f4d4d83f15c5 has about the
> > same performance as 4.2.1. If you can compare those two revisions and
> > confirm, that's a good place to start looking for a cause.
>
> What was your test for performance here?
>
> I recall timing "make check" when I made those changes and did not see a
> significant change in performance.
>
> If I have something to test, I'll take a look at it.

I ran Dmitri's test case a handful of times at each build revision. I
get a distinct difference between f4d4d83f15c5 and c452180ab672, all
other things being equal. I'm using OpenBLAS instead of ATLAS.

I ran multiple Octave sessions with -cli -W, built without Qt to speed
up bisecting, using the test case "x = rand(4000); tic; x'*x; toc".

f4d4d83f15c5: mean is 0.63071 seconds, std dev is 0.0024187.

c452180ab672: mean is 1.1713 seconds, std dev is 0.11803.

This is the test case that I used to bisect and the results stayed
consistent and converged on this revision.

--
mike


Re: octave dev slow down

John W. Eaton
Administrator
On 05/23/2017 01:12 PM, Mike Miller wrote:

> On Tue, May 23, 2017 at 12:16:07 -0400, John W. Eaton wrote:
>> On 05/23/2017 04:37 AM, Mike Miller wrote:
>>
>>> Confirmed here. I bisected and found a lot of performance loss starting
>>> with c452180ab672. Its immediate predecessor f4d4d83f15c5 has about the
>>> same performance as 4.2.1. If you can compare those two revisions and
>>> confirm, that's a good place to start looking for a cause.
>>
>> What was your test for performance here?
>>
>> I recall timing "make check" when I made those changes and did not see a
>> significant change in performance.
>>
>> If I have something to test, I'll take a look at it.
>
> I ran Dmitri's test case a handful of times at each build revision. I
> get a distinct difference between f4d4d83f15c5 and c452180ab672, all
> other things being equal. I'm using OpenBLAS instead of ATLAS.
>
> I ran multiple Octave sessions with -cli -W, built without Qt to speed
> up bisecting, using the test case "x = rand(4000); tic; x'*x; toc".
>
> f4d4d83f15c5: mean is 0.63071 seconds, std dev is 0.0024187.
>
> c452180ab672: mean is 1.1713 seconds, std dev is 0.11803.
>
> This is the test case that I used to bisect and the results stayed
> consistent and converged on this revision.

Thanks, it should be fixed now with the latest two changesets that I pushed.

The implementation of the compound binary expression object is a bit
tricky and I made a mistake when I translated the rvalue1 operation to a
tree_evaluator::visit* function.

I'm sure the reason that I didn't see anything significant in my tests
was that I only looked at the overall performance of running the test
suite, not any one operation individually.  I wasn't expecting much
difference in performance in each evaluation step.  I was more concerned
with whether using stack objects to hold function results would perform
worse than returning values from the rvalue functions.
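The shape of the change jwe describes can be sketched in miniature. This is a toy with invented names (integer expressions, a simple `value_stack`), not Octave's actual tree_evaluator or octave_value classes; it only illustrates an evaluator that visits tree nodes and keeps intermediate results in stack objects instead of having each rvalue-style call return its value:

```cpp
#include <cassert>
#include <vector>

struct literal;
struct binary_expr;

// Visitor that walks the expression tree and holds intermediate results
// on a value stack, rather than returning them from each node's method.
struct tree_evaluator
{
  std::vector<int> value_stack;

  void visit_literal (const literal& e);
  void visit_binary (const binary_expr& e);

  int result () const { return value_stack.back (); }
};

struct expr
{
  virtual ~expr () = default;
  virtual void accept (tree_evaluator& tw) const = 0;
};

struct literal : expr
{
  int v;
  explicit literal (int v) : v (v) { }
  void accept (tree_evaluator& tw) const override { tw.visit_literal (*this); }
};

struct binary_expr : expr
{
  char op;
  const expr& lhs;
  const expr& rhs;
  binary_expr (char op, const expr& l, const expr& r)
    : op (op), lhs (l), rhs (r) { }
  void accept (tree_evaluator& tw) const override { tw.visit_binary (*this); }
};

void
tree_evaluator::visit_literal (const literal& e)
{
  value_stack.push_back (e.v);
}

void
tree_evaluator::visit_binary (const binary_expr& e)
{
  // Evaluate operands, then combine the two results on top of the stack.
  e.lhs.accept (*this);
  e.rhs.accept (*this);
  int r = value_stack.back (); value_stack.pop_back ();
  int l = value_stack.back (); value_stack.pop_back ();
  value_stack.push_back (e.op == '*' ? l * r : l + r);
}

int
eval (const expr& e)
{
  tree_evaluator tw;
  e.accept (tw);
  return tw.result ();
}
```

A translation slip in one visit method, such as routing a specialized compound node through the generic path, stays functionally correct but silently loses the fast path, which is consistent with a regression that only a targeted benchmark catches.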

jwe



Re: octave dev slow down

Michael Godfrey


On 05/23/2017 07:42 PM, John W. Eaton wrote:
> Thanks, it should be fixed now with the latest two changesets that I pushed.
> [...]

I have done some comparisons between 4.0.3 and the current dev (be69ea3de7a3 tip @), as well as some
previous dev revisions, and typically I see:

4.3.0+
test 2: cputime used: 9.2e-01 seconds

4.0.3   /usr/bin/octave --no-gui
test 2: cputime used: 6.4e-01 seconds

Initially, I was checking Rik's conversion of the elementary functions to C++ std (which all seem to be
alright), but I noticed the large timing difference.  The code that I used spends most of its time transforming
complex-valued arrays using exp(), atanh(), etc.  Since I ran some tests prior to Rik's new code, it appears
that the cause is not the new std functions.

Michael Godfrey


Re: octave dev slow down

Rik-4
On 06/19/2017 07:37 AM, Michael D Godfrey wrote:


> I have done some comparisons between 4.0.3 and the current dev
> be69ea3de7a3 tip @ (also some previous devs) and typically I see:
>
> 4.3.0+
> test 2: cputime used: 9.2e-01 seconds
>
> 4.0.3   /usr/bin/octave --no-gui
> test 2: cputime used: 6.4e-01 seconds
>
> Initially, I was checking Rik's conversion of the elementary functions to
> C++ std (which seem to be all alright) but I noticed the large timing
> difference.  The code that I used spends most of its time transforming
> complex-valued arrays using exp(), atanh(), etc.  Since I ran some tests
> prior to Rik's new code, it appears that the cause is not the new std
> functions.

Michael,

Thanks for noticing this.  If the issue is a slowdown with complex-valued arrays then maybe you can re-test in about a week?  At the moment I am converting many of the basic mapper functions, which used to dispatch to gnulib, Fortran, or even our own hand-rolled C++ code, to instead dispatch to the C++ standard library.  Besides making the code simpler and reducing our external dependencies during configure, Octave will then sit squarely atop the standard library, which is a well-debugged and well-coded piece of software.

My next task, after the basic functions, is to look at how the mapper functions are implemented for complex values.  Currently, we often hand code our own functions for complex values.  However, std::complex already includes templates for some of the basic math functions.  I would like to switch over to using the standard templates whenever possible which might improve performance.

--Rik


Re: octave dev slow down

Michael Godfrey


On 06/19/2017 04:32 PM, Rik wrote:
> Thanks for noticing this.  If the issue is a slow down in complex-valued
> arrays then maybe you can re-test in about a week?  [...]
>
> My next task, after the basic functions, is to look at how the mapper
> functions are implemented for complex values.  Currently, we often hand
> code our own functions for complex values.  However, std::complex already
> includes templates for some of the basic math functions.  I would like to
> switch over to using the standard templates whenever possible which might
> improve performance.

Rik,

I did not fully appreciate how much work you were doing!  But keep in mind that the loss of performance seems
to have occurred before you started, so perhaps your changes will recover some of it.  In any case, I will run
the same tests as soon as you are done.

Thanks!
Michael

Re: octave dev slow down

John W. Eaton
Administrator
On 06/19/2017 01:24 PM, Michael D Godfrey wrote:
> On 06/19/2017 04:32 PM, Rik wrote:
>> On 06/19/2017 07:37 AM, Michael D Godfrey wrote:

>>> I have done some comparisons between 4.0.3 and the current dev
>>> be69ea3de7a3 tip @ (also some previous devs)
>>> and typically I see:
>>>
>>> 4.3.0+
>>> test 2: cputime used: 9.2e-01 seconds
>>>
>>> 4.0.3   /usr/bin/octave --no-gui
>>> test 2: cputime used: 6.4e-01 seconds
>>>
>>> Initially, I was checking Rik's conversion of the elementary
>>> functions to C++ std (which seem to be all
>>> alright) but I noticed the large timing difference.  The code that I
>>> used spends most of its time transforming
>>> complex-valued arrays using exp(), atanh(), etc. Since I ran some
>>> tests prior to Rik's new code, it appears
>>> that the cause is not the new std functions.

Can you share the code you use for testing?

My intent is not to make Octave slower.  However, I can tolerate a
little (hopefully temporary) decrease in performance in exchange for
code that is clearer and easier to maintain or that is less likely to
lead to memory leaks.  That's my primary goal right now.

>> Thanks for noticing this.  If the issue is a slow down in
>> complex-valued arrays then maybe you can re-test in about a week?  At
>> the moment I am converting many of the basic mapper functions which
>> used to dispatch to gnulib, Fortran, or even our own hand-rolled C++
>> code, to instead dispatch to the C++ standard library.  Besides making
>> the code simpler, and reducing our external dependencies during
>> configure, Octave will now sit squarely atop the standard library
>> which is a well-debugged and well-coded piece of software.
>>
>> My next task, after the basic functions, is to look at how the mapper
>> functions are implemented for complex values.  Currently, we often
>> hand code our own functions for complex values.  However, std::complex
>> already includes templates for some of the basic math functions.  I
>> would like to switch over to using the standard templates whenever
>> possible which might improve performance.

Could it also improve performance to use templates differently so that
we can avoid passing function pointers?  I'm thinking that using
functions as template parameters allows inlining where passing function
pointers as parameters to a mapping function does not.  But I'm not sure
whether that's correct.  We currently have a mixture of these styles.  I
guess it would be good to figure out which is better and be consistent
if possible.
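The inlining question can be shown with a toy sketch (hypothetical helper names, and scalar rather than array mappers, so this is not Octave's real mapper code):

```cpp
#include <cassert>

// A function we might want to map over data.
double
square (double x)
{
  return x * x;
}

// Style 1: the callee arrives as a run-time function pointer.  The call
// through 'fcn' is indirect, so the compiler generally cannot inline it
// unless it can prove the pointer's target.
double
map_ptr (double (*fcn) (double), double x)
{
  return fcn (x);
}

// Style 2: the callee is a template argument, so each instantiation
// hard-codes the target function and the optimizer is free to inline it.
template <double fcn (double)>
double
map_tmpl (double x)
{
  return fcn (x);
}
```

Whether the indirect call is really slower in practice depends on the optimizer (a constant pointer can often be devirtualized), so this would need measurement to confirm.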

jwe



Re: octave dev slow down

Michael Godfrey


On 06/19/2017 11:18 PM, John W. Eaton wrote:
> Can you share the code you use for testing?
>
> My intent is not to make Octave slower.  However, I can tolerate a little
> (hopefully temporary) decrease in performance in exchange for code that is
> clearer and easier to maintain or that is less likely to lead to memory
> leaks.  That's my primary goal right now.

John,

I understand, and of course strongly support, the work that you and Rik are doing right now.  This will surely
make Octave better and more easily maintained.  I just felt that I should remind you of the performance issue,
which was mentioned here a while ago.  Currently the code that I have been using is a segment of a fairly large
program; I have just planted cputime() calls at convenient points.  I will look at extracting a block of code
as a standalone test if that turns out to be needed.  Right now it appears that any code that applies functions
like exp() or tanh() to fairly long, mostly complex vectors will show the slowdown that I have observed.  But I
may be doing something that is inefficient for some specific reason.

Michael

Re: octave dev slow down

Rik-4
On 06/19/2017 03:18 PM, John W. Eaton wrote:
>
> Could it also improve performance to use templates differently so that we
> can avoid passing function pointers?  I'm thinking that using functions
> as template parameters allows inlining where passing function pointers as
> parameters to a mapping function does not.  But I'm not sure whether
> that's correct.  We currently have a mixture of these styles.  I guess it
> would be good to figure out which is better and be consistent if possible.

I haven't figured out a good way to do profiling on Octave, and I think
that should be the first step.  I have tried to optimize programs without
doing measurements first, and the root cause invariably turns out *not* to
be what I thought it was.

My current speculation--this is without confirmation--is that there are a
lot of temporaries being created whenever any object is created.  For
example, calling the constructor for an array should generally be done with
a dim_vector object.  The dim_vector exists just long enough for the array
constructor to be called and then is destroyed.  If we don't have
lightweight constructors and destructors for frequently used objects like
this then there will be a performance problem.

Another idea I have is that we might start to use move constructors for
some objects.  This would allow transfer of ownership without calling the
copy constructor and might alleviate the problems with heavyweight
constructors (if indeed, they are an actual problem).
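Both hypotheses can be sketched with a toy example (invented classes, not Octave's real dim_vector or Array types, whose internals differ): a copy counter shows the short-lived dims temporary being moved, rather than deep-copied, into the array.

```cpp
#include <cassert>
#include <initializer_list>
#include <utility>
#include <vector>

// Counts how many deep copies of the dimension object were made.
static int deep_copies = 0;

struct dims
{
  std::vector<int> d;

  dims (std::initializer_list<int> il) : d (il) { }

  // Heavyweight: duplicates the storage.
  dims (const dims& other) : d (other.d) { ++deep_copies; }

  // Cheap: steals the storage from a temporary.
  dims (dims&& other) noexcept : d (std::move (other.d)) { }
};

struct array
{
  dims dv;

  // Take the dimension object by value and move it into the member, so a
  // temporary passed by the caller is never deep-copied.
  explicit array (dims dv) : dv (std::move (dv)) { }
};
```

With the move constructor, constructing `array a (dims {4000, 4000})` leaves the copy count at zero; delete the move constructor and every such temporary pays for a deep copy instead.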

--Rik



Re: octave dev slow down

Michael Godfrey


On 06/20/2017 12:42 AM, Rik wrote:
> My current speculation--this is without confirmation--is that there are a
> lot of temporaries being created whenever any object is created.  [...]
>
> Another idea I have is that we might start to use move constructors for
> some objects.  This would allow transfer of ownership without calling the
> copy constructor and might alleviate the problems with heavyweight
> constructors (if indeed, they are an actual problem).


Rik,

Just a note: I did some cleanup on the code sections that I have been timing:
Old code:
4.3.0+
test 2: cputime used: 9.2e-01 seconds

4.0.3   /usr/bin/octave --no-gui
test 2: cputime used: 6.4e-01 seconds

New code:
4.3.0+
test 2: cputime used: 5.1395e-01 seconds

4.0.3
test 2: cputime used: 3.4943e-01 seconds

The moral is that it pays to spend time on code cleanup.
The cleanup more than offset the 4.3.0+ slowdown,
though of course 4.3.0+ is still slower by about the same amount.

Michael


Re: octave dev slow down

jbect
Michael Godfrey wrote:
> Moral of this is that it pays to spend time on code cleanup.
> This more than offset the 4.3.0+ slowdown.
> Of course, 4.3.0+ is still slower by about the same amount.
Hi everyone,

I also noticed a similar slowdown when benchmarking the stk_predict function in the stk package.

Here are the median times in ms (the first two digits should be significant, sorry about the others):

|             :  med_R2016  med_Oct403  med_Oct421  med_Oct430
|     1 /   1 :   0.614500     4.76944     4.66347     6.59025
|    10 /   1 :   0.780500     4.45724     4.36750     6.11651
|   100 /   1 :   1.554438     7.59995     7.41750    10.15353
|     1 /  10 :   0.761500     4.46904     4.37081     6.12175
|    10 /  10 :   0.779438     6.72776     6.62458     9.55874
|   100 /  10 :   1.512875     7.74771     7.55823    10.30993
|     1 / 100 :   0.580828     4.53871     4.40472     6.18076
|    10 / 100 :   0.703375     6.48999     6.31845     9.03147
|   100 / 100 :   1.917000     9.03255     8.49599    11.50894
|    1 / 1000 :   0.758219     4.59623     4.56923     6.31821
|   10 / 1000 :   1.198875     7.10303     7.03001     9.74554
|  100 / 1000 :   5.823250    16.21902    15.56194    16.60883


About the column names:
 * Oct403 is the 4.0.3 release from Debian repos
 * Oct421 is compiled from source from the tip of stable
 * Oct430 is compiled from source from the tip of default
 * (the first column... well, you can probably guess)

About the row names: they correspond to various configurations for the sizes of the input arguments.

Basically: when these numbers are large, we are essentially benchmarking linear algebra operations (I think).

The results can be reproduced as follows:

hg clone -r 2.5.x http://hg.code.sf.net/p/kriging/hg stk-test
cd stk-test
octave &

stk_init
pkg load statistics
cd misc/benchmarks
stk_benchmark_predict

@++
Julien

Re: octave dev slow down

jbect
On 12/07/2017 at 11:35, jbect wrote:
> I also noticed a similar slowdown when benchmarking the stk_predict
> function in the stk package.
>
> Here are the median times in ms [...]

This is weird: the table disappeared from my first email.  Here it is:

|             :  med_R2016  med_Oct403  med_Oct421  med_Oct430
|     1 /   1 :   0.614500     4.76944     4.66347     6.59025
|    10 /   1 :   0.780500     4.45724     4.36750     6.11651
|   100 /   1 :   1.554438     7.59995     7.41750    10.15353
|     1 /  10 :   0.761500     4.46904     4.37081     6.12175
|    10 /  10 :   0.779438     6.72776     6.62458     9.55874
|   100 /  10 :   1.512875     7.74771     7.55823    10.30993
|     1 / 100 :   0.580828     4.53871     4.40472     6.18076
|    10 / 100 :   0.703375     6.48999     6.31845     9.03147
|   100 / 100 :   1.917000     9.03255     8.49599    11.50894
|    1 / 1000 :   0.758219     4.59623     4.56923     6.31821
|   10 / 1000 :   1.198875     7.10303     7.03001     9.74554
|  100 / 1000 :   5.823250    16.21902    15.56194    16.60883