indexing expression performance

indexing expression performance

Nicholas Tung-2
I've noticed that indexing expressions on large arrays can be quite slow, even if the indexes are mostly sequential. In the code below, the first loop takes 1.91 seconds while the second takes 0.11 seconds. Any suggestions on how to get around this would be much appreciated, even looking at Octave's source.

a = rand(300);
tic
for i = 1:100
    a(1:(size(a, 1) - 1), :) += a(2:size(a, 1), :);
endfor
toc

tic
for i = 1:100
    a += a;
endfor
toc

thanks,
Nicholas


Re: indexing expression performance

Jaroslav Hajek-2
On Tue, Jan 13, 2009 at 7:13 PM, Nicholas Tung <[hidden email]> wrote:

> I've noticed that indexing expressions on large arrays can be quite slow,
> even if the indexes are mostly sequential. In the code below, the first loop
> takes 1.91 seconds while the second takes 0.11 seconds. Any suggestions on
> how to get around this would be much appreciated, even looking at Octave's
> source.
>
> a = rand(300);
> tic
> for i = 1:100
>     a(1:(size(a, 1) - 1), :) += a(2:size(a, 1), :);
> endfor
> toc
>
> tic
> for i = 1:100
>     a += a;
> endfor
> toc
>
> thanks,
> Nicholas
>

What version of Octave's sources did you use? Dense indexing speed has
been vastly improved in the development sources (but after 3.1.51,
IIRC).



--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

Re: indexing expression performance

Francesco Potortì
In reply to this post by Nicholas Tung-2
>I've noticed that indexing expressions on large arrays can be quite slow,

Have a look at this, which I wrote in November but for some reason does
not appear in the list archives.  In an application of mine, I wrote an
ad hoc function to access a 5-dim matrix using the last method below,
getting a significant speedup.  I usually run 3.0.1, and I did not make
any check to compare its speed with the more recent Octave versions.

|Date: 11 Nov 2008 10:44:45 +0100
|From: Francesco Potortì <[hidden email]>
|To: Jaroslav Hajek <[hidden email]>
|CC: Octave help list <[hidden email]>
|Subject: Re: slow access to consecutive matrix elements
|
|>On Sat, Oct 11, 2008 at 10:22 AM, Francesco Potorti`
|><[hidden email]> wrote:
|>> This is a real life example that demonstrates how access to matrix
|>> elements is not optimised for the case of complete ranges in the first
|>> dimensions.  I am sending it here and not to the bug list because this
|>> example may be useful to others, at least until this case is optimised
|>> in the sources.
|>>
|>> # This is a big 5-dim matrix, about 100MB size
|>> octave> kk=rand(156,222,1,44,8);
|>>
|>> # I access a slice of it, which is sequential in memory
|>> octave> t=cputime; for ii=1:44, for jj=1:8
|>>  mm=kk(:,:,:,ii,jj); endfor, endfor, cputime-t
|>> ans =  5.9124
|>>
|>> # Using a linear index is much faster
|>> octave> span=(1:156*222);
|>>        t=cputime; for ii=1:44, for jj=1:8
|>>  mm=kk(sub2ind(size(kk),1,1,1,ii,jj)+span-1); endfor, endfor, cputime-t
|>> ans =  2.6642
|>>
|>> # Removing the call to sub2ind reaches top speed
|>> octave> cp=[1,cumprod(size(kk)(1:end-1))]; span=(1:156*222);
|>>        t=cputime; for ii=1:44, for jj=1:8
|>>  mm=kk(sum(([1,1,1,ii,jj]-1).*cp)+span); endfor, endfor, cputime-t
|>> ans =  1.4001
|>>
|>> Still, I wish access were faster yet.  Is there a reason why copying
|>> consecutive memory is so slow?  I wish I could help with optimising
|>> this, even if I am certainly not the most qualified person to do it.
|>>
|>
|>I guess the indexing routines do deserve some attention w.r.t
|>performance. Reducing code duplication would also be nice. I have this
|>on my TODO list, but I don't think it's a good idea to do it before
|>3.2.x is forked, as such changes are, IMO, likely to introduce bugs.
|
|Following up on this, I realised that there is room for further
|significant speedup:
|
|# Be careful to not convert ranges into matrices
|octave> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222;
|      t=cputime; for ii=1:44, for jj=1:8
|base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len); endfor, endfor, cputime-t
|ans =  0.26402
|
|The fact is, I was taking it for granted that a range remains a range
|even after a linear transformation, while this does not appear to be the
|case:
|
|octave3.1> typeinfo(1:4)
|ans = range
|octave3.1> typeinfo(4+(1:4))
|ans = matrix
|octave3.1> typeinfo(4*(1:4))
|ans = matrix
|
|From the depth of my ignorance about Octave's internals, I dare say that
|it should not be too difficult to keep ranges as ranges even after sum
|or product with a scalar.  Maybe even after sum with a range with the
|same number of elements.  Am I wrong?
|
|--
|Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
|ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
|via G. Moruzzi 1, I-56124 Pisa         Email: [hidden email]
|(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/
|
|
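(A quick way to see why the last version above wins: a range built
directly with the colon operator stays a range, whereas adding a scalar
to an existing range turns it into a matrix, as the typeinfo calls
quoted above show.  So writing kk(base+1:base+len) keeps the index a
range - a small check along the same lines:)

octave3.1> typeinfo(4+1:4+4)
ans = range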

Re: indexing expression performance

Nicholas Tung-2
On Tue, Jan 13, 2009 at 16:46, Francesco Potortì <[hidden email]> wrote:
> [Francesco's message quoted in full - see above]

Neat - I didn't know that 1-d vector indexing worked on matrices. However, I think it may be faster for you because of the multi-dimensional aspect, whereas I am working with more values (potentially large image samples). On Octave 3.1.51+ (built today) the following two snippets run in about the same time as each other. If you're doing something more complicated, please tell me. I know the two snippets below don't compute exactly the same value...

tic, x1 = 1:299; x2 = 2:300; for i = 1:100, a(:, x1) += a(:, x2); endfor, toc
x1 = 1:89700; x2 = 301:90000; tic, for i = 1:100, a(x1) += a(x2); endfor, toc

As another note, it doesn't seem to matter that much which way the matrix is indexed for the first, i.e. a(x1, :) vs a(:, x1).
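
(A rough way one might check that - just a sketch reusing the 300x300
`a` from above; timings will of course vary by machine and build:)

x1 = 1:299;
tic, for i = 1:100, b = a(x1, :); endfor, toc     # index rows
tic, for i = 1:100, b = a(:, x1); endfor, toc     # index columns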

Jaroslav - I updated to hg head and it was faster (cut time in half for the test example), but on real code not quite as much as I'd hoped.

Nicholas


Re: indexing expression performance

Nicholas Tung-2
On Wed, Jan 14, 2009 at 01:36, Jaroslav Hajek <[hidden email]> wrote:
On Wed, Jan 14, 2009 at 5:00 AM, Nicholas Tung <[hidden email]> wrote:
> [earlier messages quoted in full - see above]

Hi Nicholas,

if you can identify bottlenecks and post them as reasonably small
self-contained examples, we can surely discuss them and possible
improvements.

regards

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

Thanks so much. I uploaded a real example to
http://gatoatigrado.arvixe.com/files/temp/octave_indexing_perf.tgz

The script is 60 lines long; it should perform a two-step Haar lifting decomposition (actually only for the wavelet coefficients). Run it with the following:
octave encode_image_standalone.m attachment.jpg

I looped it 450 times to simulate the amount of processing that I need. On my machine, the base time is 26.6 seconds; if you remove the matrix scalar indexing (original file lines 36, 37 and 43, 44), it still takes 17 seconds just to run through the interpreter. Replacing both of the lines above with
gamma_ += alpha_arr(shift_idx) * gamma_;
lambda_ += alpha_arr(shift_idx) * lambda_;
respectively, it takes about 20 seconds, which is much closer to the interpreter overhead (which I might be able to cut down by reducing the number of lines; I'm not sure).

Also, Francesco, I forgot to do the obvious thing and test your examples with 3.1.51. At least on my machine, it looks like the performance has been reversed by the latest dense indexing improvements.

octave:1> kk=rand(156,222,1,44,8);
octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor, endfor, cputime-t
ans =  0.072004
octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len); endfor, endfor, cputime-t
ans =  0.12401

Thanks again,
Nicholas


Re: indexing expression performance

Francesco Potortì
>Also, Francesco, I forgot to do the obvious thing and test your examples
>with 3.1.51. At least on my machine, it looks like the performance has been
>reversed by the latest dense indexing improvements.
>
>octave:1> kk=rand(156,222,1,44,8);
>octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor,
>endfor, cputime-t
>ans =  0.072004
>octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for
>ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len);
>endfor, endfor, cputime-t
>ans =  0.12401

Wonderful!  This will make my code more readable as soon as I upgrade :)

Thank you Jaroslav!

--
Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
via G. Moruzzi 1, I-56124 Pisa         Email: [hidden email]
(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/

Re: indexing expression performance

Jaroslav Hajek-2
On Wed, Jan 14, 2009 at 5:36 PM, Francesco Potortì <[hidden email]> wrote:

>>Also, Francesco, I forgot to do the obvious thing and test your examples
>>with 3.1.51. At least on my machine, it looks like the performance has been
>>reversed by the latest dense indexing improvements.
>>
>>octave:1> kk=rand(156,222,1,44,8);
>>octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor,
>>endfor, cputime-t
>>ans =  0.072004
>>octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for
>>ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len);
>>endfor, endfor, cputime-t
>>ans =  0.12401
>
> Wonderful!  This will make my code more readable as soon as I upgrade :)
>
> Thank you Jaroslav!
>

You're welcome. After all, your original report was the starting
impulse for that work, so part of the credit goes to you... it's hard
to work on performance improvements without such rational starting
impulses (by rational I mean not the sort of "hey guys, Octave sucks,
'cause my code is slow on it, do something 'bout it"), because there
are an awful lot of places in Octave that are amenable to
optimization, and of course not every one is equally painful. So
working, simple benchmarks that capture the core of a performance
problem are precious.
Of course, they're best run against the development sources, but you know that :)

You may also be interested in the most recent indexing improvements:
http://hg.savannah.gnu.org/hgweb/octave/rev/ad3afaaa19c1
http://www.nabble.com/lazy-contiguous-subrange-indexing-td21458811.html

this will cheat your first loop to run in almost zero time...
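
(For instance, a read-only pattern like the following - repeatedly
extracting a memory-contiguous chunk without writing to it - is the
sort of access the lazy subrange indexing should help; a rough sketch,
with timings obviously depending on the build:)

a = rand(300);
tic
for i = 1:10000
  b = a(:, 1:150);    # memory-contiguous subrange of a
endfor
toc
# with lazy subrange indexing, no data should be copied here,
# so this loop should run in almost no time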

cheers

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: indexing expression performance

Francesco Potortì
>You may also be interested in the most recent indexing improvements:
>http://hg.savannah.gnu.org/hgweb/octave/rev/ad3afaaa19c1
>http://www.nabble.com/lazy-contiguous-subrange-indexing-td21458811.html
>
>this will cheat your first loop to run in almost zero time...

This is very interesting.  In fact, I routinely extract (usually
memory-contiguous) 2-d slices from a big multidim matrix in order to
sum them, without ever writing to the matrix.  To work around the slow
indexing problem, I resorted to the linear indexing trick, followed by
resizing.  I think the above improvements will significantly speed up
my code, which is important to me as I extract a large number of
slices for Monte Carlo computations.
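
(Concretely, the trick is roughly the following, reusing cp and len
from the examples quoted earlier - just a sketch:)

base = sum(([1,1,1,ii,jj] - 1) .* cp);           # cp = [1, cumprod(size(kk)(1:end-1))]
mm = reshape(kk(base+1 : base+len), 156, 222);   # len = 156*222; back to a 2-d slice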

Thanks again :)

--
Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
via G. Moruzzi 1, I-56124 Pisa         Email: [hidden email]
(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/

Re: indexing expression performance

Francesco Potortì
In reply to this post by Nicholas Tung-2
I had written:

>>Also, Francesco, I forgot to do the obvious thing and test your examples
>>with 3.1.51. At least on my machine, it looks like the performance has been
>>reversed by the latest dense indexing improvements.
>>
>>octave:1> kk=rand(156,222,1,44,8);
>>octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor,
>>endfor, cputime-t
>>ans =  0.072004
>>octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for
>>ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len);
>>endfor, endfor, cputime-t
>>ans =  0.12401
>
>Wonderful!  This will make my code more readable as soon as I upgrade :)

Hm.  I installed Octave 3.1.51 from Debian's "experimental" (I never got
round to finding the time to build Octave myself).  I use an amd64 box.
Unfortunately, I see no big difference with 3.0.1 :(

The timings of the first test versus the second on my box are:
- 5.1s versus 0.30s with 3.0.1
- 5.2s versus 0.27s with 3.1.51

So apparently I am not benefitting from the 3.1.51 dense indexing improvements.

--
Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
via G. Moruzzi 1, I-56124 Pisa         Email: [hidden email]
(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/

Re: indexing expression performance

Jaroslav Hajek-2
On Thu, Jan 15, 2009 at 1:48 PM, Francesco Potortì <[hidden email]> wrote:

> I had written:
>>>Also, Francesco, I forgot to do the obvious thing and test your examples
>>>with 3.1.51. At least on my machine, it looks like the performance has been
>>>reversed by the latest dense indexing improvements.
>>>
>>>octave:1> kk=rand(156,222,1,44,8);
>>>octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor,
>>>endfor, cputime-t
>>>ans =  0.072004
>>>octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for
>>>ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len);
>>>endfor, endfor, cputime-t
>>>ans =  0.12401
>>
>>Wonderful!  This will make my code more readable as soon as I upgrade :)
>
> Hm.  I installed Octave 3.1.51 from Debian's "experimental" (I never got
> round to finding the time to build Octave myself).  I use an amd64 box.
> Unfortunately, I see no big difference with 3.0.1 :(
>
> The timings of the first test versus the second on my box are:
> - 5.1s versus 0.30s with 3.0.1
> - 5.2s versus 0.27s with 3.1.51
>
> So apparently I am not benefitting from the 3.1.51 dense indexing improvements.
>

The dense indexing improvements started with
http://hg.savannah.gnu.org/hgweb/octave/rev/7cbe01c21986
dated 20th October, with lots of bugfixes afterwards. I don't think this
was in the 3.1.51 snapshot, which I think was made in late July. There
has been no 3.1.52 so far. So, if you want these improvements, the
development sources are your only resort.

regards

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


Re: indexing expression performance

Nicholas Tung-2
In reply to this post by Nicholas Tung-2
On Thu, Jan 15, 2009 at 08:17, Jaroslav Hajek <[hidden email]> wrote:
> On Wed, Jan 14, 2009 at 5:24 PM, Nicholas Tung <[hidden email]> wrote:
> > [earlier messages quoted in full - see above]

> 20 seconds are certainly not the interpreter overhead - I daresay that
> is almost negligible here. Apart from the lines you've picked up, most
> time is spent extracting gamma and lambda from the matrix and putting
> it back, and, more importantly, transposing it twice. Transposition is
> a relatively costly operation for large matrices because it is not
> cache friendly (even though David Bateman optimized it somewhat to
> reduce the number of cache misses, so it's faster in development
> sources).

It's not so large, 72x128. With only extraction [and transformation], i.e. commenting out lines 26 to 47, it only takes 0.71 seconds. Tell me if you get different results. I'm running this on a Core Duo [not Core 2 Duo] U2400; it has 2 MB of cache.

> I didn't check very closely but it appears to me you're really doing a
> row & column convolution. Perhaps conv2 can do the job for you? That
> would also avoid the transpose, so it should make your code ultimately
> faster.
> Conv2 is a compiled function and so should be fast, though I guess
> (from a quick scan) there's room to improve performance even there.

That's right! Thanks so much! That cuts it down to about 9.5 seconds.
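
(For reference, the kind of rewrite this amounts to is roughly the
following - a sketch with a made-up coefficient c, not the actual
values from the script:)

c = 0.5;                                  # hypothetical lifting coefficient
a = rand(300);
y1 = a(:, 1:end-1) + c * a(:, 2:end);     # shifted-index form, as in the first post
y2 = conv2(a, [c, 1], "valid");           # the same update written as one convolution
max(abs(y1(:) - y2(:)))                   # should be zero, up to rounding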



Re: indexing expression performance

Thomas Weber-8
In reply to this post by Jaroslav Hajek-2
Am Donnerstag, den 15.01.2009, 13:57 +0100 schrieb Jaroslav Hajek:

> On Thu, Jan 15, 2009 at 1:48 PM, Francesco Potortì <[hidden email]> wrote:
> > I had written:
> >>>Also, Francesco, I forgot to do the obvious thing and test your examples
> >>>with 3.1.51. At least on my machine, it looks like the performance has been
> >>>reversed by the latest dense indexing improvements.
> >>>
> >>>octave:1> kk=rand(156,222,1,44,8);
> >>>octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor,
> >>>endfor, cputime-t
> >>>ans =  0.072004
> >>>octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for
> >>>ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len);
> >>>endfor, endfor, cputime-t
> >>>ans =  0.12401
> >>
> >>Wonderful!  This will make my code more readable as soon as I upgrade :)
> >
> > Hm.  I installed Octave 3.1.51 from Debian's "experimental" (I never got
> > round to finding the time to build Octave myself).  I use an amd64 box.
> > Unfortunately, I see no big difference with 3.0.1 :(
> >
> > The timings of the first test versus the second on my box are:
> > - 5.1s versus 0.30s with 3.0.1
> > - 5.2s versus 0.27s with 3.1.51
> >
> > So apparently I am not benefitting from the 3.1.51 dense indexing improvements.
> >
>
> The dense indexing improvements started with
> http://hg.savannah.gnu.org/hgweb/octave/rev/7cbe01c21986
> dated 20th October and lots of bugfixes afterwards. I don't think this
> was in the 3.1.51 snapshot which I think was done in late July. There
> was no 3.1.52 so far. So, if you want these, development sources are
> your only resort.

I'd vote for a new snapshot. Or, if you can give me a mercurial id where
sources are in a reasonable[1] shape, we can use that for a new snapshot
package in Debian.

[1]  Reasonable by whatever metric you choose.

        Thomas


Re: indexing expression performance

Jaroslav Hajek-2
On Fri, Jan 16, 2009 at 9:30 AM, Thomas Weber
<[hidden email]> wrote:

> Am Donnerstag, den 15.01.2009, 13:57 +0100 schrieb Jaroslav Hajek:
>> On Thu, Jan 15, 2009 at 1:48 PM, Francesco Potortì <[hidden email]> wrote:
>> > I had written:
>> >>>Also, Francesco, I forgot to do the obvious thing and test your examples
>> >>>with 3.1.51. At least on my machine, it looks like the performance has been
>> >>>reversed by the latest dense indexing improvements.
>> >>>
>> >>>octave:1> kk=rand(156,222,1,44,8);
>> >>>octave:2> t=cputime; for ii=1:44, for jj=1:8, mm=kk(:,:,:,ii,jj); endfor,
>> >>>endfor, cputime-t
>> >>>ans =  0.072004
>> >>>octave:3> cp=[1,cumprod(size(kk)(1:end-1))]; len=156*222; t=cputime; for
>> >>>ii=1:44, for jj=1:8,base=sum(([1,1,1,ii,jj]-1).*cp); mm=kk(base+1:base+len);
>> >>>endfor, endfor, cputime-t
>> >>>ans =  0.12401
>> >>
>> >>Wonderful!  This will make my code more readable as soon as I upgrade :)
>> >
>> > Hm.  I installed Octave 3.1.51 from Debian's "experimental" (I never got
>> > round to finding the time to build Octave myself).  I use an amd64 box.
>> > Unfortunately, I see no big difference with 3.0.1 :(
>> >
>> > The timings of the first test versus the second on my box are:
>> > - 5.1s versus 0.30s with 3.0.1
>> > - 5.2s versus 0.27s with 3.1.51
>> >
>> > So apparently I am not benefitting from the 3.1.51 dense indexing improvements.
>> >
>>
>> The dense indexing improvements started with
>> http://hg.savannah.gnu.org/hgweb/octave/rev/7cbe01c21986
>> dated 20th October and lots of bugfixes afterwards. I don't think this
>> was in the 3.1.51 snapshot which I think was done in late July. There
>> was no 3.1.52 so far. So, if you want these, development sources are
>> your only resort.
>
> I'd vote for a new snapshot. Or, if you can give me a mercurial id where
> sources are in a reasonable[1] shape, we can use that for a new snapshot
> package in Debian.
>
> [1]  Reasonable by whatever metric you choose.
>
>        Thomas
>

A very loose metric is as follows:
Check out the current tip, and wait a whole day for possible patches
titled something like "omission from last patch" etc.
If no such patch appears, the sources are in reasonable shape.

But, given that there were rumors about making a 3.1.52 testing
snapshot, perhaps you'll want to wait for that?

My opinion on the best way to proceed is to fork the 3.2.x branch at a
suitable time (maybe now) and then make testing and later stable
releases from it, but I think John has a different idea.

Btw., I see Debian has a pretty complete octave3.0 package.
How hard would it be to package the qrupdate library for Debian?
(https://sourceforge.net/projects/qrupdate)
This will be used in 3.2 to provide qrupdate, qrinsert, cholupdate
etc., and I'm not yet sure whether to keep a snapshot in libcruft or
make it a weak dependency (Octave will run fine without the functions,
but some functions will lose performance).
I am fairly ignorant about packaging in GNU/Linux distros :(.

cheers

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


Re: indexing expression performance

Thomas Weber-8
Am Freitag, den 16.01.2009, 10:01 +0100 schrieb Jaroslav Hajek:

> >> The dense indexing improvements started with
> >> http://hg.savannah.gnu.org/hgweb/octave/rev/7cbe01c21986
> >> dated 20th October and lots of bugfixes afterwards. I don't think this
> >> was in the 3.1.51 snapshot which I think was done in late July. There
> >> was no 3.1.52 so far. So, if you want these, development sources are
> >> your only resort.
> >
> > I'd vote for a new snapshot. Or, if you can give me a mercurial id where
> > sources are in a reasonable[1] shape, we can use that for a new snapshot
> > package in Debian.
> >
> > [1]  Reasonable by whatever metric you choose.
> >
> >        Thomas
> >
>
> A very loose metric is as follows:
> Checkout a current tip, and wait a whole day for possible patches
> coined like "omission from last patch" etc.
> If no such patch appears, you have a reasonable shape.
>
> But, given that there were rumors about making 3.1.52 testing
> snapshot, perhaps you'll want to wait for that?

How about a snapshot _now_? I mean, it's a snapshot, it's expected to
contain bugs.

> My opinion for the best way to proceed is to fork 3-2-x branch at a
> suitable time (maybe now) and then make testing and later stable
> releases from it, but I think John has a different idea.
>
> Btw., I see Debian has a pretty complete octave3.0 package.
> How hard would be packaging the qrupdate library for Debian?
> (https://sourceforge.net/projects/qrupdate).

Good question. I don't have much experience with packaging libraries.
The biggest problem is incompatible changes to the library's interface,
which require a proper SONAME.

It's somewhat complicated by the fact that the library uses Fortran. For
C/C++, you have knowledgeable people at every corner. How much do you
expect the library to change in the future?

        Thomas



Re: indexing expression performance

Jaroslav Hajek-2
On Fri, Jan 16, 2009 at 10:47 AM, Thomas Weber
<[hidden email]> wrote:

> Am Freitag, den 16.01.2009, 10:01 +0100 schrieb Jaroslav Hajek:
>> >> The dense indexing improvements started with
>> >> http://hg.savannah.gnu.org/hgweb/octave/rev/7cbe01c21986
>> >> dated 20th October and lots of bugfixes afterwards. I don't think this
>> >> was in the 3.1.51 snapshot which I think was done in late July. There
>> >> was no 3.1.52 so far. So, if you want these, development sources are
>> >> your only resort.
>> >
>> > I'd vote for a new snapshot. Or, if you can give me a mercurial id where
>> > sources are in a reasonable[1] shape, we can use that for a new snapshot
>> > package in Debian.
>> >
>> > [1]  Reasonable by whatever metric you choose.
>> >
>> >        Thomas
>> >
>>
>> A very loose metric is as follows:
>> Checkout a current tip, and wait a whole day for possible patches
>> coined like "omission from last patch" etc.
>> If no such patch appears, you have a reasonable shape.
>>
>> But, given that there were rumors about making 3.1.52 testing
>> snapshot, perhaps you'll want to wait for that?
>
> How about a snapshot _now_? I mean, it's a snapshot, it's expected to
> contain bugs.
>
>> My opinion for the best way to proceed is to fork 3-2-x branch at a
>> suitable time (maybe now) and then make testing and later stable
>> releases from it, but I think John has a different idea.
>>
>> Btw., I see Debian has a pretty complete octave3.0 package.
>> How hard would be packaging the qrupdate library for Debian?
>> (https://sourceforge.net/projects/qrupdate).
>
> Good question. I don't have much experience with packaging libraries.
> The biggest problem are incompatible changes to the library's interface,
> which means a proper SONAME.
>
> It's somewhat complicated by the fact that the library uses Fortran. For
> C/C++, you have knowledgable people at every corner. How much do you
> expect the library to change in the future?
>

I will probably add more routines. The current set for QR & Cholesky
is fairly complete, but there are other factorizations left - QRZ, LU,
SVD... I guess adding a routine does not count as an incompatible
change, right? In that sense, I will aim for minimal changes.
Fortran is used for various reasons. My only choices were C or
Fortran, because I want the lib to be easily usable from Fortran
programs - we here at VZLU Prague make fairly good use of Fortran for
in-house codes. As usual, Fortran won on its native support for
matrices, which would need to be simulated by ugly macro tricks in C,
or simply coded in an all-linear-arrays style. Also, Fortran doesn't
tend to be slower than C, and is sometimes even faster.
I think gfortran has generally been regarded as mature (at least
w.r.t. F90 support) since gcc 4.2, so you can say Fortran is well
supported on free platforms.

regards

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

Re: indexing expression performance

Jaroslav Hajek-2
In reply to this post by Thomas Weber-8
On Fri, Jan 16, 2009 at 10:47 AM, Thomas Weber
<[hidden email]> wrote:

> Am Freitag, den 16.01.2009, 10:01 +0100 schrieb Jaroslav Hajek:
>> >> The dense indexing improvements started with
>> >> http://hg.savannah.gnu.org/hgweb/octave/rev/7cbe01c21986
>> >> dated 20th October and lots of bugfixes afterwards. I don't think this
>> >> was in the 3.1.51 snapshot which I think was done in late July. There
>> >> was no 3.1.52 so far. So, if you want these, development sources are
>> >> your only resort.
>> >
>> > I'd vote for a new snapshot. Or, if you can give me a mercurial id where
>> > sources are in a reasonable[1] shape, we can use that for a new snapshot
>> > package in Debian.
>> >
>> > [1]  Reasonable by whatever metric you choose.
>> >
>> >        Thomas
>> >
>>
>> A very loose metric is as follows:
>> Checkout a current tip, and wait a whole day for possible patches
>> coined like "omission from last patch" etc.
>> If no such patch appears, you have a reasonable shape.
>>
>> But, given that there were rumors about making 3.1.52 testing
>> snapshot, perhaps you'll want to wait for that?
>
> How about a snapshot _now_? I mean, it's a snapshot, it's expected to
> contain bugs.
>

Snapshots from development sources are in JWE's hands.

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

Re: indexing expression performance

Jordi Gutiérrez Hermoso
In reply to this post by Thomas Weber-8
2009/1/16 Thomas Weber <[hidden email]>:

> Am Freitag, den 16.01.2009, 10:01 +0100 schrieb Jaroslav Hajek:
>> Btw., I see Debian has a pretty complete octave3.0 package.
>> How hard would be packaging the qrupdate library for Debian?
>> (https://sourceforge.net/projects/qrupdate).
>
> Good question. I don't have much experience with packaging libraries.
> The biggest problem are incompatible changes to the library's interface,
> which means a proper SONAME.
>
> It's somewhat complicated by the fact that the library uses Fortran. For
> C/C++, you have knowledgable people at every corner. How much do you
> expect the library to change in the future?

Well, we have some docs... I'm sure you've found them already:

      http://packages.debian.org/sid/libpkg-guide

But beware

      http://bugs.debian.org/libpkg-guide

I didn't quite follow the discussion, so please tell me again why it's
desirable to package this library?

- Jordi G. H.

Re: indexing expression performance

John W. Eaton
In reply to this post by Thomas Weber-8
On 16-Jan-2009, Thomas Weber wrote:

| How about a snapshot _now_? I mean, it's a snapshot, it's expected to
| contain bugs.

In that case, just grab the sources from the Mercurial archive at
arbitrary times.  You can do that on your own.  You don't need to wait
for me.  But I would prefer to make snapshots when I think things are
in a reasonable state so that we are not flooded with a bunch of bug
reports for obvious problems from people who think we are idiots for
making "releases" that contain such stupid bugs.

At the moment, I'm trying to work through the backlog of bug reports
and submitted patches.  Once I've made some more progress on that,
I'll post to the maintainers list about what I think is left for the
3.2 release and then we can start making a series of snapshots leading
up to the release.

jwe

Re: indexing expression performance

Jaroslav Hajek-2
In reply to this post by Jordi Gutiérrez Hermoso
On Fri, Jan 16, 2009 at 3:07 PM, Jordi Gutiérrez Hermoso
<[hidden email]> wrote:

> 2009/1/16 Thomas Weber <[hidden email]>:
>> Am Freitag, den 16.01.2009, 10:01 +0100 schrieb Jaroslav Hajek:
>>> Btw., I see Debian has a pretty complete octave3.0 package.
>>> How hard would be packaging the qrupdate library for Debian?
>>> (https://sourceforge.net/projects/qrupdate).
>>
>> Good question. I don't have much experience with packaging libraries.
>> The biggest problem are incompatible changes to the library's interface,
>> which means a proper SONAME.
>>
>> It's somewhat complicated by the fact that the library uses Fortran. For
>> C/C++, you have knowledgable people at every corner. How much do you
>> expect the library to change in the future?
>
> Well, we have some docs... I'm sure you've found them already:
>
>      http://packages.debian.org/sid/libpkg-guide
>
> But beware
>
>      http://bugs.debian.org/libpkg-guide
>
> I didn't quite follow the discussion, so please tell me again why it's
> desirable to package this library?
>

Octave uses it to implement the qrupdate, qrinsert, qrdelete, qrshift,
cholupdate, cholinsert, choldelete and cholshift functions. Up to now
a snapshot of version 0 of the lib was included directly in the Octave
sources. During December I finally got around to rewriting the library
to give it a more robust interface and incorporate various
improvements, and I released it (it was GPLed from the beginning) on
SourceForge.
It is of course possible to carry on with the inclusion in libcruft,
but my impression is that this is sort of the opposite of the
direction we normally go, so now I aim to remove it and simply check
for qrupdate during configure, disabling the relevant functionality if
it is not found (we do the same thing with other libs).
For those who build Octave from sources there is hardly a difference
(except that keeping a lib separate allows separate building), but it
has been pointed out that in a distro, a library needs to be packaged,
and that can be problematic.
So, the question is, how much effort does it take to package a
(Fortran) library?
Will someone be willing to do it? I could probably help, if I'm told how.
I really have little knowledge here.
To make it clearer: if such a change is likely to result in a state
where all distros ship Octave without qrupdate, it may be better to
keep it in libcruft.

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
