slow for-loops

Stef Pillaert BK-2
Hello,

I translate a lot of my functions to .oct files, and one of the main
reasons is that I do a lot of (simple, short) loops.

A sample .m-file like:

function retval=test_lus(times)
 for i=1:times
 endfor
 disp ("m-file done");
 retval=1;
endfunction

takes a LOT more time than its .oct-counterpart:

#include <octave/oct.h>
#include "iostream.h"

DEFUN_DLD (test_lus_oct,args,,
"")
 {

  int times = int(args(0).double_value());
  for (int i=0 ; i < times ; i++) {}
  cout << "oct-file done";
  return octave_value(1);
 }

The second function runs about 500 times faster on my system (I set
times=1000000)!

Is this behaviour normal? If so, is there an alternative that doesn't
require translating everything to .oct-files? That is a lot more work
than writing simple .m-files.

Thanks,

Stef.




John W. Eaton-6
On 30-Mar-1998, Stef Pillaert <[hidden email]> wrote:

| I translate a lot of my functions to .oct files, and one of the main
| reasons is that I do a lot of (simple, short) loops.
|
| A sample .m-file like:
|
| function retval=test_lus(times)
|  for i=1:times
|  endfor
|  disp ("m-file done");
|  retval=1;
| endfunction
|
| takes a LOT more time than its .oct-counterpart:
|
| #include <octave/oct.h>
| #include "iostream.h"
|
| DEFUN_DLD (test_lus_oct,args,,
| "")
|  {
|
|   int times = int(args(0).double_value());
|   for (int i=0 ; i < times ; i++) {}
|   cout << "oct-file done";
|   return octave_value(1);
|  }
|
| The second function runs about 500 times faster on my system (I set
| times=1000000)!
|
| Is this behaviour normal? If so, is there an alternative that doesn't
| require translating everything to .oct-files? That is a lot more work
| than writing simple .m-files.

It is not particularly surprising that the M-file version takes much
longer.  Your example is a little bit unrealistic though, because
nothing happens in the loop.  So most optimizing compilers probably
convert the C++ function to simply

  i = times;

but Octave slogs through all the assignments to i in turn, for
nothing.  BTW, before someone says that Octave should do the same as a
good optimizing compiler and eliminate the loop, I'd like to say that I
don't think this is really worth trying to fix.  After all, Octave is
supposed to be doing real computations, not proving how fast it can do
nothing.
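
To make the point about the empty loop concrete, here is a minimal
standalone sketch (plain C++, no Octave headers).  The function names
count_up and count_up_volatile are made up for illustration; they are
not part of Octave or of the original test_lus_oct.  The first loop
has no side effects, so a compiler is allowed to reduce it to a single
assignment.  The second uses a volatile counter, which obliges the
compiler to perform every store, roughly what an interpreter ends up
doing.

```cpp
#include <cassert>

// Hypothetical stand-in for the empty loop in test_lus_oct.
// The loop body is empty and i is an ordinary local, so an optimizing
// compiler may replace the whole loop with "i = times" (times >= 0).
int count_up (int times)
{
  assert (times >= 0);

  int i = 0;
  for (i = 0; i < times; i++)
    {
    }

  return i;
}

// Same loop, but with a volatile counter.  Every read and write of i
// must actually happen, so the loop cannot be removed.
int count_up_volatile (int times)
{
  assert (times >= 0);

  volatile int i = 0;
  for (i = 0; i < times; i = i + 1)
    {
    }

  return i;
}
```

Both functions return the same value for any non-negative argument.
Only the amount of work done to get there differs, and how much of it
is removed depends on the compiler and its optimization flags.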

If we make the example a bit more fair by actually computing something
in the loop, the comparison is a little better.  For example, for

  #include <octave/oct.h>
  #include "iostream.h"

  DEFUN_DLD (test_lus_oct, args, ,
    "test_lus_oct (times)")
   {
     octave_value retval;

     int nargin = args.length ();

     if (nargin == 1)
       {
         int times = int(args(0).double_value());

         if (! error_state)
           {
             ColumnVector tmp (times, 0.0);

             for (int i=0 ; i < times ; i++)
               tmp(i) = sin(i);

             retval = octave_value (tmp, 0);
           }
         else
           error ("test_lus_oct: invalid argument");
       }
     else
       print_usage ("test_lus_oct");

     return retval;
   }

vs.

  function retval = test_lus (times)
    if (nargin == 1)
      if (is_scalar (times))
        retval = zeros (times, 1);
        for i = 1:times
          retval(i) = sin (i);
        endfor
      else
        error ("test_lus: invalid argument");
      endif
    else
      usage ("test_lus (times)");
    endif
  endfunction

the ratio on my system drops to about 120.  That's still not very
good, but no one would really do something like that, would they?
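
To look at the compiled side of this comparison on its own, the loop
from the .oct example can be pulled out into plain C++.  This is only
a sketch under some assumptions: std::vector stands in for Octave's
ColumnVector, and the names sin_table and time_sin_table are
hypothetical helpers, not Octave functions.  Timings will vary with
machine, compiler, and flags.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: the same computation as the loop in the .oct
// example, i.e. element i holds sin(i) for i = 0 .. times-1.
std::vector<double> sin_table (int times)
{
  assert (times >= 0);

  const std::size_t n = static_cast<std::size_t> (times);
  std::vector<double> tmp (n, 0.0);

  for (std::size_t i = 0; i < n; i++)
    tmp[i] = std::sin (static_cast<double> (i));

  return tmp;
}

// Hypothetical helper: wall-clock seconds taken by one call to
// sin_table (times).
double time_sin_table (int times)
{
  const auto start = std::chrono::steady_clock::now ();
  const std::vector<double> tmp = sin_table (times);
  const auto stop = std::chrono::steady_clock::now ();

  // Read one element through a volatile so the computed values are
  // observably used.
  volatile double sink = tmp.empty () ? 0.0 : tmp.back ();
  (void) sink;

  return std::chrono::duration<double> (stop - start).count ();
}
```

Calling time_sin_table (1000000) gives a rough figure for the compiled
loop by itself, without the cost of loading an .oct file or converting
the result to an octave_value.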

Finally, the M-file version is actually as fast or even a little bit
faster than the .oct file if I take advantage of vector operations by
rewriting it like this:

  function retval = test_lus (times)
    if (nargin == 1)
      if (is_scalar (times))
        retval = sin (1:times);
      else
        error ("test_lus: invalid argument");
      endif
    else
      usage ("test_lus (times)");
    endif
  endfunction

The moral is that if you are concerned with speed, you should try to
write vector operations if you are programming in Octave.  That's what
it was designed to do best.  If that's not possible, then perhaps .oct
files are the best way to go.

However, if anyone would like to help work on making Octave's
interpreter faster, I'd love to hear from them.

Thanks,

jwe