mixed linear/logical indexing question

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

mixed linear/logical indexing question

Rik-4
Recently, sliding windows of statistics functions (movXXX) were added to
Octave.  All of these functions depend on a base function movfun.  As part
of implementing movfun I used the profiler to examine the performance and I
found a hot spot responsible for 70% of the run time.  Unfortunately, I
couldn't see any way to improve things without jumping out of the Octave
language in to C++.  I'm hoping someone else might look at this and have an
idea.

The code is below

--- Code Start ---
## Apply "shrink" boundary conditions
## Function is not applied to any window elements outside the original data.
function y = shrink_bc (fcn, x, idxp, win, wlen, odim)
  N   = length (x);
  idx = idxp + win;
  tf  = (idx > 0) & (idx <= N);  # idx inside boundaries

  n   = length (idxp);
  y   = zeros (n, odim);
  ## FIXME: This nested for loop accounts for 70% of running time.
  ##        Given that "shrink" is the default Endpoint value this
  ##        code needs to be reworked.
  for i = 1:n
    k      = idx(tf(:,i),i);
    y(i,:) = fcn (x(k));
  endfor
endfunction
--- Code End ---

In my case, the following conditions are a suitable example

+verbatim+
fcn = @mean;
x = randi (1e4, [1000, 1]);
idxp = 1:25;
win = (-25:25).';
wlen = [51, 51];
odim = 1;
-verbatim-

Indexing in Octave is insanely fast, but this is a weird case because it is
not simply linear indexing [x(5)], nor is it logical indexing [x(x < 1)]. 
It is a hybrid where the indices are linear (as would be returned from
find()) but only some of them are valid so they need to be masked with a
logical value.

When I run this code repeatedly in a benchmark it requires ~3.0
milliseconds, but in the overall code it is called 2000 times so the
function is taking ~6 seconds which feels long.

In this example, the value of k for i = 1 is 1:26, for i = 2, 1:27, etc., etc.

I tried recoding using cellfun

idx = cellfun (@(n) (1:n), num2cell ((wlen(1) - idxp(end)):(wlen(end)-1)),
'uniformoutput', false);
y = cellfun (@(i) fcn (x(i)), idx);

but this takes ~4.5 milliseconds to run.

Anybody have any good tricks to try?

If not, it makes me wonder whether a C++ function that did this mixed
indexing could be useful.  Another feature which is not implemented for
movfun is "includenan"/"omitnan".  The core need is similar to that above
in that I have a function handle, some linear indices, and some logical
indices (based on isnan()).

--Rik



Reply | Threaded
Open this post in threaded view
|

Re: mixed linear/logical indexing question

Olaf Till-2
On Wed, Dec 19, 2018 at 11:09:51AM -0800, Rik wrote:

> Recently, sliding windows of statistics functions (movXXX) were added to
> Octave.  All of these functions depend on a base function movfun.  As part
> of implementing movfun I used the profiler to examine the performance and I
> found a hot spot responsible for 70% of the run time.  Unfortunately, I
> couldn't see any way to improve things without jumping out of the Octave
> language in to C++.  I'm hoping someone else might look at this and have an
> idea.
>
> The code is below
>
> --- Code Start ---
> ## Apply "shrink" boundary conditions
> ## Function is not applied to any window elements outside the original data.
> function y = shrink_bc (fcn, x, idxp, win, wlen, odim)
>   N   = length (x);
>   idx = idxp + win;
>   tf  = (idx > 0) & (idx <= N);  # idx inside boundaries
>
>   n   = length (idxp);
>   y   = zeros (n, odim);
>   ## FIXME: This nested for loop accounts for 70% of running time.
>   ##        Given that "shrink" is the default Endpoint value this
>   ##        code needs to be reworked.
>   for i = 1:n
>     k      = idx(tf(:,i),i);
>     y(i,:) = fcn (x(k));
>   endfor
> endfunction
> --- Code End ---
>
> In my case, the following conditions are a suitable example
>
> +verbatim+
> fcn = @mean;
> x = randi (1e4, [1000, 1]);
> idxp = 1:25;
> win = (-25:25).';
> wlen = [51, 51];
> odim = 1;
> -verbatim-
>
> Indexing in Octave is insanely fast, but this is a weird case because it is
> not simply linear indexing [x(5)], nor is it logical indexing [x(x < 1)]. 
> It is a hybrid where the indices are linear (as would be returned from
> find()) but only some of them are valid so they need to be masked with a
> logical value.
>
> When I run this code repeatedly in a benchmark it requires ~3.0
> milliseconds, but in the overall code it is called 2000 times so the
> function is taking ~6 seconds which feels long.
>
> In this example, the value of k for i = 1 is 1:26, for i = 2, 1:27, etc., etc.
>
> I tried recoding using cellfun
>
> idx = cellfun (@(n) (1:n), num2cell ((wlen(1) - idxp(end)):(wlen(end)-1)),
> 'uniformoutput', false);
> y = cellfun (@(i) fcn (x(i)), idx);
>
> but this takes ~4.5 milliseconds to run.
At my system:

--- Code Start ---
k = 1:30;
x_k = x(k).';
inp = num2cell (1:25); # 1 : length (idxp)
tic;
cellfun (@ (id) mean (x_k), inp, "UniformOutput", false);
toc
--- Code End ---

takes roughly the same time as calling the original 'shrink_bc'
function with:

--- Code Start ---
tic;
shrink_bc (fcn, x, idxp, win, wlen, odim);
toc
--- Code End ---

(both times 32--33 ms, 5 times slower than at your system).

So neither indexing nor for-loops seem to be the cause, just the
repeated calling of the function 'mean'...

Olaf

--
public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net

signature.asc (849 bytes) Download Attachment