mixed linear/logical indexing question Classic List Threaded 2 messages On Wed, Dec 19, 2018 at 11:09:51AM -0800, Rik wrote: > Recently, sliding windows of statistics functions (movXXX) were added to > Octave.  All of these functions depend on a base function movfun.  As part > of implementing movfun I used the profiler to examine the performance and I > found a hot spot responsible for 70% of the run time.  Unfortunately, I > couldn't see any way to improve things without jumping out of the Octave > language in to C++.  I'm hoping someone else might look at this and have an > idea. > > The code is below > > --- Code Start --- > ## Apply "shrink" boundary conditions > ## Function is not applied to any window elements outside the original data. > function y = shrink_bc (fcn, x, idxp, win, wlen, odim) >   N   = length (x); >   idx = idxp + win; >   tf  = (idx > 0) & (idx <= N);  # idx inside boundaries > >   n   = length (idxp); >   y   = zeros (n, odim); >   ## FIXME: This nested for loop accounts for 70% of running time. >   ##        Given that "shrink" is the default Endpoint value this >   ##        code needs to be reworked. >   for i = 1:n >     k      = idx(tf(:,i),i); >     y(i,:) = fcn (x(k)); >   endfor > endfunction > --- Code End --- > > In my case, the following conditions are a suitable example > > +verbatim+ > fcn = @mean; > x = randi (1e4, [1000, 1]); > idxp = 1:25; > win = (-25:25).'; > wlen = [51, 51]; > odim = 1; > -verbatim- > > Indexing in Octave is insanely fast, but this is a weird case because it is > not simply linear indexing [x(5)], nor is it logical indexing [x(x < 1)].  > It is a hybrid where the indices are linear (as would be returned from > find()) but only some of them are valid so they need to be masked with a > logical value. > > When I run this code repeatedly in a benchmark it requires ~3.0 > milliseconds, but in the overall code it is called 2000 times so the > function is taking ~6 seconds which feels long. > > In this example, the value of k for i = 1 is 1:26, for i = 2, 1:27, etc., etc. > > I tried recoding using cellfun > > idx = cellfun (@(n) (1:n), num2cell ((wlen(1) - idxp(end)):(wlen(end)-1)), > 'uniformoutput', false); > y = cellfun (@(i) fcn (x(i)), idx); > > but this takes ~4.5 milliseconds to run. At my system: --- Code Start --- k = 1:30; x_k = x(k).'; inp = num2cell (1:25); # 1 : length (idxp) tic; cellfun (@ (id) mean (x_k), inp, "UniformOutput", false); toc --- Code End --- takes roughly the same time as calling the original 'shrink_bc' function with: --- Code Start --- tic; shrink_bc (fcn, x, idxp, win, wlen, odim); toc --- Code End --- (both times 32--33 ms, 5 times slower than at your system). So neither indexing nor for-loops seem to be the cause, just the repeated calling of the function 'mean'... Olaf -- public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net signature.asc (849 bytes) Download Attachment