Re: further profiling results

Rik-4
jwe,

The executive summary is that there seems to be a slowdown in the
performance of regular, and possibly indexed, assignment.

I started with the base script bm.toeplitz.orig.m:

--- Code bm.toeplitz.orig.m ---
runs = 5;

cumulate = 0; b = 0;
for i = 1:runs
  b = zeros (620, 620);
  tic;
    for j = 1:620
      for k = 1:620
        b(k,j) = abs (j - k) + 1;
      end
    end
  timing = toc;
  cumulate = cumulate + timing;
end

## Total time
cumulate
--- End Code ---

------------------------------------------------------------

First test, remove any loop body.

--- Code bm.no_loop_body.m ---
...
    for j = 1:620
      for k = 1:620
        ## No loop body
      end
    end
...
--- End Code ---
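
For reference, here is a minimal sketch of what the full script might look like with the harness written out. It assumes the elided lines ("...") are simply the unchanged timing harness from bm.toeplitz.orig.m; the variable n is a hypothetical name introduced here for the loop bound.

```octave
## Sketch: bm.no_loop_body.m with the harness written out.
## Assumption: the "..." stands for the unchanged harness of the base script.
runs = 5;
n = 620;   # hypothetical name for the loop bound used in every test

cumulate = 0; b = 0;
for i = 1:runs
  b = zeros (n, n);
  tic;
    for j = 1:n
      for k = 1:n
        ## No loop body
      end
    end
  timing = toc;
  cumulate = cumulate + timing;
end

## Total time
cumulate
```

The other test cases would then differ only in the statement placed inside the innermost loop.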

Results: no loop body:
3.2.4 : .10605
6.0.0 : .15496

Comments: no loop body:
Roughly a 50% slowdown (46% by the numbers above), but at about 50
milliseconds in absolute terms it is not significant compared to the 7
seconds seen for the original script.

------------------------------------------------------------

Test case for straight assignment.

--- Code bm.assign.m ---
...
    for j = 1:620
      for k = 1:620
        z = 13;
      end
    end
...
--- End Code ---

Results: assignment:
3.2.4 : 0.17247
6.0.0 : 0.96006

Comments: assignment:
A 5.6X slowdown between versions.  This seems quite significant.

------------------------------------------------------------

Test case for assignment using 1 index.

--- Code bm.1idx_assign.m ---
...
    for j = 1:620
      for k = 1:620
        b(k) = 13;
      end
    end
...
--- End Code ---

Results: 1-index assignment:

3.2.4 : 1.3076
6.0.0 : 3.3534

Comments: 1-index assignment:
A 2.6X slowdown between versions, and the absolute difference of about 2
seconds is significant.  Note that even in 3.2.4 the step-up from scalar
assignment to 1-index matrix assignment is 7.6X.
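
The step-up figure can be reproduced from the timings already listed. This is only a sketch of the arithmetic; the variable names are made up for illustration, and the numbers are copied from the results above.

```octave
## Timings in seconds, copied from the assignment and 1-index results.
scalar_324 = 0.17247;  idx1_324 = 1.3076;
scalar_600 = 0.96006;  idx1_600 = 3.3534;

## Step-up from scalar assignment to 1-index assignment within one version.
stepup_324 = idx1_324 / scalar_324;
stepup_600 = idx1_600 / scalar_600;

printf ("3.2.4 step-up: %.1fX\n", stepup_324);
printf ("6.0.0 step-up: %.1fX\n", stepup_600);
```

The same division for 6.0.0 gives a smaller step-up (about 3.5X), which is what one would expect if scalar assignment regressed proportionally more than indexed assignment.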

------------------------------------------------------------

Test case for assignment using 2 indexes.

--- Code bm.idx_assign.m ---
...
    for j = 1:620
      for k = 1:620
        b(k,j) = 13;
      end
    end
...
--- End Code ---

Results: 2-index assignment:

3.2.4 : 2.2823
6.0.0 : 4.9126

Comments: 2-index assignment:
A 2.15X slowdown between versions, and the absolute difference (about 2.6
seconds) is verging on 3 seconds, which is significant.  In 3.2.4 this is
also 1 second slower than the 1-index case.  It would be worthwhile to
check whether run time scales linearly with the number of indices (for
example with 3-D and 4-D arrays).  This is also the baseline I use for
further comparisons.
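
A possible 3-index test case along those lines is sketched here. The file name bm.3idx_assign.m and the 73x73x73 size are assumptions made for illustration; 73^3 = 389017 assignments was chosen to be close to the 620*620 = 384400 assignments of the 2-D tests. No timings are reported for it.

```octave
## Sketch: hypothetical bm.3idx_assign.m (file name and size are assumptions).
runs = 5;
n = 73;   # 73^3 = 389017 assignments, close to 620*620 = 384400

cumulate = 0; b = 0;
for i = 1:runs
  b = zeros (n, n, n);
  tic;
    for j = 1:n
      for k = 1:n
        for m = 1:n
          b(m,k,j) = 13;   # three subscripts instead of two
        end
      end
    end
  timing = toc;
  cumulate = cumulate + timing;
end

## Total time
cumulate
```

This adds one more loop level than the 2-D tests, but the no-loop-body result suggests the loop overhead itself is small next to the cost of the assignment.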

------------------------------------------------------------

Test case for single loop versus nested loops.

--- Code bm.1loop.m ---
...
    for j = 1:(620*620)
      b(j) = 13;
    end
...
--- End Code ---

Results: single loop:

3.2.4 : 1.3396
6.0.0 : 3.3557

Comments: single loop:
This is nearly identical to the results for 1-index assignment.  As such, it
doesn't appear that loops are the problem.  This is also corroborated by
the first result, where taking out the loop body shows that the loops run
fast by themselves.

------------------------------------------------------------

Test case for arithmetic expression.

--- Code bm.arithmetic_op.m ---
...
    for j = 1:620
      for k = 1:620
        b(k,j) = k + 1;
      end
    end
...
--- End Code ---

Results: arithmetic op:

3.2.4 : 2.8724
6.0.0 : 5.9112

Comments: arithmetic op:
This is quite close to the results for 2-index assignment.  In fact, the
slowdown is 2.06X versus the 2.15X seen for 2-index assignment.  So it is
likely that all of the difference is due to the issues with assignment.

------------------------------------------------------------

Test case for function called with constant value.

--- Code bm.fcn_const_val.m ---
...
    for j = 1:620
      for k = 1:620
        b(k,j) = abs (13);
      end
    end
...
--- End Code ---

Results: function w/constant value:

3.2.4 : 4.8345
6.0.0 : 10.811

Comments: function w/constant value:
Slowdown is 2.24X.  The slowdown of 2.15X for 2-index assignment would
explain 2.15/2.24 = 96% of this.

------------------------------------------------------------

Test case for function called with 1 lookup.

--- Code bm.fcn_1lookup.m ---
...
    for j = 1:620
      for k = 1:620
        b(k,j) = abs (13);
      end
    end
...
--- End Code ---

Results: function w/1 lookup:

3.2.4 : 4.9847
6.0.0 : 10.979

Comments: function w/1 lookup:
Slowdown is 2.20X.  The slowdown of 2.15X for 2-index assignment would
explain 98% of this.
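
The slowdown factors quoted above can be recomputed in one pass from the listed timings. This is a sketch of the arithmetic only; the variable names are illustrative, and the timings are copied from the results sections.

```octave
## Timings in seconds, copied from the results above: [3.2.4, 6.0.0]
names = {"no loop body", "assignment", "1-index assignment", ...
         "2-index assignment", "single loop", "arithmetic op", ...
         "function w/constant value", "function w/1 lookup"};
t = [0.10605  0.15496
     0.17247  0.96006
     1.3076   3.3534
     2.2823   4.9126
     1.3396   3.3557
     2.8724   5.9112
     4.8345   10.811
     4.9847   10.979];

slowdown = t(:,2) ./ t(:,1);   # 6.0.0 time divided by 3.2.4 time
baseline = slowdown(4);        # 2-index assignment, the stated baseline

for r = 1:rows (t)
  printf ("%-26s %5.2fX\n", names{r}, slowdown(r));
end

## Share of the function-call slowdowns matched by the baseline slowdown.
for r = 7:8
  printf ("%-26s %3.0f%%\n", names{r}, 100 * baseline / slowdown(r));
end
```

Using the unrounded ratios gives the same 96% and 98% figures as the rounded ones quoted in the comments.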

--Rik